
A Sample Efficient Model-based Deep Reinforcement Learning Algorithm with Experience Replay for Robot Manipulation
Posted: 2020-12-28 09:01

Title: A Sample Efficient Model-based Deep Reinforcement Learning Algorithm with Experience Replay for Robot Manipulation


Time: December 29, 2020, 15:10


Venue: Zoom online meeting. Meeting ID: 939 3075 5234, Passcode: nMsa4a


Audience: Interested faculty members, undergraduates, and graduate students are welcome to attend!


Abstract:

For robot manipulation, reinforcement learning has provided an effective end-to-end approach to controlling complicated dynamic systems. Model-free reinforcement learning methods ignore the model of system dynamics and are limited to simple behavior control. By contrast, model-based methods can quickly reach optimal trajectory planning by building a model of the dynamic system. However, it is not easy to build an accurate and efficient system model with high generalization ability, especially when facing complex dynamic systems and various manipulation tasks. Furthermore, when the rewards provided by the environment are sparse, the agent loses effective guidance and fails to optimize the policy efficiently, which considerably decreases sample efficiency. In this paper, a model-based deep reinforcement learning algorithm, in which a deep neural network model is utilized to simulate the system dynamics, is proposed for robot manipulation. The proposed deep neural network model is robust enough to deal with complex control tasks and possesses good generalization ability. Moreover, a curiosity-based experience replay method is incorporated to solve the sparse reward problem and improve the sample efficiency of reinforcement learning. The agent, which manipulates a robotic hand, is encouraged to explore optimal trajectories according to its failure experience. Simulation results show the effectiveness of the proposed method: various manipulation tasks are achieved successfully in a complex dynamic system, and sample efficiency is improved even in a sparse-reward environment, as the learning time is greatly reduced.
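The core ideas in the abstract — a learned dynamics model and curiosity-based experience replay — can be illustrated with a toy sketch. This is not the speaker's implementation; the linear system, the class names, and the replay scheme below are all illustrative assumptions. The sketch uses the prediction error of a forward-dynamics model as a curiosity signal and replays transitions in proportion to it, so poorly modelled (surprising) experience is revisited more often even when the environment reward is sparse.

```python
import numpy as np

rng = np.random.default_rng(0)

class LinearDynamicsModel:
    """Toy linear stand-in for the deep dynamics network described in the talk."""
    def __init__(self, state_dim, action_dim, lr=0.01):
        self.W = np.zeros((state_dim, state_dim + action_dim))
        self.lr = lr

    def predict(self, state, action):
        return self.W @ np.concatenate([state, action])

    def update(self, state, action, next_state):
        """One SGD step on squared prediction error; returns the error."""
        x = np.concatenate([state, action])
        error = self.predict(state, action) - next_state
        self.W -= self.lr * np.outer(error, x)
        return float(error @ error)  # prediction error doubles as curiosity signal

# Replay buffer storing (curiosity, transition) pairs.
buffer = []

def store(transition, curiosity):
    buffer.append((curiosity, transition))

def sample(batch_size):
    """Sample transitions with probability proportional to curiosity."""
    priorities = np.array([c for c, _ in buffer]) + 1e-6
    probs = priorities / priorities.sum()
    idx = rng.choice(len(buffer), size=batch_size, p=probs)
    return [buffer[i][1] for i in idx]

# Interact with an unknown linear system (4-D state, 2-D action),
# training the model online and logging curiosity per transition.
model = LinearDynamicsModel(state_dim=4, action_dim=2)
true_W = rng.normal(size=(4, 6))  # hidden ground-truth dynamics
for _ in range(500):
    state = rng.normal(size=4)
    action = rng.normal(size=2)
    next_state = true_W @ np.concatenate([state, action])
    curiosity = model.update(state, action, next_state)
    store((state, action, next_state), curiosity)
```

As the model fits the dynamics, the curiosity of new transitions shrinks, so replay sampling automatically shifts toward the transitions the model still explains poorly — a minimal version of the "learning from failure experience" mechanism the abstract describes.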


Speaker: Cheng ZHANG, Associate Professor, Doctoral Supervisor, Ibaraki University, Japan

Cheng ZHANG received his Ph.D. degree from Waseda University, Tokyo, Japan, in 2015. From 2008 to 2015, he was a research engineer at Sony Digital Network Applications, Japan, and HGST Japan, Inc. (formerly Hitachi Global Storage Technologies), where he researched and developed control algorithms for the image stabilization module of Sony digital cameras and servo control algorithms for next-generation high-capacity HDDs. From 2015 to 2020, he was an assistant professor in the Graduate Program for Embodiment Informatics (Program for Leading Graduate School) at the Graduate School of Fundamental Science and Engineering, Waseda University. He is currently an assistant professor in the Department of Mechanical System Engineering, Ibaraki University, Ibaraki, Japan. His research interests include communication network engineering, machine control algorithms, IoT robots, reinforcement learning, and embedded software. He received the IEICE Young Researcher's Award in 2013. He is a member of IEICE, IEEE, and ACM.



Address: Yifu Building, Anhui University of Technology (Xiushan Campus), Maxiang Road, Ma'anshan, Anhui Province
Tel: 0555-2315538, Fax: 0555-2315538, Postcode: 243032