Cooperative Object Transportation Using Curriculum-Based Deep Reinforcement Learning

Sensors ◽ 2021 ◽ Vol 21 (14) ◽ pp. 4780
Author(s): Gyuho Eoh ◽ Tae-Hyoung Park

This paper presents a cooperative object transportation technique using deep reinforcement learning (DRL) based on curricula. Previous studies on object transportation depended heavily on complex and intractable controls, such as grasping, pushing, and caging. Recently, DRL-based object transportation techniques have been proposed that show improved performance without precise controller design. However, DRL-based techniques not only take a long time to learn their policies but also sometimes fail to learn; it is difficult to learn a DRL policy through random actions alone. We therefore propose two curricula for the efficient learning of object transportation: region-growing and single- to multi-robot. During learning, the region-growing curriculum gradually extends the region in which the object is initialized. This step-by-step learning raises the success probability of object transportation by restricting the working area. Multiple robots can then easily learn a new policy by exploiting the pre-trained policy of a single robot. This single- to multi-robot curriculum helps robots learn a transportation method through trial and error. Simulation results are presented to verify the proposed techniques.
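To make the region-growing curriculum concrete, the following is a minimal Python sketch of the idea, assuming a disk-shaped spawn region around the goal that widens once a success-rate threshold is met; the class name, geometry, and thresholds are illustrative assumptions, not the authors' implementation.

import math
import random

class RegionGrowingCurriculum:
    """Gradually widen the region in which the object is initialized."""

    def __init__(self, start_radius=0.5, max_radius=5.0, growth=0.5,
                 promote_at=0.8, window=100):
        self.radius = start_radius    # current spawn radius around the goal
        self.max_radius = max_radius
        self.growth = growth          # region growth per curriculum stage
        self.promote_at = promote_at  # success rate required to advance
        self.window = window          # episodes per evaluation window
        self.outcomes = []            # rolling record of episode successes

    def sample_object_position(self, goal_xy):
        """Spawn the object uniformly inside the current restricted region."""
        r = self.radius * math.sqrt(random.random())  # uniform over the disk
        theta = random.uniform(0.0, 2.0 * math.pi)
        return (goal_xy[0] + r * math.cos(theta),
                goal_xy[1] + r * math.sin(theta))

    def record(self, success):
        """Log an episode outcome; enlarge the region once the policy is reliable."""
        self.outcomes.append(float(success))
        if len(self.outcomes) < self.window:
            return
        if sum(self.outcomes) / len(self.outcomes) >= self.promote_at:
            self.radius = min(self.radius + self.growth, self.max_radius)
        self.outcomes.clear()  # start a fresh window for the (possibly new) stage

The single- to multi-robot curriculum would then reuse the weights of the policy trained under this schedule to initialize each robot in the multi-robot setting.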

2021 ◽ Vol 11 (2) ◽ pp. 546
Author(s): Jiajia Xie ◽ Rui Zhou ◽ Yuan Liu ◽ Jun Luo ◽ Shaorong Xie ◽ ...

The high performance and efficiency of multiple unmanned surface vehicles (multi-USV) promote further civilian and military applications of coordinated USVs. As the basis of multi-USV cooperative work, the decentralized formation control of USV swarms has received considerable attention. Formation control of multiple USVs is a geometric problem of multi-robot systems; the main challenge is how to generate and maintain the formation. The rapid development of reinforcement learning provides a new way to address these problems. In this paper, we introduce a decentralized structure for the multi-USV system and employ reinforcement learning for formation control in a leader–follower topology, proposing an asynchronous decentralized formation control scheme based on reinforcement learning for multiple USVs. First, a simplified USV model is established, and a formation shape model is built to provide formation parameters and to describe the physical relationships between USVs. Second, the advantage deep deterministic policy gradient (ADDPG) algorithm is proposed. Third, formation generation and formation maintenance policies based on ADDPG are proposed to form and maintain the given geometric structure of the USV team during movement. Moreover, three new reward functions are designed and utilized to promote policy learning. Finally, various experiments are conducted to validate the performance of the proposed formation control scheme. Simulation results and comparison experiments demonstrate the efficiency and stability of the formation control scheme.
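As an illustration of the kind of shaping signal such a scheme can use, below is a minimal sketch of a leader–follower formation-maintenance reward over a planar USV position; the weights and error terms are assumptions for illustration and are not the three reward functions proposed in the paper.

import math

def formation_reward(follower_xy, leader_xy, desired_range, desired_bearing,
                     w_range=1.0, w_bearing=0.5):
    """Penalize deviation from the desired range and bearing to the leader."""
    dx = follower_xy[0] - leader_xy[0]
    dy = follower_xy[1] - leader_xy[1]
    range_error = abs(math.hypot(dx, dy) - desired_range)
    # Wrap the bearing difference into (-pi, pi] before taking its magnitude.
    bearing_diff = math.atan2(dy, dx) - desired_bearing
    bearing_error = abs(math.atan2(math.sin(bearing_diff), math.cos(bearing_diff)))
    # Dense negative reward: zero only when the follower sits exactly on its slot.
    return -(w_range * range_error + w_bearing * bearing_error)

A dense reward of this shape gives the follower a gradient toward its slot in the formation at every step, rather than only on formation completion.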


2021 ◽ Vol 6 (1)
Author(s): Peter Morales ◽ Rajmonda Sulo Caceres ◽ Tina Eliassi-Rad

Complex networks are often either too large for full exploration, partially accessible, or partially observed. Downstream learning tasks on these incomplete networks can produce low-quality results. In addition, reducing the incompleteness of the network can be costly and nontrivial. As a result, network discovery algorithms optimized for specific downstream learning tasks given resource-collection constraints are of great interest. In this paper, we formulate the task-specific network discovery problem as a sequential decision-making problem. Our downstream task is selective harvesting, the optimal collection of vertices with a particular attribute. We propose a framework, called network actor critic (NAC), which learns a policy and a notion of future reward in an offline setting via a deep reinforcement learning algorithm. The NAC paradigm utilizes a task-specific network embedding to reduce the state-space complexity. A detailed comparative analysis of popular network embeddings is presented with respect to their role in supporting offline planning. Furthermore, a quantitative study is presented on various synthetic and real benchmarks using NAC and several baselines. We show that offline models of reward and network discovery policies lead to significantly improved performance when compared to competitive online discovery algorithms. Finally, we outline learning regimes where planning is critical in addressing sparse and changing reward signals.
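The sequential decision-making formulation can be pictured with the schematic loop below, assuming a dictionary adjacency structure and an attribute oracle; the pluggable scorer stands in for the policy that NAC would learn over a task-specific embedding.

def discover(adjacency, has_attribute, seed, budget, score):
    """Iteratively query boundary vertices; reward counts attributed vertices found."""
    observed = {seed}
    boundary = set(adjacency[seed])
    reward = 0
    for _ in range(budget):
        if not boundary:
            break
        v = max(boundary, key=score)              # policy step: query the best-ranked vertex
        boundary.discard(v)
        observed.add(v)
        reward += int(has_attribute(v))           # selective-harvesting payoff
        boundary |= set(adjacency[v]) - observed  # newly revealed frontier
    return observed, reward

For example, passing score=lambda v: len(adjacency.get(v, ())) degenerates this loop into a high-degree heuristic, one of the online baselines such a learned policy would be compared against.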


Author(s): Sadek Belamfedel Alaoui ◽ El Houssaine Tissir ◽ Noreddine Chaibi ◽ Fatima El Haoussi

Designing robust active queue management (AQM) subject to network imperfections is a challenging problem. Motivated by this topic, we address the problem of controller design for linear systems with variable delay and unsymmetrical constraints using the scaled small-gain theorem. We design two mechanisms: a robust enhanced proportional-derivative controller, and a robust enhanced proportional-derivative controller subject to input saturation. Discussion of their practical implementation, together with extensive comparisons in MATLAB and NS3, illustrates the improved performance and the enlarged domain of attraction relative to results from the literature.
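For orientation only, a plain proportional-derivative AQM update with input saturation might look like the sketch below; the gains, sampling period, and clamp bounds are placeholder assumptions and do not reproduce the robust controllers synthesized in the paper.

def pd_aqm_step(queue_len, queue_ref, prev_error, kp=5e-4, kd=1e-3,
                p_min=0.0, p_max=1.0, dt=0.01):
    """Compute a saturated packet-drop probability from the queue-length error."""
    error = queue_len - queue_ref
    derivative = (error - prev_error) / dt
    p = kp * error + kd * derivative
    # Input saturation: the drop probability is physically confined to [0, 1].
    p = min(max(p, p_min), p_max)
    return p, error

The saturation clamp is what makes the constraint unsymmetrical in practice: the controller can be limited from below at 0 and from above at 1 by different margins depending on the operating point.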


2021
Author(s): Tuffa Said ◽ Jeffery Wolbert ◽ Siavash Khodadadeh ◽ Ayan Dutta ◽ O. Patrick Kreidl ◽ ...
