Deep Reinforcement Learning Approach with Multiple Experience Pools for UAV’s Autonomous Motion Planning in Complex Unknown Environments

Sensors, 2020, Vol 20 (7), pp. 1890
Author(s): Zijian Hu, Kaifang Wan, Xiaoguang Gao, Yiwei Zhai, Qianglong Wang

Autonomous motion planning (AMP) of unmanned aerial vehicles (UAVs) aims to enable a UAV to fly safely to its target without human intervention. Recently, several emerging deep reinforcement learning (DRL) methods have been applied to the AMP problem in simplified environments, with good results. This paper proposes a multiple experience pools (MEPs) framework that leverages human expert experiences to speed up the DRL learning process. Based on the deep deterministic policy gradient (DDPG) algorithm, an MEP–DDPG algorithm was designed that uses model predictive control and simulated annealing to generate expert experiences. When this algorithm was applied to a complex unknown simulation environment built from the parameters of a real UAV, the training results showed a performance improvement exceeding 20% compared with the state-of-the-art DDPG. The test results indicate that UAVs trained using MEP–DDPG can stably complete a variety of tasks in complex, unknown environments.
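The idea of drawing training batches from more than one experience pool can be illustrated with a minimal sketch. This is not the MEP–DDPG implementation from the paper: the class name `MultiPoolReplay`, the fixed `expert_ratio` mixing parameter, and the two-pool layout are all assumptions made for illustration, and the abstract does not state how the pools are actually mixed.

```python
import random
from collections import deque


class MultiPoolReplay:
    """Hypothetical two-pool replay buffer: expert transitions and agent transitions."""

    def __init__(self, capacity=10000, expert_ratio=0.5, seed=0):
        self.expert = deque(maxlen=capacity)  # transitions produced by an expert controller
        self.agent = deque(maxlen=capacity)   # transitions collected by the learning agent
        self.expert_ratio = expert_ratio      # assumed fixed share of expert samples per batch
        self.rng = random.Random(seed)

    def add(self, transition, from_expert=False):
        # Route each transition to the pool that matches its origin.
        (self.expert if from_expert else self.agent).append(transition)

    def sample(self, batch_size):
        # Take the requested share from the expert pool, the rest from the agent pool.
        n_expert = min(len(self.expert), round(batch_size * self.expert_ratio))
        n_agent = min(len(self.agent), batch_size - n_expert)
        batch = self.rng.sample(list(self.expert), n_expert)
        batch += self.rng.sample(list(self.agent), n_agent)
        self.rng.shuffle(batch)
        return batch
```

In a sketch like this, lowering `expert_ratio` over the course of training would shift the batches from expert data toward the agent's own experience; whether the paper does so is not stated in the abstract.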

Author(s): Óscar Pérez-Gil, Rafael Barea, Elena López-Guillén, Luis M. Bergasa, Carlos Gómez-Huélamo, ...

Nowadays, Artificial Intelligence (AI) is growing by leaps and bounds in almost all fields of technology, and Autonomous Vehicle (AV) research is no exception. This paper proposes using algorithms based on Deep Learning (DL) in the control layer of an autonomous vehicle. More specifically, Deep Reinforcement Learning (DRL) algorithms such as Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG) are implemented in order to compare their results. The aim of this work is to obtain a trained model, by applying a DRL algorithm, that is capable of sending control commands to the vehicle so that it navigates properly and efficiently along a given route. In addition, for each algorithm several agents are presented as solutions, each using different data sources to produce the vehicle control commands. For this purpose, the open-source simulator CARLA is used, which lets the system run a multitude of tests without any risk in a hyper-realistic urban simulation environment, something that is unthinkable in the real world. The results show that both DQN and DDPG reach the goal, but DDPG performs better. DDPG produces trajectories very similar to those of a classic controller such as LQR. In both cases the RMSE is lower than 0.1 m when following trajectories ranging from 180 to 700 m. Finally, conclusions and future work are discussed.


2020, Vol 17 (1), pp. 172988141989834
Author(s): Guoyu Zuo, Qishen Zhao, Jiahao Lu, Jiangeng Li

The goal of reinforcement learning is to enable an agent to learn by using rewards. However, some robotic tasks are naturally specified with sparse rewards, and manually shaping reward functions is difficult. In this article, we propose a general, model-free approach for reinforcement learning to learn robotic tasks with sparse rewards. First, a variant of Hindsight Experience Replay, Curious and Aggressive Hindsight Experience Replay, is proposed to improve the sample efficiency of reinforcement learning methods and avoid the need for complicated reward engineering. Second, based on the Twin Delayed Deep Deterministic policy gradient algorithm, demonstrations are leveraged to overcome the exploration problem and speed up policy training. Finally, an action loss is added to the loss function in order to minimize the vibration of the output action while maximizing the value of the action. Experiments on simulated robotic tasks are performed with different hyperparameters to verify the effectiveness of our method. Results show that our method can effectively solve the sparse reward problem and achieves a high learning speed.
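For readers unfamiliar with hindsight relabeling, the following is a minimal sketch of the standard Hindsight Experience Replay "final" strategy on a one-dimensional toy task. It is not the Curious and Aggressive variant proposed in the article, whose details the abstract does not give. The function names, the tuple layout, and the tolerance `tol` are assumptions chosen for illustration.

```python
def sparse_reward(achieved, goal, tol=1e-3):
    # Sparse signal: 0 when the achieved position matches the goal, -1 otherwise.
    return 0.0 if abs(achieved - goal) <= tol else -1.0


def relabel_final(episode):
    """Relabel an episode with the goal it actually reached (HER "final" strategy).

    episode: list of (state, action, achieved_next, original_goal) tuples.
    Returns a list of (state, action, achieved_next, new_goal, reward) tuples.
    """
    new_goal = episode[-1][2]  # pretend the last achieved position was the goal all along
    return [
        (state, action, achieved, new_goal, sparse_reward(achieved, new_goal))
        for (state, action, achieved, _original_goal) in episode
    ]


# A failed episode: the agent wanted to reach 5.0 but only got to 2.0.
failed = [(0.0, 1, 1.0, 5.0), (1.0, 1, 2.0, 5.0)]
relabeled = relabel_final(failed)
```

After relabeling, the last transition carries a reward of 0, so an episode that failed against its original goal still yields a useful learning signal.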


2021, pp. 1-11
Author(s): Xindi Yang, Hao Zhang, Zhuping Wang, Huaicheng Yan, Changzhu Zhang

Author(s): Chengmin Zhou, Bingding Huang, Pasi Fränti

Principles of typical motion planning algorithms are investigated and analyzed in this paper. These algorithms include traditional planning algorithms, classical machine learning algorithms, optimal value reinforcement learning, and policy gradient reinforcement learning. The traditional planning algorithms investigated include graph search algorithms, sampling-based algorithms, interpolating curve algorithms, and reaction-based algorithms. Classical machine learning algorithms include the multiclass support vector machine, long short-term memory, Monte-Carlo tree search, and the convolutional neural network. Optimal value reinforcement learning algorithms include Q-learning, the deep Q-learning network, the double deep Q-learning network, and the dueling deep Q-learning network. Policy gradient algorithms include the policy gradient method, the actor-critic algorithm, asynchronous advantage actor-critic, advantage actor-critic, deterministic policy gradient, deep deterministic policy gradient, trust region policy optimization, and proximal policy optimization. New general criteria are also introduced to evaluate the performance and application of motion planning algorithms through analytical comparisons. The convergence speed and stability of optimal value and policy gradient algorithms are analyzed in particular. Future directions are presented based on the principles and analytical comparisons of motion planning algorithms. This paper gives researchers a clear and comprehensive understanding of the advantages, disadvantages, relationships, and future of motion planning algorithms for robots, and paves the way for better motion planning algorithms in academia, engineering, and manufacturing.

