Deep Reinforcement Learning for Indoor Mobile Robot Path Planning

Sensors, 2020, Vol 20 (19), pp. 5493
Author(s):  
Junli Gao ◽  
Weijie Ye ◽  
Jing Guo ◽  
Zhongjuan Li

This paper proposes a novel incremental training mode to address the problem of Deep Reinforcement Learning (DRL)-based path planning for a mobile robot. Firstly, we evaluate the related graph search algorithms and Reinforcement Learning (RL) algorithms in a lightweight 2D environment. Then, we design the DRL-based algorithm, including the observation states, reward function, network structure, and parameter optimization, in a 2D environment to avoid the time-consuming work a 3D environment would require. We transfer the designed algorithm to a simple 3D environment for retraining to obtain the converged network parameters, including the weights and biases of the deep neural network (DNN). Using these parameters as initial values, we continue to train the model in a complex 3D environment. To improve the generalization of the model across different scenes, we propose combining the DRL algorithm Twin Delayed Deep Deterministic policy gradient (TD3) with the traditional global path planning algorithm Probabilistic Roadmap (PRM) as a novel path planner (PRM+TD3). Experimental results show that the incremental training mode notably improves development efficiency, and that the PRM+TD3 path planner effectively improves the generalization of the model.
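The PRM+TD3 idea of chaining a global roadmap with a learned local controller can be sketched as follows. This is a minimal illustration, not the paper's implementation: `greedy_policy` is a stand-in for a trained TD3 actor, and the waypoint list stands in for a path extracted from a PRM roadmap.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def greedy_policy(pos, wp, speed=0.5):
    # Stand-in for a trained TD3 actor: step straight toward the waypoint.
    d = dist(pos, wp)
    if d == 0:
        return (0.0, 0.0)
    s = min(speed, d)
    return ((wp[0] - pos[0]) / d * s, (wp[1] - pos[1]) / d * s)

def prm_td3_plan(start, goal, waypoints, policy, tol=0.05, max_steps=100):
    """Follow a global PRM waypoint list with a local learned policy."""
    path, pos = [start], start
    for wp in waypoints + [goal]:
        for _ in range(max_steps):
            if dist(pos, wp) <= tol:
                break
            dx, dy = policy(pos, wp)
            pos = (pos[0] + dx, pos[1] + dy)
            path.append(pos)
    return path

path = prm_td3_plan((0.0, 0.0), (4.0, 3.0),
                    [(2.0, 0.0), (2.0, 3.0)], greedy_policy)
```

The global planner supplies intermediate subgoals in regions the local policy was never trained on, which is one way the combination can improve generalization.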

Author(s):  
Jie Zhong ◽  
Tao Wang ◽  
Lianglun Cheng

Abstract. In actual welding scenarios, an effective path planner is needed to find a collision-free path in the configuration space for a welding manipulator surrounded by obstacles. However, the sampling-based planner, a state-of-the-art method, only satisfies probabilistic completeness, and its computational complexity is sensitive to the state dimension. In this paper, we propose a path planner for welding manipulators based on deep reinforcement learning to solve path planning problems in high-dimensional continuous state and action spaces. Compared with sampling-based methods, it is more robust and less sensitive to the state dimension. In detail, to improve learning efficiency, we introduce an inverse kinematics module that provides prior knowledge, and we design a gain module to avoid locally optimal policies; both are integrated into the training algorithm. To evaluate the proposed planning algorithm across multiple dimensions, we conducted several sets of path planning experiments for welding manipulators. The results show that our method not only improves convergence but is also superior in terms of planning optimality and robustness compared with most other planning algorithms.
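One simple way an inverse kinematics prior can be injected into a learned controller is by blending the policy's action with an IK-derived action under an adjustable gain. The abstract does not specify how its modules are integrated, so the blending form below is purely an assumption for illustration.

```python
def blend_with_ik_prior(policy_action, ik_action, gain):
    """Blend a learned joint-space action with an IK-derived prior action.

    gain in [0, 1]: 0 trusts the policy entirely, 1 trusts the IK prior.
    This blending rule is illustrative, not the paper's actual scheme.
    """
    return [(1.0 - gain) * p + gain * q
            for p, q in zip(policy_action, ik_action)]

# Early in training one might set a high gain so the IK prior guides
# exploration, then anneal it toward 0 as the policy improves.
blended = blend_with_ik_prior([0.0, 2.0], [2.0, 0.0], 0.5)
```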


Author(s):  
K. A. A. Mustafa ◽  
N. Botteghi ◽  
B. Sirmacek ◽  
M. Poel ◽  
S. Stramigioli

<p><strong>Abstract.</strong> We introduce a new autonomous path planning algorithm for mobile robots reaching target locations in an unknown environment, where the robot relies on its on-board sensors. In particular, we describe the design and evaluation of a deep reinforcement learning motion planner with continuous linear and angular velocities that navigates to a desired target location based on the deep deterministic policy gradient (DDPG). Additionally, the algorithm is enhanced by using the knowledge of the environment provided by a grid-based SLAM with a Rao-Blackwellized particle filter to shape the reward function, in an attempt to improve the convergence rate, escape local optima, and reduce the number of collisions with obstacles. We compare a reward function shaped using the map provided by the SLAM algorithm against a reward function with no knowledge of the map. Results show that the proposed approach decreases the required learning time, converging in 560 episodes compared with 1450 for the standard RL algorithm, and reduces the number of obstacle collisions, achieving a success ratio of 83% compared with 56%. The results are validated in a simulated experiment on a skid-steering mobile robot.</p>
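A map-shaped reward of the kind described above can be sketched as goal progress minus a clearance penalty computed from the SLAM occupancy grid. The specific terms and weights below are illustrative assumptions, not the paper's reward function.

```python
import math

def shaped_reward(pos, prev_pos, goal, grid, cell=1.0,
                  clearance_w=0.5, collision_pen=-10.0):
    """Reward shaped by an occupancy grid (illustrative weights).

    grid[i][j] == 1 marks an occupied cell; pos/goal are (x, y) points.
    """
    def d(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    i, j = int(pos[1] // cell), int(pos[0] // cell)
    if grid[i][j] == 1:                      # entered an occupied cell
        return collision_pen

    # Positive when the step brings the robot closer to the goal.
    progress = d(prev_pos, goal) - d(pos, goal)

    # Penalize proximity to the nearest occupied cell center.
    occupied = [(c + 0.5, r + 0.5) for r, row in enumerate(grid)
                for c, v in enumerate(row) if v == 1]
    nearest = min((d(pos, o) for o in occupied), default=float("inf"))
    penalty = clearance_w / nearest if nearest < 2.0 else 0.0
    return progress - penalty

grid = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
r_free = shaped_reward((0.2, 0.2), (0.0, 0.0), (2.9, 2.9), grid)
r_hit = shaped_reward((1.5, 1.5), (1.0, 1.0), (2.9, 2.9), grid)
```

Without the map, the clearance term disappears and the agent discovers obstacles only through collision penalties, which is consistent with the slower convergence reported for the unshaped baseline.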


Author(s):  
Prases K. Mohanty ◽  
Dayal R. Parhi

In this article, a new optimal path planner for mobile robot navigation based on the invasive weed optimization (IWO) algorithm is presented. This ecologically inspired algorithm is based on the colonizing and distribution properties of weeds. A new fitness function is formulated between the robot, the goal, and the obstacles, satisfying the conditions of both obstacle avoidance and target-seeking behavior of the robot in its environment. Depending on the fitness value of each weed in the colony, the robot avoids obstacles and navigates towards the goal. The optimal path is generated by the developed algorithm as the robot reaches its destination. The effectiveness, feasibility, and robustness of the proposed navigational algorithm have been demonstrated through a series of simulation and experimental results. The results obtained from the proposed algorithm have also been compared with other intelligent algorithms (the bacteria foraging algorithm and the genetic algorithm) to show the adaptability of the developed navigational method. Finally, it is concluded that the proposed path planning algorithm can be effectively implemented in complex environments of various kinds.
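A fitness function combining target seeking and obstacle avoidance, as described above, can be sketched as a weighted sum: distance to the goal plus a penalty that activates inside a safety radius around each obstacle. The weights and the penalty shape are assumptions for illustration, not the paper's exact formulation.

```python
import math

def fitness(point, goal, obstacles, safe_radius=1.0,
            w_goal=1.0, w_obs=5.0):
    """Lower is better: goal distance plus clearance penalty (sketch)."""
    d_goal = math.hypot(point[0] - goal[0], point[1] - goal[1])
    penalty = 0.0
    for ox, oy in obstacles:
        d_obs = math.hypot(point[0] - ox, point[1] - oy)
        if d_obs < safe_radius:          # inside the safety margin
            penalty += safe_radius - d_obs
    return w_goal * d_goal + w_obs * penalty

# A candidate squeezing past an obstacle scores worse than a clear one.
risky = fitness((0.5, 0.0), (5.0, 0.0), [(0.6, 0.0)])
clear = fitness((0.5, 0.0), (5.0, 0.0), [(3.0, 3.0)])
```

In IWO, each weed (candidate point) would be scored with such a function, and fitter weeds produce more seeds, steering the colony toward safe, goal-directed points.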


Sensors, 2021, Vol 21 (3), pp. 796
Author(s):  
Xiaoqiang Yu ◽  
Ping Wang ◽  
Zexu Zhang

Path planning is an essential technology for a lunar rover to achieve safe and efficient autonomous exploration. This paper proposes a learning-based end-to-end path planning algorithm for lunar rovers with safety constraints. Firstly, a training environment integrating real lunar surface terrain data was built in the Gazebo simulation environment, and a lunar rover simulator was created in it to simulate the real lunar surface environment and the lunar rover system. Then, an end-to-end path planning algorithm based on deep reinforcement learning is designed, including the state space, action space, network structure, a reward function that considers slip behavior, and a training method based on proximal policy optimization. In addition, to improve generalization to different lunar surface topographies and environment scales, a variety of training scenarios were set up to train the network model using the idea of curriculum learning. The simulation results show that the proposed algorithm successfully achieves end-to-end path planning for the lunar rover, and the paths it generates carry a higher safety guarantee than those of classical path planning algorithms.
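The curriculum-learning idea of training on scenarios of increasing difficulty while carrying the network parameters forward can be sketched as a simple sequential loop. The scenario names and the toy `train_step` below are hypothetical; in the paper each stage would run PPO on the corresponding Gazebo scene.

```python
def curriculum_train(params, scenarios, train_step):
    """Train on scenarios from easiest to hardest, reusing parameters
    between stages (curriculum-learning sketch)."""
    order = []
    for scen in sorted(scenarios, key=lambda s: s["difficulty"]):
        params = train_step(params, scen)   # fine-tune on this stage
        order.append(scen["name"])
    return params, order

# Toy usage: the "training step" just records which stage touched the params.
scenarios = [
    {"name": "crater_field", "difficulty": 3},
    {"name": "flat_terrain", "difficulty": 1},
    {"name": "gentle_slopes", "difficulty": 2},
]
params, order = curriculum_train([], scenarios,
                                 lambda p, s: p + [s["name"]])
```

Because later stages start from an already competent policy, the hardest terrain never has to be learned from scratch, which is the mechanism the curriculum relies on.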


Author(s):  
Dayal R. Parhi ◽  
Animesh Chhotray

Purpose. This paper aims to generate an obstacle-free, real-time optimal path in a cluttered environment for a two-wheeled mobile robot (TWMR).
Design/methodology/approach. The TWMR resembles an inverted pendulum, with an intermediate body mounted on a robotic mobile platform whose two wheels are driven by separate DC motors. In this article, a novel motion planning strategy named the DAYANI arc contour intelligent technique is proposed for navigating the two-wheeled self-balancing robot in a global environment populated by obstacles. The new path planning algorithm evaluates the best next feasible point of motion on an arc contour using five weight functions corresponding to five separate navigational parameters.
Findings. The authenticity of the proposed navigational algorithm has been demonstrated by computing path length and travel time through a series of simulations and experimental verifications; the average percentage of error is found to be about 6%.
Practical implications. The robot dynamically stabilizes itself in its tall configuration, can spin on the spot, and can rove through obstacles with a small footprint. This broadens its areas of application to both indoor and outdoor environments, especially those with very narrow spaces, sharp turns, and inclined surfaces where its multi-wheeled counterparts struggle to perform.
Originality/value. A new obstacle avoidance and path planning algorithm based on incremental step advancement, evaluating the best next feasible point of motion, has been established and verified through both experiment and simulation.
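The arc-contour evaluation above can be sketched as sampling candidate points on an arc ahead of the robot and scoring each with a weighted cost. The two-term cost below (goal distance plus obstacle proximity) is a simplified stand-in for the paper's five weight functions, and all parameters are illustrative.

```python
import math

def next_point_on_arc(pos, heading, goal, obstacles, radius=0.5, n=11):
    """Pick the best next motion point from an arc contour (sketch).

    Candidates span +/-60 degrees about the current heading; the real
    method weighs five navigational parameters, only two are modeled here.
    """
    best, best_cost = None, float("inf")
    for k in range(n):
        ang = heading + math.radians(-60 + 120 * k / (n - 1))
        cand = (pos[0] + radius * math.cos(ang),
                pos[1] + radius * math.sin(ang))
        goal_cost = math.hypot(cand[0] - goal[0], cand[1] - goal[1])
        obs_cost = sum(1.0 / max(math.hypot(cand[0] - ox, cand[1] - oy), 1e-6)
                       for ox, oy in obstacles)
        cost = goal_cost + 0.2 * obs_cost
        if cost < best_cost:
            best, best_cost = cand, cost
    return best

# With no obstacles, the chosen point lies straight toward the goal.
p = next_point_on_arc((0.0, 0.0), 0.0, (5.0, 0.0), [])
```

Repeating this selection step by step yields the incremental advancement described in the abstract.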

