Walking Control of a Biped Robot on Static and Rotating Platforms Based on Hybrid Reinforcement Learning

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 148411-148424
Author(s):  
Ao Xi ◽  
Chao Chen
Sensors ◽  
2020 ◽  
Vol 20 (16) ◽  
pp. 4468
Author(s):  
Ao Xi ◽  
Chao Chen

In this work, we introduced a novel hybrid reinforcement learning scheme to balance a biped robot (NAO) on an oscillating platform, where the rotation of the platform is treated as an external disturbance to the robot. The platform had two rotational degrees of freedom, pitch and roll. The state space comprised the position of the center of pressure and the joint angles and joint velocities of both legs. The action space consisted of the joint angles of the ankles, knees, and hips. By incorporating inverse kinematics, the dimension of the action space was significantly reduced. Then, a model-based system estimator was employed during the offline training procedure to estimate the dynamics model of the system using novel hierarchical Gaussian processes and to provide initial control inputs, after which the reduced action space of each joint was obtained by minimizing the cost of reaching the desired stable state. Finally, a model-free optimizer based on DQN(λ) was introduced to fine-tune the initial control inputs, so that the optimal control inputs were obtained for each joint at any state. The proposed reinforcement learning scheme not only avoided the distribution mismatch problem, but also improved sample efficiency. Simulation results showed that the proposed hybrid reinforcement learning mechanism enabled the NAO robot to balance on a platform oscillating at different frequencies and magnitudes. Both control performance and robustness were guaranteed during the experiments.


2015 ◽  
Vol 7 (6) ◽  
pp. 449-452 ◽  
Author(s):  
Ahmad Ghanbari ◽  
Yasaman Vaghei ◽  
Sayyed Mohammad Reza Sayyed Noorani

Author(s):  
Maximilian Moll ◽  
Leonhard Kunczik

Abstract: In recent history, reinforcement learning (RL) has proved its capability on complex decision problems by mastering several games. Increased computational power and advances in approximation with neural networks (NNs) paved the way for RL’s successful applications. Even though RL can tackle more complex problems nowadays, it still relies on computational power and runtime. Quantum computing promises to address these issues through its capability to encode information and its potential quadratic speedup in runtime. We compare tabular Q-learning and Q-learning using either a quantum or a classical approximation architecture on the frozen lake problem. The three algorithms are analyzed in terms of iterations until convergence to the optimal behavior, memory usage, and runtime. NNs are used for approximation in the classical domain, while in the quantum domain variational quantum circuits, a quantum hybrid approximation method, are used. Our simulations show that a quantum approximator is beneficial in terms of memory usage and provides better sample complexity than NNs; however, it still lacks the computational speed to be competitive.
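
As a rough illustration of the tabular Q-learning baseline named in the abstract above, the following Python sketch trains a Q-table on a deterministic 4x4 grid in the style of the frozen lake problem. The grid layout, the non-slippery transitions, the hyperparameters, and the helper names (`step`, `train`, `greedy_rollout`) are all assumptions made for illustration. The abstract does not describe the authors' setup, and this is not their implementation.

```python
import random

# Assumed 4x4 layout in the style of the frozen lake problem:
# S = start, F = frozen (safe), H = hole (episode ends, reward 0), G = goal (reward 1).
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
N = 4
# Actions as (row delta, column delta): left, down, right, up.
ACTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def step(state, action):
    """Deterministic (non-slippery) transition: returns (next_state, reward, done)."""
    row, col = divmod(state, N)
    d_row, d_col = ACTIONS[action]
    row = min(max(row + d_row, 0), N - 1)  # moves off the grid leave the agent in place
    col = min(max(col + d_col, 0), N - 1)
    cell = GRID[row][col]
    return row * N + col, (1.0 if cell == "G" else 0.0), cell in "HG"


def train(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (hyperparameters are assumptions)."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N * N)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                # Greedy action, breaking ties at random so early episodes explore.
                best = max(q[state])
                action = rng.choice([a for a, v in enumerate(q[state]) if v == best])
            nxt, reward, done = step(state, action)
            # One-step Q-learning target: r + gamma * max_a' Q(s', a'), no bootstrap at terminal states.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_rollout(q, max_steps=100):
    """Follow the greedy policy from the start state; returns (visited states, final reward)."""
    state, path = 0, [0]
    for _ in range(max_steps):
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, reward, done = step(state, action)
        path.append(state)
        if done:
            return path, reward
    return path, 0.0


q_table = train()
path, reward = greedy_rollout(q_table)
print(path, reward)
```

The table holds one value per state-action pair (16 x 4 entries here), which is the memory cost that an approximation architecture, classical or quantum, aims to reduce.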


1997 ◽  
Vol 117 (10) ◽  
pp. 1227-1233 ◽  
Author(s):  
Sorao Kenji ◽  
Murakami Toshiyuki ◽  
Ohnishi Kouhei

2019 ◽  
Vol 9 (3) ◽  
pp. 502 ◽  
Author(s):  
Cristyan Gil ◽  
Hiram Calvo ◽  
Humberto Sossa

Programming robots to perform different activities requires calculating sequences of joint values while taking into account many factors, such as stability and efficiency, at the same time. Particularly for walking, state-of-the-art techniques for approximating these sequences are based on reinforcement learning (RL). In this work we propose a multi-level system in which the same RL method is used first to learn the configurations of robot joints (poses) that allow it to stand stably; then, at the second level, we find the sequence of poses that lets it reach the furthest distance in the shortest time while avoiding falls and keeping a straight path. To evaluate this, we measure the time it takes the robot to travel a certain distance. To our knowledge, this is the first work focusing on both speed and precision of the trajectory at the same time. We implement our model in a simulated environment using Q-learning. Compared with the built-in walking modes of a NAO robot, our model improves on the normal-speed mode and enhances robustness in the fast-speed mode. The proposed model can be extended to other tasks and is independent of a particular robot model.


2021 ◽  
Vol 5 (2) ◽  
pp. 505-510
Author(s):  
Jaehyun Yoo ◽  
Dohyun Jang ◽  
H. Jin Kim ◽  
Karl H. Johansson
