A Method to Control Parameters of Evolutionary Algorithms by Using Reinforcement Learning

Author(s):  
Y Sakurai ◽  
K Takada ◽  
T Kawabe ◽  
S Tsuruta
2021 ◽  
Author(s):  
Shuzhen Luo ◽  
Ghaith Androwis ◽  
Sergei Adamovich ◽  
Erick Nunez ◽  
Hao Su ◽  
...  

Abstract Background: Few studies have systematically investigated robust controllers for lower limb rehabilitation exoskeletons (LLREs) that can safely and effectively assist users with a variety of neuromuscular disorders to walk with full autonomy. One of the key challenges in developing such a robust controller is handling different degrees of uncertain human–exoskeleton interaction forces from the patients. Consequently, conventional walking controllers are either patient-condition specific or involve tuning many control parameters, and can behave unreliably or even fail to maintain balance. Methods: We present a novel and robust controller for an LLRE based on a decoupled deep reinforcement learning framework with three independent networks, which aims to provide reliable walking assistance against varied and uncertain human–exoskeleton interaction forces. The exoskeleton controller is driven by a neural network control policy that acts on a stream of the LLRE’s proprioceptive signals, including joint kinematic states, and predicts real-time position control targets for the actuated joints. To handle uncertain human–exoskeleton interaction forces, the control policy is deliberately trained with an integrated human musculoskeletal model and realistic human–exoskeleton interaction forces. Two other neural networks are connected to the control policy network to predict the interaction forces and muscle coordination. To further increase the robustness of the control policy, we employ domain randomization during training that includes not only randomization of exoskeleton dynamics properties but, more importantly, randomization of human muscle strength to simulate the variability of the patient’s disability. Through this decoupled deep reinforcement learning framework, the trained LLRE controller is able to provide reliable walking assistance to humans with different degrees of neuromuscular disorders. 
Results and Conclusion: A universal, RL-based walking controller is trained and virtually tested on an LLRE system to verify its effectiveness and robustness in assisting users with different disabilities, such as passive muscles (quadriplegia), muscle weakness, or hemiplegic conditions. An ablation study demonstrates strong robustness of the control policy across large ranges of exoskeleton dynamic properties and various human–exoskeleton interaction forces. The decoupled network structure allows the LLRE control policy network to be isolated for testing and sim-to-real transfer, since it uses only the LLRE’s proprioceptive information (joint sensory state) as input. Furthermore, the controller is shown to handle different patient conditions without patient-specific control parameter tuning.
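The domain randomization described above can be sketched as resampling the simulation parameters at the start of each training episode. The parameter names and ranges below are illustrative assumptions, not values published in the abstract; the key idea from the paper is that both exoskeleton dynamics properties and human muscle strength are randomized, with low strength values mimicking severe impairment.

```python
import random


def sample_domain(rng):
    """Sample randomized simulation parameters for one training episode.

    All names and ranges here are assumptions for illustration; the paper
    randomizes exoskeleton dynamics and human muscle strength but does not
    publish exact bounds in this abstract.
    """
    return {
        # exoskeleton dynamics properties
        "link_mass_scale": rng.uniform(0.8, 1.2),
        "joint_friction": rng.uniform(0.0, 0.5),
        "motor_torque_scale": rng.uniform(0.9, 1.1),
        # human muscle strength, scaled to mimic impairment:
        # 0.0 ~ passive muscles (quadriplegia), 1.0 ~ healthy
        "muscle_strength_scale": rng.uniform(0.0, 1.0),
    }


def train(num_episodes, seed=0):
    """Outer training loop: resample the domain before every episode."""
    rng = random.Random(seed)
    domains = []
    for _ in range(num_episodes):
        params = sample_domain(rng)
        # a real implementation would reset the simulator with `params`,
        # roll out the control policy, and update the three networks here
        domains.append(params)
    return domains
```

Because a fresh domain is drawn every episode, the policy never overfits to one patient condition, which is what lets a single trained controller cover quadriplegic, weakened, and hemiplegic muscle profiles.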


Symmetry ◽  
2021 ◽  
Vol 14 (1) ◽  
pp. 31
Author(s):  
Jichang Ma ◽  
Hui Xie ◽  
Kang Song ◽  
Hao Liu

The path tracking control system is a crucial component of autonomous vehicles; it is challenging to realize accurate tracking control across a wide range of uncertain situations and dynamic environments, particularly when such control must perform as well as, or better than, human drivers. While many methods provide state-of-the-art tracking performance, they tend to rely on constant PID control parameters, calibrated by human experience, to improve tracking accuracy. A detailed analysis shows that fixed-gain PID controllers cannot efficiently reduce the lateral error under various conditions, such as complex trajectories and variable speed. In addition, intelligent driving vehicles are highly non-linear systems, and high-fidelity models are unavailable in most autonomous systems. As for model-based controllers (MPC or LQR), the complex modeling process may increase the computational burden. With that in mind, a self-optimizing path tracking controller structure based on reinforcement learning is proposed. For lateral control of the vehicle, a steering method based on the fusion of reinforcement learning and a traditional PID controller is designed to adapt to various tracking scenarios. According to the pre-defined path geometry and the real-time status of the vehicle, the interactive learning mechanism, based on an RL framework (actor–critic, a symmetric network structure), realizes online optimization of the PID control parameters to better deal with tracking error under complex trajectories and dynamic changes of vehicle model parameters. Adaptation to velocity changes was also considered in the tracking process. The proposed control approach was tested in different path tracking scenarios; both driving simulator platforms and on-site vehicle experiments verified the effectiveness of the proposed self-optimizing controller. 
The results show that the approach can adaptively change the PID weights to maintain the tracking error (simulation: within ±0.071 m; real vehicle: within ±0.272 m) and the standard deviation of steering wheel vibration (simulation: within ±0.04°; real vehicle: within ±80.69°); additionally, it can adapt to high-speed simulation scenarios (maximum speed above 100 km/h, average speed through curves of 63–76 km/h).
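The fusion of an actor–critic network with a traditional PID loop can be sketched as follows: the actor emits a bounded action that is mapped into PID gain ranges, the gains are pushed into the PID loop, and the PID output steers the vehicle. The gain bounds and the placeholder actor output below are assumptions for illustration; the paper does not publish its gain ranges, and the actor/critic update itself is omitted.

```python
class PID:
    """Discrete PID controller whose gains can be retuned online."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def set_gains(self, kp, ki, kd):
        # called every control cycle with the actor's latest output
        self.kp, self.ki, self.kd = kp, ki, kd

    def step(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def gains_from_action(action, bounds):
    """Map an actor output in [-1, 1]^3 onto PID gains within bounds.

    `bounds` is [(kp_lo, kp_hi), (ki_lo, ki_hi), (kd_lo, kd_hi)];
    these ranges are an assumption, not taken from the paper.
    """
    return [lo + (a + 1.0) / 2.0 * (hi - lo)
            for a, (lo, hi) in zip(action, bounds)]


# one control cycle with a placeholder actor output instead of a network
pid = PID(kp=1.0, ki=0.1, kd=0.05, dt=0.01)
bounds = [(0.0, 2.0), (0.0, 0.5), (0.0, 0.2)]
kp, ki, kd = gains_from_action([0.0, -1.0, 1.0], bounds)
pid.set_gains(kp, ki, kd)
steer = pid.step(error=0.1)  # lateral error in metres
```

Bounding the gains this way keeps the learned tuning inside a safe envelope, so even an untrained actor cannot command gains outside the calibrated range.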


2020 ◽  
Vol 17 (3) ◽  
pp. 172988142091374
Author(s):  
Alexander Fabisch ◽  
Malte Langosz ◽  
Frank Kirchner

Reinforcement learning and behavior optimization are becoming increasingly popular in robotics because the algorithms are now mature enough to tackle real problems in this domain. However, robust implementations of state-of-the-art algorithms are often not publicly available, and experiments are hard to reproduce because implementations are either closed source or remain at the stage of research code. Consequently, it is often infeasible to deploy these algorithms on robotic systems. BOLeRo closes this gap for policy search and evolutionary algorithms by delivering open-source implementations of behavior learning algorithms for robots. It is easy to integrate into robotic middleware, and it can be used to compare methods and develop prototypes in simulation.
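To make the policy-search/evolutionary setting concrete, the sketch below implements a minimal (1+1) evolution strategy optimizing a behavior parameter vector against a cost function. This is a generic illustration of the class of algorithm such a framework ships, not BOLeRo's actual API; the toy objective (reaching target joint angles) and all names are assumptions.

```python
import random


def evolution_strategy(cost, theta, sigma=0.5, decay=0.99, iters=200, seed=0):
    """Minimal (1+1) evolution strategy: perturb the current behavior
    parameters with Gaussian noise and keep the candidate only if it
    lowers the cost; the step size shrinks geometrically."""
    rng = random.Random(seed)
    best = cost(theta)
    for _ in range(iters):
        cand = [t + rng.gauss(0.0, sigma) for t in theta]
        c = cost(cand)
        if c < best:
            theta, best = cand, c
        sigma *= decay
    return theta, best


# toy behavior objective (an assumption for illustration):
# drive two joint angles toward a target posture
target = [0.5, -0.3]
cost = lambda th: sum((a - b) ** 2 for a, b in zip(th, target))
theta, best = evolution_strategy(cost, [0.0, 0.0])
```

In a behavior-learning framework the cost call would be replaced by a simulated rollout of the robot executing the parameterized behavior, which is exactly the part such libraries standardize so that different optimizers can be compared on the same environment.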

