Micro-LED backlight module by deep reinforcement learning and micro-macro-hybrid environment control agent

2021 ◽  
Author(s):  
Che-Hsuan Huang ◽  
Yu-Tang Cheng ◽  
Yung-Chi Tsao ◽  
Xinke Liu ◽  
Hao-Chung Kuo


2021 ◽  
Vol 11 (7) ◽  
pp. 3257 ◽
Author(s):  
Chen-Huan Pi ◽  
Wei-Yuan Ye ◽  
Stone Cheng

In this paper, a novel control strategy is presented for reinforcement learning with disturbance compensation to solve the problem of quadrotor positioning under external disturbance. The proposed control scheme applies a trained neural-network-based reinforcement learning agent to control the quadrotor, and its output is mapped directly to the four actuators in an end-to-end manner. The scheme constructs a disturbance observer to estimate the external forces exerted on the three axes of the quadrotor, such as wind gusts in an outdoor environment. By introducing a disturbance compensator into the neural network control agent, tracking accuracy and robustness increased significantly in both indoor and outdoor experiments. The experimental results indicate that the proposed control strategy is highly robust to external disturbances: compensation improved control accuracy and reduced positioning error by 75%. To the best of our knowledge, this study is the first to achieve quadrotor positioning control through low-level reinforcement learning by using a global positioning system in an outdoor environment.
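
A minimal sketch of the observer-plus-compensation idea described above, assuming a first-order disturbance observer and a generic `policy` callable standing in for the trained network (the gains, mass, and interfaces here are illustrative, not the authors'):

```python
import numpy as np

class DisturbanceObserver:
    """Estimates the external force on each axis from the model/measurement gap."""
    def __init__(self, mass=1.0, gain=5.0, dt=0.01):
        self.mass, self.gain, self.dt = mass, gain, dt
        self.d_hat = np.zeros(3)  # estimated disturbance force (x, y, z)

    def update(self, accel_measured, thrust_world):
        # Acceleration predicted by the nominal model (thrust plus gravity).
        accel_nominal = thrust_world / self.mass + np.array([0.0, 0.0, -9.81])
        residual = accel_measured - accel_nominal
        # Low-pass the residual so sensor noise does not pass straight through.
        self.d_hat += self.gain * (self.mass * residual - self.d_hat) * self.dt
        return self.d_hat

def control_step(policy, obs, observer, accel_measured, thrust_world):
    d_hat = observer.update(accel_measured, thrust_world)
    # Feed the disturbance estimate to the trained agent alongside the state,
    # so its end-to-end output (four rotor commands) can compensate for it.
    return policy(np.concatenate([obs, d_hat]))

# Stand-in for the trained network: any callable returning four commands.
policy = lambda x: np.clip(x[:4], 0.0, 1.0)
obs = np.zeros(12)                      # position, velocity, attitude, rates
observer = DisturbanceObserver()
cmds = control_step(policy, obs, observer,
                    accel_measured=np.array([0.1, 0.0, -9.7]),
                    thrust_world=np.array([0.0, 0.0, 9.9]))
```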


2021 ◽  
Vol 2021 ◽  
pp. 1-12 ◽
Author(s):  
Zhipeng Li ◽  
Xiumei Wei ◽  
Xuesong Jiang ◽  
Yewen Pang

Coordinating the many processes in the process industry is difficult. We build a multiagent, distributed, hierarchical intelligent control model for manufacturing systems that integrate multiple production units, based on multiagent system technology. The model organically combines multiple intelligent agent modules and physical entities to form an intelligent control system with well-defined functions; it consists of a system management agent, workshop control agents, and equipment agents. For the task assignment problem under this model, we use reinforcement learning to improve a genetic algorithm for multiagent task scheduling, and we use the standard task scheduling datasets in OR-Library for simulation experiments. Experimental results show that the improved algorithm outperforms the baseline genetic algorithm on these benchmarks.
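
As a toy illustration of the RL-improved genetic algorithm idea (our own construction, not the paper's method), a one-state Q-learner can adapt the mutation rate of a GA on a small two-machine scheduling instance, rewarding whichever rate improves the best makespan:

```python
import random

random.seed(0)
JOBS = [3, 7, 2, 8, 4, 6]          # toy processing times
MACHINES = 2
MUTATION_RATES = [0.05, 0.2, 0.5]  # actions available to the RL agent

def makespan(assign):              # assign[i] = machine of job i
    loads = [0] * MACHINES
    for job, m in zip(JOBS, assign):
        loads[m] += job
    return max(loads)

q = {r: 0.0 for r in MUTATION_RATES}  # one-state Q-table over mutation rates
pop = [[random.randrange(MACHINES) for _ in JOBS] for _ in range(20)]
best = min(makespan(p) for p in pop)

for gen in range(200):
    # Epsilon-greedy choice of the mutation rate to apply this generation.
    rate = (random.choice(MUTATION_RATES) if random.random() < 0.1
            else max(q, key=q.get))
    pop.sort(key=makespan)
    parents = pop[:10]
    children = []
    for _ in range(10):
        a, b = random.sample(parents, 2)
        cut = random.randrange(len(JOBS))
        child = a[:cut] + b[cut:]                    # one-point crossover
        child = [1 - g if random.random() < rate else g for g in child]
        children.append(child)
    pop = parents + children
    new_best = min(makespan(p) for p in pop)
    q[rate] += 0.1 * ((best - new_best) - q[rate])   # improvement as reward
    best = min(best, new_best)

print("best makespan:", best)
```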


2013 ◽  
Vol 2013 ◽  
pp. 1-10 ◽  
Author(s):  
Lun-Hui Xu ◽  
Xin-Hai Xia ◽  
Qiang Luo

The urban traffic self-adaptive control problem is dynamic and uncertain, so the states of the traffic environment are hard to observe. An efficient agent that controls a single intersection can be discovered automatically via multiagent reinforcement learning. However, in the majority of previous works on this approach, each agent needed perfectly observed information when interacting with the environment and learned individually, with little coordination. This study casts traffic self-adaptive control as a multiagent Markov game problem. The design employs a traffic signal control agent (TSCA) for each signalized intersection that coordinates with neighboring TSCAs. A mathematical model for the TSCAs' interaction is built on a nonzero-sum Markov game, which is applied to let the TSCAs learn how to cooperate. A multiagent Markov game reinforcement learning approach is constructed on the basis of single-agent Q-learning. This method lets each TSCA learn to update its Q-values under joint actions and imperfect information. The convergence of the proposed algorithm is analyzed theoretically. Simulation results show that the proposed method is convergent and effective in a realistic traffic self-adaptive control setting.
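
A schematic of the joint-action Q-update such a TSCA could perform (our sketch; the paper's exact state, action, and reward definitions differ), indexing the Q-table by the agent's own state, its own action, and the neighbours' joint action:

```python
from collections import defaultdict
import random

ACTIONS = (0, 1)          # e.g., extend the current phase / switch phase
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

class TSCA:
    """Traffic signal control agent learning over joint actions."""
    def __init__(self):
        self.Q = defaultdict(float)  # key: (state, own_action, neighbour_actions)

    def act(self, state, neighbour_actions):
        if random.random() < EPS:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.Q[(state, a, neighbour_actions)])

    def update(self, s, a, neigh, reward, s_next, neigh_next):
        # Q-learning backup conditioned on the neighbours' joint action.
        best_next = max(self.Q[(s_next, a2, neigh_next)] for a2 in ACTIONS)
        key = (s, a, neigh)
        self.Q[key] += ALPHA * (reward + GAMMA * best_next - self.Q[key])

agent = TSCA()
a = agent.act(state=3, neighbour_actions=(0, 1))
agent.update(3, a, (0, 1), reward=-4.0, s_next=2, neigh_next=(1, 1))
```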


2005 ◽  
Vol 17 (2) ◽  
pp. 335-359 ◽  
Author(s):  
Jun Morimoto ◽  
Kenji Doya

This letter proposes a new reinforcement learning (RL) paradigm that explicitly takes into account input disturbance as well as modeling errors. The use of environmental models in RL is quite popular, both for off-line learning using simulations and for online action planning. However, the difference between the model and the real environment can lead to unpredictable, and often unwanted, results. Based on the theory of H∞ control, we consider a differential game in which a “disturbing” agent tries to make the worst possible disturbance while a “control” agent tries to make the best control input. The problem is formulated as finding a min-max solution of a value function that takes into account both the amount of reward and the norm of the disturbance. We derive online learning algorithms for estimating the value function and for calculating the worst disturbance and the best control with reference to the value function. We tested the paradigm, which we call robust reinforcement learning (RRL), on the control task of an inverted pendulum. In the linear domain, the policy and the value function learned by the online algorithms coincided with those derived analytically from linear H∞ control theory. For a fully nonlinear swing-up task, RRL achieved robust performance under changes in the pendulum weight and friction, while a standard reinforcement learning algorithm could not deal with these changes. We also applied RRL to the cart-pole swing-up task, and a robust swing-up policy was acquired.
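
The min-max backup at the heart of this paradigm can be illustrated with a tabular toy (our construction, not the letter's pendulum setup): the control action maximizes, and the disturbance minimizes, a target that adds a disturbance-norm penalty η·w² to the reward, which keeps the worst-case disturbance bounded:

```python
import numpy as np

np.random.seed(0)
N, ALPHA, GAMMA, ETA = 21, 0.2, 0.95, 2.0
GOAL = N // 2
ACTIONS = (-1, 0, 1)      # control agent's choices
DISTURB = (-1, 0, 1)      # disturbing agent's choices
V = np.zeros(N)           # tabular value function

def transition(s, u, w):
    s2 = int(np.clip(s + u + w, 0, N - 1))
    return s2, -abs(s2 - GOAL)           # reward: stay near the target state

for _ in range(2000):
    s = np.random.randint(N)
    for _ in range(30):
        def backup(u, w):
            s2, r = transition(s, u, w)
            # Reward plus disturbance-norm penalty plus discounted value.
            return r + ETA * w * w + GAMMA * V[s2]
        # Control maximizes the worst case; disturbance minimizes the backup.
        u = max(ACTIONS, key=lambda a: min(backup(a, d) for d in DISTURB))
        w = min(DISTURB, key=lambda d: backup(u, d))
        V[s] += ALPHA * (backup(u, w) - V[s])   # min-max TD update
        s, _ = transition(s, u, w)
```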


Electronics ◽  
2020 ◽  
Vol 9 (6) ◽  
pp. 996 ◽
Author(s):  
Wooseok Song ◽  
Woong Hyun Suh ◽  
Chang Wook Ahn

This paper proposes a Deep Reinforcement Learning (DRL)-based training method for spellcaster units in StarCraft II, one of the most representative Real-Time Strategy (RTS) games. During combat in StarCraft II, micro-controlling the various combat units is crucial to winning. Among the many combat units, spellcasters are among the components that most strongly influence combat results. Despite their importance in combat, training methods for carefully controlling spellcasters have not been thoroughly considered in related studies because of their complexity. We therefore propose a training method for spellcaster units in StarCraft II using the A3C algorithm. The main idea is to train two Protoss spellcaster units under three newly designed minigames, each representing a unique spell usage scenario, to use ‘Force Field’ and ‘Psionic Storm’ effectively. The trained agents achieve winning rates of more than 85% in each scenario. We present a new training method for spellcaster units that addresses a limitation of StarCraft II AI research, and we expect that it can be used to train other advanced, tactical units by applying transfer learning in more complex minigame scenarios or full game maps.
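
A compact sketch of the A3C objective a worker would optimize (assumed PyTorch with a generic network; the observation encoding, action space, and dummy rollout are placeholders, not the paper's StarCraft II interface):

```python
import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    def __init__(self, obs_dim=32, n_actions=8):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.pi = nn.Linear(64, n_actions)   # action logits
        self.v = nn.Linear(64, 1)            # state value

    def forward(self, obs):
        h = self.body(obs)
        return self.pi(h), self.v(h).squeeze(-1)

def a3c_loss(net, obs, actions, returns, beta=0.01):
    logits, values = net(obs)
    dist = torch.distributions.Categorical(logits=logits)
    advantage = returns - values
    # Policy gradient weighted by the (detached) advantage.
    policy_loss = -(dist.log_prob(actions) * advantage.detach()).mean()
    value_loss = advantage.pow(2).mean()     # value regression toward returns
    entropy = dist.entropy().mean()          # bonus that encourages exploration
    return policy_loss + 0.5 * value_loss - beta * entropy

# Dummy rollout standing in for a 'Force Field' minigame trajectory.
net = ActorCritic()
obs = torch.randn(16, 32)
actions = torch.randint(0, 8, (16,))
returns = torch.randn(16)
a3c_loss(net, obs, actions, returns).backward()  # gradients flow into `net`
```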


2016 ◽  
Vol 28 (4) ◽  
pp. 371-381 ◽  
Author(s):  
Chao Lu ◽  
Jie Huang ◽  
Jianwei Gong

Reinforcement Learning (RL) has been proposed to deal with ramp control problems under dynamic traffic conditions; however, there is a lack of sufficient research on the behaviour and impacts of different learning parameters. This paper describes a ramp control agent based on the RL mechanism and thoroughly analyzes the influence of three learning parameters, namely the learning rate, the discount rate, and the action selection parameter, on algorithm performance. Two indices, for learning speed and convergence stability, were used to measure performance, and on this basis a series of simulation-based experiments were designed and conducted using a macroscopic traffic flow model. Simulation results showed that, compared with the discount rate, the learning rate and the action selection parameter had more remarkable impacts on performance. Based on this analysis, suggestions are provided on how to select parameter values that achieve superior performance.
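
To make the parameter roles concrete, here is a toy sweep (our stub environment, not the paper's macroscopic traffic model) over the learning rate and the epsilon-greedy action-selection parameter of a Q-learning ramp-metering agent, using episodes-to-threshold as a crude learning-speed index:

```python
import random

random.seed(1)

def run(alpha, gamma, eps, episodes=300):
    Q = [[0.0, 0.0] for _ in range(5)]  # 5 queue-length states, 2 metering actions
    first_hit = episodes                # learning-speed index: first "good" episode
    for ep in range(episodes):
        s, total = 2, 0.0
        for _ in range(30):
            if random.random() < eps:
                a = random.randrange(2)                 # explore
            else:
                a = 0 if Q[s][0] >= Q[s][1] else 1      # exploit
            # Stub dynamics: metering (a = 1) tends to shorten the on-ramp queue.
            drift = random.choice([-1, 0]) if a else random.choice([0, 1])
            s2 = max(0, min(4, s + drift))
            r = -s2                                     # penalize long queues
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s, total = s2, total + r
        if total > -20 and first_hit == episodes:
            first_hit = ep
    return first_hit

for alpha in (0.05, 0.5):
    for eps in (0.01, 0.3):
        print(f"alpha={alpha}, eps={eps}: first good episode {run(alpha, 0.9, eps)}")
```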

