Reliability-Based Reinforcement Learning Under Uncertainty

Author(s):  
Zequn Wang ◽  
Narendra Patwardhan

Abstract: Despite numerous advances, reinforcement learning remains far from widespread acceptance for autonomous controller design compared to classical methods, owing to its limited ability to effectively tackle uncertainty. The reliance on an absolute or deterministic reward as the metric for the optimization process renders reinforcement learning highly susceptible to changes in problem dynamics. We introduce a novel framework that effectively quantifies the uncertainty in the design space and induces robustness in controllers by switching to a reliability-based optimization routine. A model-based approach is used to improve the data efficiency of the method while predicting the system dynamics. We prove the stability of the learned neuro-controllers in both static and dynamic environments on classical reinforcement learning tasks such as cart-pole balancing and the inverted pendulum.
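The core move described here, replacing a deterministic return with a reliability metric, can be sketched roughly as follows. The Monte Carlo interface, the Gaussian pole-mass perturbation, and the return threshold are illustrative assumptions, not the paper's actual procedure.

```python
import numpy as np

def reliability_objective(policy_return_fn, n_samples=500, target=195.0, seed=0):
    """Estimate a policy's reliability: P(return >= target) under sampled
    environment-parameter uncertainty (hypothetical interface)."""
    rng = np.random.default_rng(seed)
    # Assumed uncertain dynamics parameter, e.g. pole mass ~ N(0.1, 0.02).
    masses = rng.normal(0.1, 0.02, size=n_samples)
    returns = np.array([policy_return_fn(m) for m in masses])
    return float((returns >= target).mean())

# Toy surrogate: the policy's return degrades as the mass drifts from nominal.
rel = reliability_objective(lambda m: 200.0 - 400.0 * abs(m - 0.1))
```

Optimizing this probability, rather than the nominal return, is what makes the resulting controller insensitive to shifts in the problem dynamics.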

2013 ◽  
Vol 631-632 ◽  
pp. 1342-1347
Author(s):  
Xu Cao ◽  
Nian Feng Li ◽  
Hua Xun Zhang

Owing to its high-order, unstable, multivariable, nonlinear, and strongly coupled characteristics, robust stability is an important indicator of the inverted pendulum system. In this paper, a robust LQR controller for the inverted pendulum system is designed. The simulation and experimental results show that the stability of the robust LQR controller is better than that of the original LQR controller. When the system departs from equilibrium for any reason, it returns to the equilibrium state without depleting energy, and all state components approach equilibrium.
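A minimal LQR design for a linearized inverted pendulum on a cart can be sketched as below. The physical constants and weight matrices are illustrative tuning choices, not values from the paper.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Linearized cart-pole about the upright equilibrium; state = [x, x_dot,
# theta, theta_dot], input = horizontal force. Numbers are illustrative.
g, M, m, l = 9.81, 1.0, 0.1, 0.5
A = np.array([[0, 1, 0, 0],
              [0, 0, -m * g / M, 0],
              [0, 0, 0, 1],
              [0, 0, (M + m) * g / (M * l), 0]], dtype=float)
B = np.array([[0], [1 / M], [0], [-1 / (M * l)]], dtype=float)

Q = np.diag([10.0, 1.0, 100.0, 1.0])  # state weights (tuning choice)
R = np.array([[1.0]])                 # control-effort weight

P = solve_continuous_are(A, B, Q, R)  # algebraic Riccati equation
K = np.linalg.solve(R, B.T @ P)       # optimal gain, u = -K x

closed_loop = A - B @ K               # stabilized closed-loop dynamics
```

The robust variant in the paper additionally shapes the design so stability margins survive parameter deviations; the plain LQR above is only the baseline it improves on.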


Author(s):  
Tung-Long Vuong ◽  
Do-Van Nguyen ◽  
Tai-Long Nguyen ◽  
Cong-Minh Bui ◽  
Hai-Dang Kieu ◽  
...  

In multitask reinforcement learning, tasks often have sub-tasks that share the same solution, even though the overall tasks are different. If the shared portions could be effectively identified, the learning process could be improved, since all the samples between tasks in the shared space could be used. In this paper, we propose a Sharing Experience Framework (SEF) for the simultaneous training of multiple tasks. In SEF, a confidence-sharing agent uses task-specific rewards from the environment to identify similar parts that should be shared across tasks and defines those parts as shared regions between tasks. The shared regions are expected to guide task-policies in sharing their experience during the learning process. The experiments highlight that our framework improves the performance and stability of learning task-policies and can help task-policies avoid local optima.
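One simple way to picture a shared region is as the set of states where two tasks' value estimates agree. The tabular Q-functions and the agreement tolerance below are a hypothetical stand-in for the paper's confidence-sharing agent, not its actual mechanism.

```python
import numpy as np

def shared_region(q_a, q_b, state_ids, tol=0.1):
    """Sketch: mark a state as shared between two tasks when both tabular
    Q-functions prefer the same action with similar best-action values."""
    shared = []
    for s in state_ids:
        same_action = np.argmax(q_a[s]) == np.argmax(q_b[s])
        value_gap = abs(np.max(q_a[s]) - np.max(q_b[s]))
        if same_action and value_gap < tol:
            shared.append(s)
    return shared
```

Experience collected in such states could then be replayed by every task whose policy agrees there, which is the sample-reuse benefit the abstract describes.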


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-15 ◽  
Author(s):  
Jian Sun ◽  
Jie Li

The large scale, time variation, and diversification of physically coupled networked infrastructures such as power grids and transportation systems complicate their controller design, implementation, and expansion. To tackle these challenges, we propose an online distributed reinforcement learning control algorithm with a one-layer neural network for each subsystem (or agent) to adapt to variations in the networked infrastructure. Each controller includes a critic network and an action network, which approximate the strategy utility function and the desired control law, respectively. To avoid a large number of trials and to improve stability, the training of the action network introduces supervised learning mechanisms into the reduction of the long-term cost. The stability of the control system with the learning algorithm is analyzed, and upper bounds on the tracking error and the neural network weights are estimated. The effectiveness of the proposed controller is illustrated in simulation; the results indicate stability under communication delay and disturbances as well.
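The blend of a reinforcement signal with a supervised term in the action-network update might look as follows. The one-layer linear form, the teacher control law, and the blending weight are assumptions for illustration; the paper's exact update rule is not reproduced here.

```python
import numpy as np

def action_net_update(w, x, u_rl_grad, u_teacher, lr=0.01, beta=0.5):
    """Sketch (assumed form): one-layer action network u = w @ x, updated by a
    blend of the RL cost gradient and a supervised pull toward a known
    stabilizing control law, trading trial count for stability."""
    u = w @ x
    supervised_grad = np.outer(u - u_teacher, x)  # gradient of 0.5*||u - u_teacher||^2
    return w - lr * ((1 - beta) * u_rl_grad + beta * supervised_grad)
```

With beta near 1 early in training, the agent imitates the teacher and avoids destabilizing exploration; annealing beta toward 0 hands control back to the long-term-cost objective.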


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 471
Author(s):  
Jai Hoon Park ◽  
Kang Hoon Lee

Designing novel robots that can cope with a specific task is a challenging problem because of the enormous design space, which involves both morphological structures and control mechanisms. To this end, we present a computational method for automating the design of modular robots. Our method employs a genetic algorithm to evolve robotic structures as an outer optimization, and it applies a reinforcement learning algorithm to each candidate structure to train its behavior and evaluate its potential learning ability as an inner optimization. By evolving only the robotic structure and performing behavioral optimization with a separate training algorithm, the size of the search space is reduced significantly compared with evolving both the structure and the behavior simultaneously. Mutual dependence between evolution and learning is achieved by regarding the mean cumulative reward of a candidate structure in reinforcement learning as its fitness in the genetic algorithm. Therefore, our method searches for prospective robotic structures that can potentially lead to near-optimal behaviors if trained sufficiently. We demonstrate the usefulness of our method through several effective design results that were automatically generated in experiments with an actual modular robotics kit.
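The outer GA loop with an RL-derived fitness can be written as a short skeleton. All names here are illustrative: `train_and_score` stands for "train the candidate with RL and return its mean cumulative reward", which the paper uses as the fitness.

```python
import random

def evolve_structures(init_pop, mutate, train_and_score, generations=10, elite=2):
    """Skeleton of the described nested optimization: a GA evolves structures
    (outer loop) while each candidate's fitness is the score an RL trainer
    achieves on it (inner loop, hidden inside train_and_score)."""
    pop = list(init_pop)
    for _ in range(generations):
        ranked = sorted(pop, key=train_and_score, reverse=True)
        parents = ranked[:elite]                      # keep the best structures
        children = [mutate(random.choice(parents))    # mutate elites to refill
                    for _ in range(len(pop) - elite)]
        pop = parents + children
    return max(pop, key=train_and_score)
```

Because fitness is "reward after training", the search favors structures that are easy to learn good behavior on, not merely structures that score well untrained.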


Author(s):  
Gokhan Demirkiran ◽  
Ozcan Erdener ◽  
Onay Akpinar ◽  
Pelin Demirtas ◽  
M. Yagiz Arik ◽  
...  

2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Peter Morales ◽  
Rajmonda Sulo Caceres ◽  
Tina Eliassi-Rad

Abstract: Complex networks are often either too large for full exploration, partially accessible, or partially observed. Downstream learning tasks on these incomplete networks can produce low-quality results. In addition, reducing the incompleteness of the network can be costly and nontrivial. As a result, network discovery algorithms optimized for specific downstream learning tasks given resource collection constraints are of great interest. In this paper, we formulate the task-specific network discovery problem as a sequential decision-making problem. Our downstream task is selective harvesting, the optimal collection of vertices with a particular attribute. We propose a framework, called network actor critic (NAC), which learns a policy and notion of future reward in an offline setting via a deep reinforcement learning algorithm. The NAC paradigm utilizes a task-specific network embedding to reduce the state space complexity. A detailed comparative analysis of popular network embeddings is presented with respect to their role in supporting offline planning. Furthermore, a quantitative study is presented on various synthetic and real benchmarks using NAC and several baselines. We show that offline models of reward and network discovery policies lead to significantly improved performance when compared to competitive online discovery algorithms. Finally, we outline learning regimes where planning is critical in addressing sparse and changing reward signals.
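Selective harvesting as sequential decision-making reduces to: at each step, probe one frontier vertex of the observed subgraph. The greedy driver below is an illustrative baseline interface, with `score` standing in for the learned NAC policy; it is not the paper's algorithm.

```python
def harvest(graph, attr, start, budget, score):
    """Sketch of selective harvesting: repeatedly probe the boundary vertex a
    scoring policy ranks highest, counting vertices with the target attribute.
    graph: adjacency dict; attr: vertex -> 0/1 target attribute."""
    observed, frontier, collected = {start}, set(graph[start]), 0
    for _ in range(budget):
        if not frontier:
            break
        v = max(frontier, key=lambda u: score(u, observed))  # policy's pick
        frontier |= set(graph[v]) - observed - {v}           # reveal neighbors
        frontier.discard(v)
        observed.add(v)
        collected += attr[v]
    return collected
```

Swapping `score` from a hand-crafted heuristic to a value function learned offline over an embedding of the observed subgraph is exactly the substitution the NAC framework makes.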


2021 ◽  
Vol 54 (3-4) ◽  
pp. 417-428
Author(s):  
Yanyan Dai ◽  
KiDong Lee ◽  
SukGyu Lee

In real applications, the rotary inverted pendulum system is a basic model in nonlinear control. Without a deep understanding of control, it is difficult to control a rotary inverted pendulum platform using classic control engineering models, as shown in Section 2.1. Therefore, this paper controls the platform without classic control theory, by training and testing a reinforcement learning algorithm. Many recent achievements in reinforcement learning (RL) have become possible, but research on quickly testing high-frequency RL algorithms in a real hardware environment is lacking. In this paper, we propose a real-time hardware-in-the-loop (HIL) control system to train and test the deep reinforcement learning algorithm from simulation to real hardware implementation. The Double Deep Q-Network (DDQN) with prioritized experience replay, which requires no deep understanding of classical control engineering, is used to implement the agent. For the real experiment, to swing up the rotary inverted pendulum and make the pendulum move smoothly, we define 21 actions for swinging up and balancing the pendulum. Compared with the Deep Q-Network (DQN), the DDQN with prioritized experience replay removes the overestimation of the Q value and decreases the training time. Finally, this paper presents experimental results comparing classic control theory with different reinforcement learning algorithms.
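The Double DQN target, the piece that removes DQN's Q-value overestimation, is standard and can be shown compactly. The tabular arrays below are toy stand-ins for the networks; the decoupling of action *selection* (online net) from action *evaluation* (target net) is the actual mechanism.

```python
import numpy as np

def ddqn_target(q_online, q_target, next_states, rewards, dones, gamma=0.99):
    """Double DQN bootstrap target: the online network selects the next
    action, the target network evaluates it, avoiding the single-network
    max operator that biases plain DQN upward."""
    next_actions = np.argmax(q_online[next_states], axis=1)  # select
    next_values = q_target[next_states, next_actions]        # evaluate
    return rewards + gamma * (1.0 - dones) * next_values
```

Prioritized experience replay then samples transitions in proportion to their TD error against this target, which is what the paper credits for the reduced training time.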


Processes ◽  
2021 ◽  
Vol 9 (5) ◽  
pp. 823
Author(s):  
Wen-Jer Chang ◽  
Yu-Wei Lin ◽  
Yann-Horng Lin ◽  
Chin-Lin Pen ◽  
Ming-Hsuan Tsai

In many practical systems, stochastic behaviors usually occur and need to be considered in the controller design. To ensure system performance under the effect of stochastic behaviors, the required control effort may grow beyond the capacity of practical actuators. Therefore, the actuator saturation problem must also be considered in the controller design. The type-2 Takagi-Sugeno (T-S) fuzzy model can describe parameter uncertainties more completely than the type-1 T-S fuzzy model for a class of nonlinear systems. A fuzzy controller design method is proposed in this paper based on the Interval Type-2 (IT2) T-S fuzzy model for stochastic nonlinear systems subject to actuator saturation. The stability analysis and corresponding sufficient conditions for the IT2 T-S fuzzy model are developed using Lyapunov theory. By transforming the stability and control problem into a Linear Matrix Inequality (LMI) problem, the proposed fuzzy control problem can be solved by a convex optimization algorithm. Finally, a nonlinear ship steering system is considered in simulations to verify the feasibility and efficiency of the proposed fuzzy controller design method.
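The Lyapunov condition underlying such LMI formulations can be checked numerically for a single linear subsystem. The closed-loop matrix below is a toy stand-in for one fuzzy-rule subsystem, not from the paper; the full IT2 design couples many such conditions into one LMI problem.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Toy stand-in for one fuzzy-rule closed-loop subsystem A_cl = A - B K.
# Quadratic stability asks for P > 0 with A_cl^T P + P A_cl < 0, the
# building block that the paper's LMI conditions generalize.
A_cl = np.array([[0.0, 1.0],
                 [-2.0, -3.0]])
Q = np.eye(2)
# solve_continuous_lyapunov(a, q) solves a X + X a^T = q; with a = A_cl^T
# this yields P satisfying A_cl^T P + P A_cl = -Q.
P = solve_continuous_lyapunov(A_cl.T, -Q)
```

A positive definite solution P certifies that V(x) = x^T P x decays along trajectories, which is exactly what the sufficient conditions in the paper assert rule-by-rule, with extra terms for saturation and stochastic effects.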


2013 ◽  
Vol 2013 ◽  
pp. 1-15
Author(s):  
Wei Xu ◽  
Ke Zhao ◽  
Yatao Li ◽  
Peitao Cheng

This paper addresses a functional representation based on the event model. In the event model, the ontology is defined based on the theory of propositional logic to describe the connotation of the event, and the variant is defined based on the theories of domain relational calculus and set theory to express the variation range of the event, i.e., the alterable part of the event under the constraints of the ontology. Function is an important concept in conceptual design and has both connotation and extension. A functional representation is proposed based on the event model. The ontology of the event is used to describe the connotation of the function and to reflect its stability. The variant of the event is used to represent the extension and to capture the variability of the function. The extension of the function is its range of change under the constraints of the connotation. The proposed functional representation divides the function into an immutable part and an alterable part, facilitating the expansion of the design space. A functional reasoning model based on the event model is also put forward to support function reasoning on computers. Finally, a simple case validates the feasibility of the model.
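One possible reading of the immutable/alterable split is a record with a fixed ontology and a variant that bounds the admissible variation. The class, field names, and sample function below are purely illustrative, not the paper's formalism.

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    """Illustrative reading of the event model: an immutable ontology (the
    connotation, propositions that always hold) and a variant (the
    extension, the range over which the event may vary)."""
    ontology: frozenset                           # fixed propositions
    variant: dict = field(default_factory=dict)   # attribute -> allowed values

    def admits(self, instance: dict) -> bool:
        # An instance realizes the event if every varying attribute stays
        # inside the range the variant permits.
        return all(instance.get(k) in v for k, v in self.variant.items())

# A hypothetical "transfer material" function: the connotation is fixed,
# while the realization may vary in medium and direction.
transfer = Event(ontology=frozenset({"moves_material"}),
                 variant={"medium": {"belt", "screw"},
                          "direction": {"up", "down"}})
```

Keeping the ontology immutable while enumerating the variant is what lets a reasoner expand the design space without losing the function's identity.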

