A Stable Distributed Neural Controller for Physically Coupled Networked Discrete-Time System via Online Reinforcement Learning

Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-15 ◽  
Author(s):  
Jian Sun ◽  
Jie Li

The large scale, time-varying nature, and diversity of physically coupled networked infrastructures, such as power grids and transportation systems, complicate their controller design, implementation, and expansion. To tackle these challenges, we propose an online distributed reinforcement learning control algorithm with a one-layer neural network for each subsystem (agent), so that the controller can adapt to variations in the networked infrastructure. Each controller includes a critic network and an action network, which approximate the strategy utility function and the desired control law, respectively. To avoid a large number of trials and to improve stability, the training of the action network introduces a supervised learning mechanism into the reduction of long-term cost. The stability of the control system with the learning algorithm is analyzed, and upper bounds on the tracking error and the neural network weights are estimated. The effectiveness of the proposed controller is illustrated in simulation; the results also indicate stability under communication delay and disturbances.
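The critic/action structure described above can be sketched as a bare-bones actor-critic loop for a single agent. The plant dynamics, quadratic stage cost, network sizes, learning rates, and the weight-shrinking action update below are illustrative assumptions, not the authors' exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_action = 4, 2

# One-layer (linear-in-state) critic and action networks for one agent.
Wc = rng.normal(scale=0.1, size=n_state)              # critic weights
Wa = rng.normal(scale=0.1, size=(n_state, n_action))  # action weights
lr_c, lr_a, gamma = 0.01, 0.01, 0.95

def critic(x):            # approximate long-term cost (strategy utility)
    return Wc @ x

def actor(x):             # approximate control law
    return Wa.T @ x

def plant(x, u):          # placeholder subsystem dynamics (assumption)
    return 0.9 * x + 0.1 * np.pad(u, (0, n_state - n_action))

x = rng.normal(size=n_state)
for _ in range(200):
    u = actor(x)
    x_next = plant(x, u)
    cost = x @ x + u @ u                             # quadratic stage cost
    td = cost + gamma * critic(x_next) - critic(x)   # temporal-difference error
    Wc += lr_c * td * x                              # critic: TD(0) step
    # Action network: a crude weight-shrinking stand-in for the paper's
    # combined supervised + long-term-cost update.
    Wa -= lr_a * np.outer(x, u)
    x = x_next
```

In the paper each subsystem runs such a pair of networks and exchanges information over the physical coupling; here everything is collapsed into one agent for brevity.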

2018 ◽  
Vol 30 (7) ◽  
pp. 1983-2004 ◽  
Author(s):  
Yazhou Hu ◽  
Bailu Si

We propose a neural network model for reinforcement learning to control a robotic manipulator with unknown parameters and dead zones. The model is composed of three networks: the state of the robotic manipulator is predicted by the state network, the action policy is learned by the action network, and the performance index of the action policy is estimated by the critic network. The three networks work together to optimize the performance index under the reinforcement learning control scheme. The convergence of the learning methods is analyzed. Application of the proposed model to a simulated two-link robotic manipulator demonstrates its effectiveness and stability.
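The "state network" role can be illustrated by a one-hidden-layer model trained online to predict the next state from the current state and action. The toy dynamics, layer sizes, and plain SGD backpropagation are assumptions for illustration, not the paper's manipulator model:

```python
import numpy as np

# One-hidden-layer state-prediction network trained online (assumed sizes).
rng = np.random.default_rng(6)
n_s, n_a, n_h = 4, 2, 16
W_plant = rng.normal(scale=0.5, size=(n_s, n_s + n_a))  # hidden "true" plant
W1 = rng.normal(scale=0.3, size=(n_h, n_s + n_a))
W2 = rng.normal(scale=0.3, size=(n_s, n_h))
lr = 0.01

def true_step(s, a):      # placeholder manipulator dynamics (assumption)
    return 0.95 * s + 0.05 * np.tanh(W_plant @ np.concatenate([s, a]))

def predict(s, a):
    z = np.concatenate([s, a])
    h = np.tanh(W1 @ z)
    return W2 @ h, h, z

eval_set = [(rng.normal(size=n_s), rng.normal(size=n_a)) for _ in range(50)]
def eval_err():           # mean prediction error on a fixed evaluation set
    return np.mean([np.linalg.norm(predict(s, a)[0] - true_step(s, a))
                    for s, a in eval_set])

before = eval_err()
for _ in range(3000):     # online training on random transitions
    s, a = rng.normal(size=n_s), rng.normal(size=n_a)
    pred, h, z = predict(s, a)
    e = pred - true_step(s, a)                        # prediction error
    W2 -= lr * np.outer(e, h)                         # backprop, output layer
    W1 -= lr * np.outer((W2.T @ e) * (1 - h**2), z)   # backprop, hidden layer
after = eval_err()
```

In the paper's scheme, a model of this kind lets the action and critic networks be trained without querying the real manipulator at every step.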


2021 ◽  
pp. 002029402110211
Author(s):  
Tao Chen ◽  
Damin Cao ◽  
Jiaxin Yuan ◽  
Hui Yang

This paper proposes an observer-based adaptive neural network backstepping sliding mode controller to ensure the stability of switched fractional-order strict-feedback nonlinear systems in the presence of arbitrary switchings and unmeasured states. To avoid the “explosion of complexity” and obtain fractional derivatives of the virtual control functions continuously, fractional-order dynamic surface control (DSC) technology is introduced into the controller. An observer is used to estimate the states of the fractional-order systems, and sliding mode control is introduced to enhance robustness. The unknown nonlinear functions and uncertain disturbances are approximated by radial basis function neural networks (RBFNNs). The stability of the system is ensured by the constructed Lyapunov functions, and fractional adaptive laws are proposed to update the uncertain parameters. The proposed controller guarantees convergence of the tracking error, and all states of the closed-loop system remain bounded. Lastly, the feasibility of the proposed control method is demonstrated with two examples.
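The RBFNN approximation at the heart of such controllers can be sketched in a few lines: a weighted sum of Gaussian basis functions fits an unknown scalar nonlinearity under a gradient (LMS) adaptive law. The centers, width, target function, and the integer-order update are illustrative stand-ins for the paper's RBFNNs and fractional adaptive laws:

```python
import numpy as np

# RBF network f(x) ≈ W @ phi(x) with a gradient adaptive law (assumed setup).
centers = np.linspace(-2.0, 2.0, 9)   # basis centers covering the domain
width = 0.5

def phi(x):                            # Gaussian basis functions
    return np.exp(-(x - centers) ** 2 / (2 * width ** 2))

f_true = lambda x: np.sin(x) + 0.3 * x  # "unknown" nonlinearity (assumption)

W = np.zeros(9)
lr = 0.05
for x in np.random.default_rng(1).uniform(-2, 2, 2000):
    e = f_true(x) - W @ phi(x)         # approximation error
    W += lr * e * phi(x)               # adaptive weight update

err = abs(f_true(1.0) - W @ phi(1.0))  # residual error at a test point
```

In the paper the analogous update is fractional-order and driven by the tracking error through a Lyapunov argument rather than by direct function samples.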


2021 ◽  
Vol 10 (1) ◽  
pp. 21
Author(s):  
Omar Nassef ◽  
Toktam Mahmoodi ◽  
Foivos Michelinakis ◽  
Kashif Mahmood ◽  
Ahmed Elmokashfi

This paper presents a data-driven framework for performance optimisation of Narrow-Band IoT user equipment. The proposed framework is an edge micro-service that suggests one-time configurations to user equipment communicating with a base station. Suggested configurations are delivered by a Configuration Advocate to improve energy consumption, delay, throughput, or a combination of those metrics, depending on the user-end device and the application. Reinforcement learning utilising gradient descent and a genetic algorithm is adopted alongside machine and deep learning algorithms to predict the environmental states and suggest an optimal configuration. The results highlight the adaptability of the deep neural network in predicting intermediary environmental states, and they show the superior performance of the genetic reinforcement learning algorithm in performance optimisation.
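A genetic search over configurations can be sketched as select-crossover-mutate over a small population. The 3-parameter configuration vector and the throughput/energy/delay trade-off fitness below are toy assumptions; in the framework the fitness would come from the predicted environmental states:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy fitness: reward throughput, penalise energy and delay (assumption).
def fitness(cfg):
    throughput, energy, delay = cfg
    return throughput - 0.5 * energy - 0.3 * delay

def evolve(pop, generations=50, mut=0.1):
    for _ in range(generations):
        scores = np.array([fitness(c) for c in pop])
        parents = pop[np.argsort(scores)[-len(pop) // 2:]]       # top half
        pairs = rng.integers(0, len(parents), size=(len(pop), 2))
        pop = (parents[pairs[:, 0]] + parents[pairs[:, 1]]) / 2  # crossover
        pop += rng.normal(scale=mut, size=pop.shape)             # mutation
        pop = np.clip(pop, 0.0, 1.0)    # keep configurations in valid range
    return pop[np.argmax([fitness(c) for c in pop])]

best = evolve(rng.uniform(0, 1, size=(20, 3)))
```

Averaging-based crossover is one of several standard choices; the paper does not specify its operators, so this is only a generic instance of the technique.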


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3276
Author(s):  
Szymon Szczęsny ◽  
Damian Huderek ◽  
Łukasz Przyborowski

The paper describes the architecture of a Spiking Neural Network (SNN) for time-waveform analysis using edge computing. The network model is based on the principles of signal preprocessing in the diencephalon, using the tonic spiking and inhibition-induced spiking models typical of the thalamus area. The research focused on significantly reducing the complexity of the SNN algorithm by eliminating most synaptic connections and ensuring zero dispersion of weight values for connections between neuron layers. The paper describes a network mapping and learning algorithm in which the number of variables in the learning process depends linearly on the size of the patterns. The work included testing the stability of the accuracy parameter for various network sizes. The described approach exploits the ability of spiking neurons to process currents of less than 100 pA, typical of amperometric techniques. An example practical application is the analysis of vesicle fusion signals using an amperometric system based on Carbon NanoTube (CNT) sensors. The paper concludes with a discussion of the cost of implementing the network as a semiconductor structure.
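Tonic spiking is a standard regime of the Izhikevich neuron model; the snippet below simulates it with the textbook parameter set. Using this particular model, the input current, and the simulation length as stand-ins for the paper's thalamic neurons is an assumption:

```python
# Izhikevich neuron with standard "tonic spiking" parameters (a, b, c, d).
# Euler integration; input current and duration are illustrative choices.
a, b, c, d = 0.02, 0.2, -65.0, 6.0
v, u = -70.0, 0.2 * -70.0               # membrane potential and recovery
dt, I = 0.25, 14.0                      # step (ms), constant input current

spikes = 0
for _ in range(int(1000 / dt)):         # simulate 1 second
    v += dt * (0.04 * v * v + 5 * v + 140 - u + I)
    u += dt * a * (b * v - u)
    if v >= 30.0:                       # spike: reset membrane state
        v, u = c, u + d
        spikes += 1
```

A constant suprathreshold current yields a regular spike train, which is the behaviour the abstract attributes to the thalamic preprocessing stage.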


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Xiaoyi Long ◽  
Zheng He ◽  
Zhongyuan Wang

This paper suggests an online solution for the optimal tracking control of robotic systems based on a single-critic neural network (NN) reinforcement learning (RL) method. To this end, we rewrite the robotic system model in state-space form, which facilitates the optimal tracking control synthesis. To maintain the tracking response, a steady-state control is designed, and then an adaptive optimal tracking control is used to ensure that the tracking error converges in an optimal sense. To solve the obtained optimal control problem within the framework of adaptive dynamic programming (ADP), the command trajectory to be tracked and the modified tracking Hamilton-Jacobi-Bellman (HJB) equation are formulated. An online RL algorithm is then developed to address the HJB equation using a critic NN with an online learning law. Simulation results verify the effectiveness of the proposed method.
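The single-critic idea can be illustrated on a scalar linear-quadratic problem, where the converged critic weight has a known Riccati solution to check against. The plant, cost weights, and semi-gradient update below are illustrative assumptions, not the paper's robotic model:

```python
import numpy as np

# Single-critic ADP sketch: scalar plant dx/dt = a*x + b*u with cost
# q*x^2 + r*u^2. The critic V(x) = w*x^2 is tuned to shrink the HJB
# residual, and the control is read off the critic gradient:
#   u = -(1/2) r^{-1} b dV/dx = -b*w*x/r.
a_p, b_p, q, r = -1.0, 1.0, 1.0, 1.0
w, lr = 0.0, 0.1
rng = np.random.default_rng(3)

for _ in range(5000):
    x = rng.uniform(-1.0, 1.0)
    u = -b_p * w * x / r                             # control from critic
    hjb = q * x * x + r * u * u + 2 * w * x * (a_p * x + b_p * u)
    w -= lr * hjb * 2 * x * (a_p * x + b_p * u)      # semi-gradient step

# For these numbers the Riccati solution is w* = sqrt(2) - 1 ≈ 0.4142.
```

Because only a critic is stored and the control follows from its gradient, no separate action network is needed; this is the structural point of the single-critic formulation.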


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Ruiguo Liu ◽  
Xuehui Gao

A new neural network sliding mode control (NNSMC) scheme is proposed for a backlash-like hysteresis nonlinear system in this paper. Firstly, only one neural network is designed to estimate the unknown system states and the hysteresis section, instead of the multiscale neural networks of former research, which saves computation and simplifies the controller design. Secondly, a new NNSMC is proposed for the hysteresis nonlinearity that does not need tracking-error transformation. Finally, Lyapunov functions are adopted to guarantee that the identification and control strategies are semiglobally uniformly ultimately bounded (UUB). Simulations of two cases prove the effectiveness of the presented identification approach and the performance of the NNSMC.
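The sliding-mode part of such a controller can be sketched on second-order error dynamics, with a bounded matched disturbance standing in for the hysteresis term. The gains, boundary-layer width, and plant below are illustrative assumptions:

```python
import numpy as np

# Sliding-mode control sketch: e_ddot = u + d, with |d| < k (assumptions).
k, lam, dt = 5.0, 2.0, 0.001
x, dx, t = 1.0, 0.0, 0.0                # tracking error and its rate

for _ in range(int(5.0 / dt)):          # simulate 5 seconds
    s = dx + lam * x                    # sliding surface s = e_dot + lam*e
    u = -lam * dx - k * np.tanh(s / 0.05)   # smoothed switching control
    d = 0.5 * np.sin(2.0 * t)           # bounded disturbance (hysteresis
                                        # stand-in), magnitude below k
    dx += dt * (u + d)                  # plant integration (Euler)
    x += dt * dx
    t += dt
```

The `tanh` term replaces the discontinuous sign function to avoid chattering; once the state reaches the surface `s ≈ 0`, the error decays at rate `lam` despite the disturbance, because the switching gain `k` dominates its bound.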


Author(s):  
Chunyi Wu ◽  
Gaochao Xu ◽  
Yan Ding ◽  
Jia Zhao

Large-scale task processing based on cloud computing has become crucial to big data analysis and processing in recent years. Most previous work utilizes conventional methods and architectures designed for general-scale tasks to handle huge numbers of tasks, and is limited by issues of computing capability, data transmission, etc. Based on this argument, a fat-tree structure-based approach called LTDR (Large-scale Tasks processing using Deep network model and Reinforcement learning) is proposed in this work. Aiming to explore the optimal task allocation scheme, a virtual network mapping algorithm based on a deep convolutional neural network and [Formula: see text]-learning is presented herein. After feature extraction, we design and implement a policy network to make node mapping decisions. The link mapping scheme is attained by the designed distributed value-function-based reinforcement learning model. Eventually, tasks are allocated onto proper physical nodes and processed efficiently. Experimental results show that LTDR can significantly improve the utilization of physical resources and long-term revenue while satisfying task requirements in big data.
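A stripped-down value-function view of the node-mapping decision: keep a value per physical node, place each incoming task epsilon-greedily, and update the chosen node's value from the observed reward. The 4-node environment, load model, and reward are toy assumptions; the paper uses deep networks over extracted features rather than a table:

```python
import numpy as np

rng = np.random.default_rng(4)
n_nodes = 4
load = np.zeros(n_nodes)
V = np.zeros(n_nodes)         # value of placing the next task on each node
eps, lr = 0.1, 0.1

for _ in range(500):
    # Epsilon-greedy placement decision over physical nodes.
    node = rng.integers(n_nodes) if rng.random() < eps else int(np.argmax(V))
    load[node] += 1.0
    reward = -load[node]      # prefer lightly loaded nodes (assumption)
    V[node] += lr * (reward - V[node])   # move value toward observed reward
    load *= 0.9               # loads decay as tasks complete
```

Because placing a task on a node lowers that node's value, the greedy choice rotates across nodes and the allocation balances load, which is the intuition behind the revenue/utilization gains the abstract reports.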


2020 ◽  
Vol 39 (7) ◽  
pp. 856-892 ◽  
Author(s):  
Tingxiang Fan ◽  
Pinxin Long ◽  
Wenxi Liu ◽  
Jia Pan

Developing a safe and efficient collision-avoidance policy for multiple robots is challenging in decentralized scenarios where each robot generates its path with limited observation of other robots’ states and intentions. Prior distributed multi-robot collision-avoidance systems often require frequent inter-robot communication or agent-level features to plan a local collision-free action, which is not robust and is computationally prohibitive. In addition, the performance of these methods is not comparable with their centralized counterparts in practice. In this article, we present a decentralized sensor-level collision-avoidance policy for multi-robot systems, which shows promising results in practical applications. In particular, our policy directly maps raw sensor measurements to an agent’s steering commands in terms of the movement velocity. As a first step toward reducing the performance gap between decentralized and centralized methods, we present a multi-scenario multi-stage training framework to learn an optimal policy. The policy is trained over a large number of robots in rich, complex environments simultaneously using a policy-gradient-based reinforcement-learning algorithm. The learning algorithm is also integrated into a hybrid control framework to further improve the policy’s robustness and effectiveness. We validate the learned sensor-level collision-avoidance policy in a variety of simulated and real-world scenarios with thorough performance evaluations for large-scale multi-robot systems. The generalization of the learned policy is verified in a set of unseen scenarios, including the navigation of a group of heterogeneous robots and a large-scale scenario with 100 robots. Although the policy is trained using simulation data only, we have successfully deployed it on physical robots with shapes and dynamics that differ from the simulated agents, demonstrating the controller’s robustness against the simulation-to-real modeling error. Finally, we show that the collision-avoidance policy learned from multi-robot navigation tasks provides an excellent solution for safe and effective autonomous navigation for a single robot working in a dense real human crowd. Our learned policy enables a robot to make effective progress in a crowd without getting stuck. More importantly, the policy has been successfully deployed on different types of physical robot platforms without tedious parameter tuning. Videos are available at https://sites.google.com/view/hybridmrca .
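The sensor-to-velocity mapping trained by policy gradient can be sketched with a REINFORCE-style loop: a linear Gaussian policy maps a one-dimensional "sensor" reading (offset from the desired path) to a velocity command. The environment, reward, and baseline below are toy assumptions, not the paper's multi-robot simulator:

```python
import numpy as np

rng = np.random.default_rng(5)
theta = np.zeros(2)                     # policy weights [gain, bias]
lr, sigma = 0.002, 0.3                  # step size, exploration noise
baseline = 0.0                          # running-average return baseline

def rollout():
    obs, grads, rewards = 1.0, [], []   # robot starts offset from the path
    for _ in range(20):
        mean = theta[0] * obs + theta[1]
        act = mean + sigma * rng.normal()            # sample an action
        # Gradient of log-probability of the sampled Gaussian action.
        grads.append((act - mean) / sigma**2 * np.array([obs, 1.0]))
        obs = float(np.clip(obs + 0.1 * act, -5.0, 5.0))  # bounded toy world
        rewards.append(-obs**2)         # penalty for straying off the path
    return np.sum(grads, axis=0), sum(rewards)

for _ in range(300):
    g, ret = rollout()
    theta += lr * g * (ret - baseline)  # ascend the expected return
    baseline += 0.05 * (ret - baseline) # update the baseline
```

The paper's pipeline differs in scale (deep network over raw laser scans, many robots, multi-stage curriculum, PPO-style optimization), but the gradient estimator's shape — log-probability gradient weighted by advantage — is the same.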

