A Stable Distributed Neural Controller for Physically Coupled Networked Discrete-Time System via Online Reinforcement Learning

Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-15 ◽  
Author(s):  
Jian Sun ◽  
Jie Li

The large scale, time-varying nature, and diversity of physically coupled networked infrastructures, such as power grids and transportation systems, complicate their controller design, implementation, and expansion. To tackle these challenges, we propose an online distributed reinforcement learning control algorithm with a one-layer neural network for each subsystem (agent), so that the controller can adapt to variations in the networked infrastructure. Each controller includes a critic network and an action network, which approximate the strategy utility function and the desired control law, respectively. To avoid a large number of trials and to improve stability, the training of the action network introduces a supervised learning mechanism into the reduction of long-term cost. The stability of the control system with the learning algorithm is analyzed, and upper bounds on the tracking error and the neural network weights are estimated. The effectiveness of the proposed controller is illustrated in simulation; the results also indicate stability under communication delay and disturbances.
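The critic/action structure described above can be sketched as a bare-bones actor-critic loop for a single agent. The plant dynamics, quadratic stage cost, network sizes, learning rates, and the weight-shrinking action update below are illustrative assumptions, not the authors' exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_action = 4, 2

# One-layer (linear-in-state) critic and action networks for one agent.
Wc = rng.normal(scale=0.1, size=n_state)              # critic weights
Wa = rng.normal(scale=0.1, size=(n_state, n_action))  # action weights
lr_c, lr_a, gamma = 0.01, 0.01, 0.95

def critic(x):            # approximate long-term cost (strategy utility)
    return Wc @ x

def actor(x):             # approximate control law
    return Wa.T @ x

def plant(x, u):          # placeholder subsystem dynamics (assumption)
    return 0.9 * x + 0.1 * np.pad(u, (0, n_state - n_action))

x = rng.normal(size=n_state)
for _ in range(200):
    u = actor(x)
    x_next = plant(x, u)
    cost = x @ x + u @ u                             # quadratic stage cost
    td = cost + gamma * critic(x_next) - critic(x)   # temporal-difference error
    Wc += lr_c * td * x                              # critic: TD(0) step
    # Action network: a crude weight-shrinking stand-in for the paper's
    # combined supervised + long-term-cost update.
    Wa -= lr_a * np.outer(x, u)
    x = x_next
```

In the paper each subsystem runs such a pair of networks and exchanges information over the physical coupling; here everything is collapsed into one agent for brevity.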

2018 ◽  
Vol 30 (7) ◽  
pp. 1983-2004 ◽  
Author(s):  
Yazhou Hu ◽  
Bailu Si

We propose a neural network model for reinforcement learning to control a robotic manipulator with unknown parameters and dead zones. The model is composed of three networks: the state of the robotic manipulator is predicted by the state network, the action policy is learned by the action network, and the performance index of the action policy is estimated by the critic network. The three networks work together to optimize the performance index under the reinforcement learning control scheme. The convergence of the learning methods is analyzed. Application of the proposed model to a simulated two-link robotic manipulator demonstrates its effectiveness and stability.
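The "state network" role can be illustrated by a one-hidden-layer model trained online to predict the next state from the current state and action. The toy dynamics, layer sizes, and plain SGD backpropagation are assumptions for illustration, not the paper's manipulator model:

```python
import numpy as np

# One-hidden-layer state-prediction network trained online (assumed sizes).
rng = np.random.default_rng(6)
n_s, n_a, n_h = 4, 2, 16
W_plant = rng.normal(scale=0.5, size=(n_s, n_s + n_a))  # hidden "true" plant
W1 = rng.normal(scale=0.3, size=(n_h, n_s + n_a))
W2 = rng.normal(scale=0.3, size=(n_s, n_h))
lr = 0.01

def true_step(s, a):      # placeholder manipulator dynamics (assumption)
    return 0.95 * s + 0.05 * np.tanh(W_plant @ np.concatenate([s, a]))

def predict(s, a):
    z = np.concatenate([s, a])
    h = np.tanh(W1 @ z)
    return W2 @ h, h, z

eval_set = [(rng.normal(size=n_s), rng.normal(size=n_a)) for _ in range(50)]
def eval_err():           # mean prediction error on a fixed evaluation set
    return np.mean([np.linalg.norm(predict(s, a)[0] - true_step(s, a))
                    for s, a in eval_set])

before = eval_err()
for _ in range(3000):     # online training on random transitions
    s, a = rng.normal(size=n_s), rng.normal(size=n_a)
    pred, h, z = predict(s, a)
    e = pred - true_step(s, a)                        # prediction error
    W2 -= lr * np.outer(e, h)                         # backprop, output layer
    W1 -= lr * np.outer((W2.T @ e) * (1 - h**2), z)   # backprop, hidden layer
after = eval_err()
```

In the paper's scheme, a model of this kind lets the action and critic networks be trained without querying the real manipulator at every step.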


2021 ◽  
pp. 002029402110211
Author(s):  
Tao Chen ◽  
Damin Cao ◽  
Jiaxin Yuan ◽  
Hui Yang

This paper proposes an observer-based adaptive neural network backstepping sliding mode controller to ensure the stability of switched fractional-order strict-feedback nonlinear systems in the presence of arbitrary switchings and unmeasured states. To avoid the “explosion of complexity” and obtain fractional derivatives of the virtual control functions continuously, fractional-order dynamic surface control (DSC) technology is introduced into the controller. An observer is used to estimate the states of the fractional-order systems, and sliding mode control is introduced to enhance robustness. The unknown nonlinear functions and uncertain disturbances are approximated by radial basis function neural networks (RBFNNs). The stability of the system is ensured by the constructed Lyapunov functions, and fractional adaptive laws are proposed to update the uncertain parameters. The proposed controller guarantees convergence of the tracking error, and all states of the closed-loop system remain bounded. Lastly, the feasibility of the proposed control method is demonstrated with two examples.
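The RBFNN approximation at the heart of such controllers can be sketched in a few lines: a weighted sum of Gaussian basis functions fits an unknown scalar nonlinearity under a gradient (LMS) adaptive law. The centers, width, target function, and the integer-order update are illustrative stand-ins for the paper's RBFNNs and fractional adaptive laws:

```python
import numpy as np

# RBF network f(x) ≈ W @ phi(x) with a gradient adaptive law (assumed setup).
centers = np.linspace(-2.0, 2.0, 9)   # basis centers covering the domain
width = 0.5

def phi(x):                            # Gaussian basis functions
    return np.exp(-(x - centers) ** 2 / (2 * width ** 2))

f_true = lambda x: np.sin(x) + 0.3 * x  # "unknown" nonlinearity (assumption)

W = np.zeros(9)
lr = 0.05
for x in np.random.default_rng(1).uniform(-2, 2, 2000):
    e = f_true(x) - W @ phi(x)         # approximation error
    W += lr * e * phi(x)               # adaptive weight update

err = abs(f_true(1.0) - W @ phi(1.0))  # residual error at a test point
```

In the paper the analogous update is fractional-order and driven by the tracking error through a Lyapunov argument rather than by direct function samples.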


2021 ◽  
Vol 10 (1) ◽  
pp. 21
Author(s):  
Omar Nassef ◽  
Toktam Mahmoodi ◽  
Foivos Michelinakis ◽  
Kashif Mahmood ◽  
Ahmed Elmokashfi

This paper presents a data-driven framework for performance optimisation of Narrow-Band IoT user equipment. The proposed framework is an edge micro-service that suggests one-time configurations to user equipment communicating with a base station. Suggested configurations are delivered by a Configuration Advocate to improve energy consumption, delay, throughput, or a combination of those metrics, depending on the user-end device and the application. Reinforcement learning utilising gradient descent and a genetic algorithm is adopted alongside machine and deep learning algorithms to predict the environmental states and suggest an optimal configuration. The results highlight the adaptability of the deep neural network in predicting intermediary environmental states, and they show the superior performance of the genetic reinforcement learning algorithm in performance optimisation.
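A genetic search over configurations can be sketched as select-crossover-mutate over a small population. The 3-parameter configuration vector and the throughput/energy/delay trade-off fitness below are toy assumptions; in the framework the fitness would come from the predicted environmental states:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy fitness: reward throughput, penalise energy and delay (assumption).
def fitness(cfg):
    throughput, energy, delay = cfg
    return throughput - 0.5 * energy - 0.3 * delay

def evolve(pop, generations=50, mut=0.1):
    for _ in range(generations):
        scores = np.array([fitness(c) for c in pop])
        parents = pop[np.argsort(scores)[-len(pop) // 2:]]       # top half
        pairs = rng.integers(0, len(parents), size=(len(pop), 2))
        pop = (parents[pairs[:, 0]] + parents[pairs[:, 1]]) / 2  # crossover
        pop += rng.normal(scale=mut, size=pop.shape)             # mutation
        pop = np.clip(pop, 0.0, 1.0)    # keep configurations in valid range
    return pop[np.argmax([fitness(c) for c in pop])]

best = evolve(rng.uniform(0, 1, size=(20, 3)))
```

Averaging-based crossover is one of several standard choices; the paper does not specify its operators, so this is only a generic instance of the technique.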


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3276
Author(s):  
Szymon Szczęsny ◽  
Damian Huderek ◽  
Łukasz Przyborowski

The paper describes the architecture of a Spiking Neural Network (SNN) for time-waveform analysis using edge computing. The network model is based on the principles of signal preprocessing in the diencephalon, using the tonic spiking and inhibition-induced spiking models typical of the thalamus area. The research focused on significantly reducing the complexity of the SNN algorithm by eliminating most synaptic connections and ensuring zero dispersion of weight values for connections between neuron layers. The paper describes a network mapping and learning algorithm in which the number of variables in the learning process depends linearly on the size of the patterns. The work included testing the stability of the accuracy parameter for various network sizes. The described approach exploits the ability of spiking neurons to process currents of less than 100 pA, typical of amperometric techniques. An example practical application is the analysis of vesicle fusion signals using an amperometric system based on Carbon NanoTube (CNT) sensors. The paper concludes with a discussion of the cost of implementing the network as a semiconductor structure.
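Tonic spiking is a standard regime of the Izhikevich neuron model; the snippet below simulates it with the textbook parameter set. Using this particular model, the input current, and the simulation length as stand-ins for the paper's thalamic neurons is an assumption:

```python
# Izhikevich neuron with standard "tonic spiking" parameters (a, b, c, d).
# Euler integration; input current and duration are illustrative choices.
a, b, c, d = 0.02, 0.2, -65.0, 6.0
v, u = -70.0, 0.2 * -70.0               # membrane potential and recovery
dt, I = 0.25, 14.0                      # step (ms), constant input current

spikes = 0
for _ in range(int(1000 / dt)):         # simulate 1 second
    v += dt * (0.04 * v * v + 5 * v + 140 - u + I)
    u += dt * a * (b * v - u)
    if v >= 30.0:                       # spike: reset membrane state
        v, u = c, u + d
        spikes += 1
```

A constant suprathreshold current yields a regular spike train, which is the behaviour the abstract attributes to the thalamic preprocessing stage.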


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Xiaoyi Long ◽  
Zheng He ◽  
Zhongyuan Wang

This paper suggests an online solution for the optimal tracking control of robotic systems based on a single-critic neural network (NN) reinforcement learning (RL) method. To this end, we rewrite the robotic system model in state-space form, which facilitates the optimal tracking control synthesis. To maintain the tracking response, a steady-state control is designed, and then an adaptive optimal tracking control is used to ensure that the tracking error converges in an optimal sense. To solve the obtained optimal control problem within the framework of adaptive dynamic programming (ADP), the command trajectory to be tracked and the modified tracking Hamilton-Jacobi-Bellman (HJB) equation are formulated. An online RL algorithm is then developed to address the HJB equation using a critic NN with an online learning law. Simulation results verify the effectiveness of the proposed method.
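The single-critic idea can be illustrated on a scalar linear-quadratic problem, where the converged critic weight has a known Riccati solution to check against. The plant, cost weights, and semi-gradient update below are illustrative assumptions, not the paper's robotic model:

```python
import numpy as np

# Single-critic ADP sketch: scalar plant dx/dt = a*x + b*u with cost
# q*x^2 + r*u^2. The critic V(x) = w*x^2 is tuned to shrink the HJB
# residual, and the control is read off the critic gradient:
#   u = -(1/2) r^{-1} b dV/dx = -b*w*x/r.
a_p, b_p, q, r = -1.0, 1.0, 1.0, 1.0
w, lr = 0.0, 0.1
rng = np.random.default_rng(3)

for _ in range(5000):
    x = rng.uniform(-1.0, 1.0)
    u = -b_p * w * x / r                             # control from critic
    hjb = q * x * x + r * u * u + 2 * w * x * (a_p * x + b_p * u)
    w -= lr * hjb * 2 * x * (a_p * x + b_p * u)      # semi-gradient step

# For these numbers the Riccati solution is w* = sqrt(2) - 1 ≈ 0.4142.
```

Because only a critic is stored and the control follows from its gradient, no separate action network is needed; this is the structural point of the single-critic formulation.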


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Ruiguo Liu ◽  
Xuehui Gao

A new neural network sliding mode control (NNSMC) scheme is proposed for a backlash-like hysteresis nonlinear system in this paper. Firstly, only one neural network is designed to estimate the unknown system states and the hysteresis section, instead of the multiscale neural networks of former research, which saves computation and simplifies the controller design. Secondly, a new NNSMC is proposed for the hysteresis nonlinearity that does not need tracking-error transformation. Finally, Lyapunov functions are adopted to guarantee that the identification and control strategies are semiglobally uniformly ultimately bounded (UUB). Simulations of two cases prove the effectiveness of the presented identification approach and the performance of the NNSMC.
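The sliding-mode part of such a controller can be sketched on second-order error dynamics, with a bounded matched disturbance standing in for the hysteresis term. The gains, boundary-layer width, and plant below are illustrative assumptions:

```python
import numpy as np

# Sliding-mode control sketch: e_ddot = u + d, with |d| < k (assumptions).
k, lam, dt = 5.0, 2.0, 0.001
x, dx, t = 1.0, 0.0, 0.0                # tracking error and its rate

for _ in range(int(5.0 / dt)):          # simulate 5 seconds
    s = dx + lam * x                    # sliding surface s = e_dot + lam*e
    u = -lam * dx - k * np.tanh(s / 0.05)   # smoothed switching control
    d = 0.5 * np.sin(2.0 * t)           # bounded disturbance (hysteresis
                                        # stand-in), magnitude below k
    dx += dt * (u + d)                  # plant integration (Euler)
    x += dt * dx
    t += dt
```

The `tanh` term replaces the discontinuous sign function to avoid chattering; once the state reaches the surface `s ≈ 0`, the error decays at rate `lam` despite the disturbance, because the switching gain `k` dominates its bound.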


Author(s):  
Chunyi Wu ◽  
Gaochao Xu ◽  
Yan Ding ◽  
Jia Zhao

Large-scale task processing based on cloud computing has become crucial to big data analysis and processing in recent years. Most previous work utilizes conventional methods and architectures designed for general-scale tasks to handle huge numbers of tasks, and is limited by issues of computing capability, data transmission, etc. Based on this argument, a fat-tree structure-based approach called LTDR (Large-scale Tasks processing using Deep network model and Reinforcement learning) is proposed in this work. Aiming to explore the optimal task allocation scheme, a virtual network mapping algorithm based on a deep convolutional neural network and [Formula: see text]-learning is presented herein. After feature extraction, we design and implement a policy network to make node mapping decisions. The link mapping scheme is attained by the designed distributed value-function-based reinforcement learning model. Eventually, tasks are allocated onto proper physical nodes and processed efficiently. Experimental results show that LTDR can significantly improve the utilization of physical resources and long-term revenue while satisfying task requirements in big data.
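A stripped-down value-function view of the node-mapping decision: keep a value per physical node, place each incoming task epsilon-greedily, and update the chosen node's value from the observed reward. The 4-node environment, load model, and reward are toy assumptions; the paper uses deep networks over extracted features rather than a table:

```python
import numpy as np

rng = np.random.default_rng(4)
n_nodes = 4
load = np.zeros(n_nodes)
V = np.zeros(n_nodes)         # value of placing the next task on each node
eps, lr = 0.1, 0.1

for _ in range(500):
    # Epsilon-greedy placement decision over physical nodes.
    node = rng.integers(n_nodes) if rng.random() < eps else int(np.argmax(V))
    load[node] += 1.0
    reward = -load[node]      # prefer lightly loaded nodes (assumption)
    V[node] += lr * (reward - V[node])   # move value toward observed reward
    load *= 0.9               # loads decay as tasks complete
```

Because placing a task on a node lowers that node's value, the greedy choice rotates across nodes and the allocation balances load, which is the intuition behind the revenue/utilization gains the abstract reports.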


2020 ◽  
Vol 39 (7) ◽  
pp. 856-892 ◽  
Author(s):  
Tingxiang Fan ◽  
Pinxin Long ◽  
Wenxi Liu ◽  
Jia Pan

Developing a safe and efficient collision-avoidance policy for multiple robots is challenging in decentralized scenarios where each robot generates its path with limited observation of other robots’ states and intentions. Prior distributed multi-robot collision-avoidance systems often require frequent inter-robot communication or agent-level features to plan a local collision-free action, which is not robust and is computationally prohibitive. In addition, the performance of these methods is not comparable with their centralized counterparts in practice. In this article, we present a decentralized sensor-level collision-avoidance policy for multi-robot systems, which shows promising results in practical applications. In particular, our policy directly maps raw sensor measurements to an agent’s steering commands in terms of the movement velocity. As a first step toward reducing the performance gap between decentralized and centralized methods, we present a multi-scenario multi-stage training framework to learn an optimal policy. The policy is trained over a large number of robots in rich, complex environments simultaneously using a policy-gradient-based reinforcement-learning algorithm. The learning algorithm is also integrated into a hybrid control framework to further improve the policy’s robustness and effectiveness. We validate the learned sensor-level collision-avoidance policy in a variety of simulated and real-world scenarios with thorough performance evaluations for large-scale multi-robot systems. The generalization of the learned policy is verified in a set of unseen scenarios, including the navigation of a group of heterogeneous robots and a large-scale scenario with 100 robots. Although the policy is trained using simulation data only, we have successfully deployed it on physical robots with shapes and dynamics that differ from the simulated agents, demonstrating the controller’s robustness against the simulation-to-real modeling error. Finally, we show that the collision-avoidance policy learned from multi-robot navigation tasks provides an excellent solution for safe and effective autonomous navigation for a single robot working in a dense real human crowd. Our learned policy enables a robot to make effective progress in a crowd without getting stuck. More importantly, the policy has been successfully deployed on different types of physical robot platforms without tedious parameter tuning. Videos are available at https://sites.google.com/view/hybridmrca .
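The sensor-to-velocity mapping trained by policy gradient can be sketched with a REINFORCE-style loop: a linear Gaussian policy maps a one-dimensional "sensor" reading (offset from the desired path) to a velocity command. The environment, reward, and baseline below are toy assumptions, not the paper's multi-robot simulator:

```python
import numpy as np

rng = np.random.default_rng(5)
theta = np.zeros(2)                     # policy weights [gain, bias]
lr, sigma = 0.002, 0.3                  # step size, exploration noise
baseline = 0.0                          # running-average return baseline

def rollout():
    obs, grads, rewards = 1.0, [], []   # robot starts offset from the path
    for _ in range(20):
        mean = theta[0] * obs + theta[1]
        act = mean + sigma * rng.normal()            # sample an action
        # Gradient of log-probability of the sampled Gaussian action.
        grads.append((act - mean) / sigma**2 * np.array([obs, 1.0]))
        obs = float(np.clip(obs + 0.1 * act, -5.0, 5.0))  # bounded toy world
        rewards.append(-obs**2)         # penalty for straying off the path
    return np.sum(grads, axis=0), sum(rewards)

for _ in range(300):
    g, ret = rollout()
    theta += lr * g * (ret - baseline)  # ascend the expected return
    baseline += 0.05 * (ret - baseline) # update the baseline
```

The paper's pipeline differs in scale (deep network over raw laser scans, many robots, multi-stage curriculum, PPO-style optimization), but the gradient estimator's shape — log-probability gradient weighted by advantage — is the same.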

