A Foremost-Policy Reinforcement Learning Based ART2 Neural Network and Its Learning Algorithm

Author(s):  
Jian Fan ◽  
Gengfeng Wu

2021 ◽
Vol 10 (1) ◽  
pp. 21
Author(s):  
Omar Nassef ◽  
Toktam Mahmoodi ◽  
Foivos Michelinakis ◽  
Kashif Mahmood ◽  
Ahmed Elmokashfi

This paper presents a data-driven framework for performance optimisation of Narrow-Band IoT (NB-IoT) user equipment. The proposed framework is an edge micro-service that suggests one-time configurations to user equipment communicating with a base station. Suggested configurations are delivered by a Configuration Advocate to improve energy consumption, delay, throughput, or a combination of those metrics, depending on the user-end device and the application. Reinforcement learning utilising gradient descent and a genetic algorithm is adopted alongside machine and deep learning algorithms to predict the environmental states and suggest an optimal configuration. The results highlight the adaptability of the deep neural network in predicting intermediary environmental states; they also show the superior performance of the genetic reinforcement learning algorithm at performance optimisation.
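The genetic search the abstract describes can be sketched as follows. This is a minimal, illustrative implementation only: the configuration fields (`tx_power`, `repetitions`, `bandwidth`), their ranges, and the analytic fitness model standing in for the framework's learned predictors are all assumptions, not the paper's actual parameter set.

```python
import random

random.seed(7)  # reproducible toy run

def fitness(cfg, w_energy=0.5, w_delay=0.3, w_tput=0.2):
    # Toy stand-in for the learned state predictors, so the example is
    # self-contained: penalise energy and delay, reward throughput.
    energy = cfg["tx_power"] * cfg["repetitions"]
    delay = cfg["repetitions"] / cfg["bandwidth"]
    throughput = cfg["bandwidth"] / cfg["repetitions"]
    return -w_energy * energy - w_delay * delay + w_tput * throughput

def random_cfg():
    # Hypothetical NB-IoT-style knobs with illustrative value ranges.
    return {"tx_power": random.randint(1, 23),
            "repetitions": random.choice([1, 2, 4, 8]),
            "bandwidth": random.choice([15, 45, 90, 180])}

def evolve(generations=30, pop_size=20, mut_rate=0.2):
    pop = [random_cfg() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # selection
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = random.sample(survivors, 2)
            child = {k: random.choice([a[k], b[k]]) for k in a}  # crossover
            if random.random() < mut_rate:        # mutate one field
                key = random.choice(list(child))
                child[key] = random_cfg()[key]
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
```

In the paper's setting the fitness weights would be chosen per device and application, which is how one suggestion can trade energy against delay or throughput.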


1994 ◽  
Vol 6 (2) ◽  
pp. 215-219 ◽  
Author(s):  
Gerald Tesauro

TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results, based on the TD(λ) reinforcement learning algorithm (Sutton 1988). Despite starting from random initial weights (and hence random initial strategy), TD-Gammon achieves a surprisingly strong level of play. With zero knowledge built in at the start of learning (i.e., given only a “raw” description of the board state), the network learns to play at a strong intermediate level. Furthermore, when a set of hand-crafted features is added to the network's input representation, the result is a truly staggering level of performance: the latest version of TD-Gammon is now estimated to play at a strong master level that is extremely close to the world's best human players.
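The TD(λ) rule underlying TD-Gammon can be shown in its simplest tabular form. TD-Gammon applies this same update through a neural network's gradients rather than a lookup table; the toy three-state chain below is purely illustrative.

```python
def td_lambda(episodes, n_states, alpha=0.1, gamma=1.0, lam=0.7):
    """Tabular TD(lambda) with accumulating eligibility traces (Sutton 1988)."""
    V = [0.0] * n_states
    for episode in episodes:            # episode: list of (state, reward, next_state)
        e = [0.0] * n_states            # eligibility traces, reset each episode
        for (s, r, s_next) in episode:
            v_next = V[s_next] if s_next is not None else 0.0
            delta = r + gamma * v_next - V[s]   # TD error
            e[s] += 1.0                         # mark current state as eligible
            for i in range(n_states):
                V[i] += alpha * delta * e[i]    # credit all eligible states
                e[i] *= gamma * lam             # decay traces
    return V

# Toy chain 0 -> 1 -> terminal, reward 1 on the final step; after repeated
# play both predecessor states learn a value near the true return of 1.
episode = [(0, 0.0, 1), (1, 1.0, None)]
V = td_lambda([episode] * 100, n_states=3)
```

The trace decay `gamma * lam` is what lets the reward on the final move propagate credit backward to earlier positions, which is essential when the only nonzero reward arrives at the end of a game.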


2020 ◽  
Vol 2020 (4) ◽  
pp. 43-54
Author(s):  
S.V. Khoroshylov ◽  
M.O. Redka

The aim of the article is to approximate optimal relative control of an underactuated spacecraft using reinforcement learning and to study the influence of various factors on the quality of such a solution. In the course of this study, methods of theoretical mechanics, control theory, stability theory, machine learning, and computer modeling were used. The problem of in-plane spacecraft relative control using only control actions applied tangentially to the orbit is considered. This approach makes it possible to reduce the propellant consumption of reactive actuators and to simplify the architecture of the control system. However, in some cases, methods of the classical control theory do not allow one to obtain acceptable results. In this regard, the possibility of solving this problem by reinforcement learning methods has been investigated, which allows designers to find control algorithms close to optimal ones as a result of interactions of the control system with the plant using a reinforcement signal characterizing the quality of control actions. The well-known quadratic criterion is used as a reinforcement signal, which makes it possible to take into account both the accuracy requirements and the control costs. A search for control actions based on reinforcement learning is made using the policy iteration algorithm. This algorithm is implemented using the actor–critic architecture. Various representations of the actor for control law implementation and the critic for obtaining value function estimates using neural network approximators are considered. It is shown that the optimal control approximation accuracy depends on a number of features, namely, an appropriate structure of the approximators, the neural network parameter updating method, and the learning algorithm parameters. The investigated approach makes it possible to solve the considered class of control problems for controllers of different structures. Moreover, the approach allows the control system to refine its control algorithms during the spacecraft operation.
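The actor–critic policy iteration with a quadratic reinforcement signal can be sketched on a scalar linear plant. This is a hedged illustration, not the article's spacecraft model: the critic fits a quadratic value V(x) = p·x² by temporal differences (policy evaluation), and the actor is a linear gain u = −k·x improved greedily against that critic (policy improvement). Plant constants and learning rates are arbitrary choices.

```python
def policy_iteration(a=0.9, b=0.5, q=1.0, ru=0.1,
                     iters=50, episodes=20, steps=20, alpha=0.05):
    """Actor-critic policy iteration for x_{k+1} = a*x + b*u with
    quadratic stage cost q*x^2 + ru*u^2 (illustrative scalar plant)."""
    p, k = 0.0, 0.0                      # critic parameter, actor gain
    for _ in range(iters):
        # Critic: evaluate the current gain k by TD on simulated rollouts.
        for _ in range(episodes):
            x = 1.0
            for _ in range(steps):
                u = -k * x
                x_next = a * x + b * u
                cost = q * x * x + ru * u * u          # quadratic reinforcement
                delta = cost + p * x_next * x_next - p * x * x
                p += alpha * delta * x * x             # gradient of V w.r.t. p
                x = x_next
        # Actor: greedy improvement of the linear gain against the critic.
        k = (a * b * p) / (ru + b * b * p)
    return p, k

p, k = policy_iteration()
```

For any positive critic estimate p the improved closed loop a − b·k is stable here, which mirrors the article's point that iterating evaluation and improvement drives the controller toward the optimal (LQR-like) gain.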


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Yulong Luo

The gas turbine is widely used because of its fast start-up and shutdown, low pollution, and high thermal efficiency. However, its high-temperature, high-pressure, high-speed working environment makes it prone to failure, and traditional intelligent gas-path fault diagnosis schemes suffer from poor control performance and low scheduling accuracy. This study investigates the application of neural networks and a reinforcement learning algorithm to intelligent gas-path fault diagnosis of the gas turbine. Accurate control of fault-diagnosis planning is achieved across gas-path fault diagnosis, daily maintenance, service-condition monitoring, power utilization rate, and other aspects of the gas turbine. The reinforcement learning model enables intelligent diagnosis and recording of gas-path faults, supporting diversified analysis and intelligent diagnosis schemes. Through neural network algorithms and deep learning technology, whole-process monitoring of the gas turbine is realized, reducing its failure rate during operation. The experimental results show that, compared with the thermal fault diagnosis method and the thermal-imaging fault diagnosis method of the electric percussion drill, the reinforcement-learning-based gas-path fault diagnosis model can quantify, convert, and process data during real-time transmission, offering higher control accuracy and faster response speed and effectively improving diagnostic efficiency and accuracy.
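The supervised diagnosis component the abstract mentions can be illustrated with a minimal classifier. This sketch is an assumption-laden toy, not the study's model: it trains a logistic regression on synthetic gas-path sensor features (temperature and pressure ratios), where faults are assumed to shift both ratios upward.

```python
import math
import random

random.seed(0)  # deterministic synthetic data

def sample(faulty):
    # Hypothetical sensor readings: a fault shifts both ratios by +0.3.
    base = 1.3 if faulty else 1.0
    return [base + random.gauss(0, 0.05), base + random.gauss(0, 0.05)]

data = ([(sample(False), 0) for _ in range(100)] +
        [(sample(True), 1) for _ in range(100)])
random.shuffle(data)

# Train a two-feature logistic regression by stochastic gradient descent.
w, b, lr = [0.0, 0.0], 0.0, 0.5
for _ in range(200):
    for x, y in data:
        z = w[0] * x[0] + w[1] * x[1] + b
        p = 1.0 / (1.0 + math.exp(-z))   # sigmoid probability of "fault"
        g = p - y                        # gradient of the log loss
        w[0] -= lr * g * x[0]
        w[1] -= lr * g * x[1]
        b -= lr * g

def predict(x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

accuracy = sum(predict(x) == y for x, y in data) / len(data)
```

A deep network, as used in the study, replaces the single linear layer but follows the same train-on-labeled-fault-data pattern.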

