A MULTI-AGENT APPROACH TO POMDPS USING OFF-POLICY REINFORCEMENT LEARNING AND GENETIC ALGORITHMS

International Journal of Computing ◽

10.47839/ijc.19.3.1887 ◽

2020 ◽

pp. 377-386

Author(s):

Samuel Obadan ◽

Zenghui Wang

Keyword(s):

Genetic Algorithm ◽

Reinforcement Learning ◽

Learning Algorithm ◽

Feedforward Neural Networks ◽

Ground Truth ◽

Estimation Accuracy ◽

Offline Learning ◽

Markov Decision ◽

Multi Agent ◽

The Impact

This paper introduces novel concepts for accelerating learning in an off-policy reinforcement learning algorithm for Partially Observable Markov Decision Processes (POMDPs) by leveraging a multi-agent framework. Reinforcement learning (RL) is an elegant but considerably slow approach to learning in an unknown environment. Although action-value learning (Q-learning) is faster than state-value learning, the rate of convergence to an optimal policy or maximum cumulative reward remains a constraint. Consequently, in an attempt to optimize the learning phase of an RL problem within a POMDP environment, we present two multi-agent learning paradigms: multi-agent off-policy reinforcement learning and an ingenious genetic algorithm (GA) approach for multi-agent offline learning using feedforward neural networks. At the end of training (episodes for reinforcement learning, epochs for the genetic algorithm), we compare the convergence rates of both algorithms with respect to creating the underlying MDPs for POMDP problems. Finally, we demonstrate the impact of layered resampling of a Monte Carlo particle filter on improving belief state estimation accuracy with respect to ground truth within POMDP domains. Initial empirical results suggest practicable solutions.
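As a point of reference for the off-policy action-value learning mentioned in the abstract above, the following is a minimal sketch of a single tabular Q-learning update. It is not the authors' multi-agent method; the function name `q_update`, the state and action labels, and the values of `alpha` and `gamma` are assumptions chosen for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Off-policy: the target bootstraps from the greedy action in next_state,
    regardless of which action the behavior policy will actually take.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem; all Q-values start at zero.
q = defaultdict(float)
actions = ["left", "right"]
new_value = q_update(q, "s0", "right", 1.0, "s1", actions)
```

With all values initialized to zero, the first update moves Q(s0, right) a fraction `alpha` of the way toward the observed reward.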

Computational Design of Modular Robots Based on Genetic Algorithm and Reinforcement Learning

Symmetry ◽

10.3390/sym13030471 ◽

2021 ◽

Vol 13 (3) ◽

pp. 471

Author(s):

Jai Hoon Park ◽

Kang Hoon Lee

Keyword(s):

Genetic Algorithm ◽

Reinforcement Learning ◽

Design Space ◽

Learning Algorithm ◽

Computational Design ◽

Computational Method ◽

Learning Ability ◽

Modular Robots ◽

Control Mechanisms ◽

Candidate Structure

Designing novel robots that can cope with a specific task is a challenging problem because of the enormous design space that involves both morphological structures and control mechanisms. To this end, we present a computational method for automating the design of modular robots. Our method employs a genetic algorithm to evolve robotic structures as an outer optimization, and it applies a reinforcement learning algorithm to each candidate structure to train its behavior and evaluate its potential learning ability as an inner optimization. Evolving only the robotic structure and optimizing behavior with a separate training algorithm reduces the size of the design space significantly compared with evolving structure and behavior simultaneously. Mutual dependence between evolution and learning is achieved by regarding the mean cumulative reward of a candidate structure in reinforcement learning as its fitness in the genetic algorithm. Therefore, our method searches for prospective robotic structures that can potentially lead to near-optimal behaviors if trained sufficiently. We demonstrate the usefulness of our method through several effective design results that were automatically generated in the process of experimenting with an actual modular robotics kit.
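The outer/inner structure described in the abstract above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: a structure is a hypothetical bit list, `mean_cumulative_reward` is a stand-in for the inner reinforcement learning run, and the selection, crossover, and mutation operators are generic choices.

```python
import random


def mean_cumulative_reward(structure, rng, episodes=5):
    """Hypothetical stand-in for the inner RL loop.

    A real system would train a controller for this structure and average
    its episode returns; here the score is the bit count plus small noise.
    """
    returns = [sum(structure) + rng.gauss(0.0, 0.1) for _ in range(episodes)]
    return sum(returns) / episodes


def evolve(pop_size=8, genome_len=6, generations=10, seed=0):
    """Outer genetic algorithm whose fitness is the inner mean reward."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness of each candidate = mean cumulative reward of the inner loop.
        scored = sorted(population,
                        key=lambda s: mean_cumulative_reward(s, rng),
                        reverse=True)
        parents = scored[: pop_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)  # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(genome_len)
            child[i] = 1 - child[i]  # single-bit mutation
            children.append(child)
        population = parents + children
    return max(population, key=sum)


best = evolve()
```

The key design point is that the genetic algorithm never evolves behavior directly; it only sees each structure through the score returned by the inner loop.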

A multi-agent reinforcement learning algorithm with fuzzy approximation for Distributed Stochastic Unit Commitment

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-182879 ◽

2019 ◽

Vol 37 (5) ◽

pp. 6613-6628

Author(s):

Ghorbani Farzaneh ◽

Afsharchi Mohsen ◽

Derhami Vali

Keyword(s):

Reinforcement Learning ◽

Unit Commitment ◽

Learning Algorithm ◽

Fuzzy Approximation ◽

Multi Agent ◽

Stochastic Unit Commitment ◽

Reinforcement Learning Algorithm

Improvement on Supporting Machine Learning Algorithm for Solving Problem in Immediate Decision Making

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.566.572 ◽

2012 ◽

Vol 566 ◽

pp. 572-579

Author(s):

Abdolkarim Niazi ◽

Norizah Redzuan ◽

Raja Ishak Raja Hamzah ◽

Sara Esfandiari

Keyword(s):

Reinforcement Learning ◽

Learning Algorithm ◽

Multi Agent Systems ◽

Combined Model ◽

Q Learning ◽

Agent Systems ◽

Multi Agent ◽

Case Base ◽

Case Base Reasoning ◽

Robotic Tool

In this paper, a new algorithm based on case-based reasoning and reinforcement learning (RL) is proposed to increase the convergence rate of reinforcement learning algorithms. RL algorithms are very useful for solving a wide variety of decision problems when models are not available and correct decisions must be made in every state of the system, such as in multi-agent systems, artificial control systems, robotics, and tool condition monitoring. In the proposed method, we investigate how to improve action selection in RL algorithms. A new combined model using case-based reasoning systems and a new optimized function is proposed to select the action, which led to an improvement in algorithms based on Q-learning. The algorithm was used to solve cooperative Markov games, one of the models of Markov-based multi-agent systems. The experimental results indicated that the proposed algorithms perform better than existing algorithms in terms of speed and accuracy of reaching the optimal policy.

A reinforcement learning algorithm with fuzzy approximation for semi Markov decision problems

Journal of Intelligent & Fuzzy Systems ◽

10.3233/ifs-141460 ◽

2015 ◽

Vol 28 (4) ◽

pp. 1733-1744 ◽

Cited By ~ 1

Author(s):

Ufuk Kula ◽

Beyazıt Ocaktan

Keyword(s):

Reinforcement Learning ◽

Learning Algorithm ◽

Decision Problems ◽

Fuzzy Approximation ◽

Markov Decision Problems ◽

Markov Decision ◽

Reinforcement Learning Algorithm

A Novel Distributed Multi-Agent Reinforcement Learning Algorithm Against Jamming Attacks

IEEE Communications Letters ◽

10.1109/lcomm.2021.3097290 ◽

2021 ◽

Vol 25 (10) ◽

pp. 3204-3208

Author(s):

Ibrahim Elleuch ◽

Ali Pourranjbar ◽

Georges Kaddoum

Keyword(s):

Reinforcement Learning ◽

Learning Algorithm ◽

Jamming Attacks ◽

Multi Agent ◽

Reinforcement Learning Algorithm

Q Value Reinforcement Learning Algorithm Based on Multi Agent System

Journal of Physics Conference Series ◽

10.1088/1742-6596/1069/1/012094 ◽

2018 ◽

Vol 1069 ◽

pp. 012094 ◽

Cited By ~ 1

Author(s):

Xijie Yin ◽

Dongxin Yang

Keyword(s):

Reinforcement Learning ◽

Learning Algorithm ◽

Multi Agent System ◽

Q Value ◽

Agent System ◽

Multi Agent ◽

Reinforcement Learning Algorithm

A Multi-Step Reinforcement Learning Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.44-47.3611 ◽

2010 ◽

Vol 44-47 ◽

pp. 3611-3615 ◽

Cited By ~ 1

Author(s):

Zhi Cong Zhang ◽

Kai Shun Hu ◽

Hui Yu Huang ◽

Shuai Li ◽

Shao Yong Zhao

Keyword(s):

Reinforcement Learning ◽

Markov Decision Process ◽

Decision Process ◽

Large Scale ◽

Learning Algorithm ◽

Machine Learning Method ◽

Learning Method ◽

K Value ◽

Markov Decision ◽

Action Value

Reinforcement learning (RL) is a state-value or action-value based machine learning method that approximately solves large-scale Markov Decision Processes (MDPs) or Semi-Markov Decision Processes (SMDPs). A multi-step RL algorithm called Sarsa(λ,k) is proposed, which is a compromise between Sarsa and Sarsa(λ). It is equivalent to Sarsa if k is 1 and to Sarsa(λ) if k is infinite. Sarsa(λ,k) adjusts its performance through the value of k. Two forms of Sarsa(λ,k), forward-view Sarsa(λ,k) and backward-view Sarsa(λ,k), are constructed and proved equivalent under off-line updating.
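One way to picture a forward-view target that interpolates between one-step Sarsa and Sarsa(λ) is a λ-return truncated after k steps. The sketch below uses that standard construction as an assumption; the abstract does not give the paper's exact definition, and the function names and sample numbers are hypothetical.

```python
def n_step_return(rewards, q_values, t, n, gamma):
    """n-step Sarsa return from time t.

    rewards[i] is the reward received after the action at step i;
    q_values[i] is the current estimate Q(s_i, a_i).
    """
    discounted = sum(gamma ** i * rewards[t + i] for i in range(n))
    return discounted + gamma ** n * q_values[t + n]


def truncated_lambda_return(rewards, q_values, t, k, gamma, lam):
    """Lambda-weighted mix of the 1..k step returns (assumed form).

    The first k-1 returns get weight (1 - lam) * lam**(n - 1); the k-step
    return takes the remaining weight lam**(k - 1), so the weights sum to 1.
    """
    mixed = sum((1 - lam) * lam ** (n - 1)
                * n_step_return(rewards, q_values, t, n, gamma)
                for n in range(1, k))
    return mixed + lam ** (k - 1) * n_step_return(rewards, q_values, t, k, gamma)


# Hypothetical trajectory data for illustration.
rewards = [1.0, 0.0, 2.0]
q_values = [0.5, 0.4, 0.3, 0.2]
one_step = truncated_lambda_return(rewards, q_values, t=0, k=1, gamma=0.9, lam=0.8)
three_step = truncated_lambda_return(rewards, q_values, t=0, k=3, gamma=0.9, lam=1.0)
```

With k = 1 the target reduces to the ordinary one-step Sarsa target, and as k grows the weights approach those of the full λ-return, which matches the limiting behavior stated in the abstract.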

An Enhanced Model-Free Reinforcement Learning Algorithm to Solve Nash Equilibrium for Multi-Agent Cooperative Game Systems

IEEE Access ◽

10.1109/access.2020.3043806 ◽

2020 ◽

Vol 8 ◽

pp. 223743-223755

Author(s):

Yuannan Jiang ◽

Fuxiao Tan

Keyword(s):

Reinforcement Learning ◽

Nash Equilibrium ◽

Cooperative Game ◽

Learning Algorithm ◽

Model Free ◽

Multi Agent ◽

Reinforcement Learning Algorithm

Reinforcement Learning Applied to a Differential Game

Adaptive Behavior ◽

10.1177/105971239500400102 ◽

1995 ◽

Vol 4 (1) ◽

pp. 3-28 ◽

Cited By ~ 15

Author(s):

Mance E. Harmon ◽

Leemon C. Baird ◽

A. Harry Klopf

Keyword(s):

Reinforcement Learning ◽

Differential Game ◽

Learning Algorithm ◽

Learning System ◽

Test Bed ◽

Linear Quadratic ◽

Time Step ◽

Q Learning ◽

Step Duration ◽

Markov Decision

An application of reinforcement learning to a linear-quadratic, differential game is presented. The reinforcement learning system uses a recently developed algorithm, the residual-gradient form of advantage updating. The game is a Markov decision process with continuous time, states, and actions, linear dynamics, and a quadratic cost function. The game consists of two players, a missile and a plane; the missile pursues the plane and the plane evades the missile. Although a missile and plane scenario was the chosen test bed, the reinforcement learning approach presented here is equally applicable to biologically based systems, such as a predator pursuing prey. The reinforcement learning algorithm for optimal control is modified for differential games to find the minimax point rather than the maximum. Simulation results are compared to the analytical solution, demonstrating that the simulated reinforcement learning system converges to the optimal answer. The performance of both the residual-gradient and non-residual-gradient forms of advantage updating and Q-learning are compared, demonstrating that advantage updating converges faster than Q-learning in all simulations. Advantage updating also is demonstrated to converge regardless of the time step duration; Q-learning is unable to converge as the time step duration grows small.

Study on Multi-agent Simulation System Based on Reinforcement Learning Algorithm

2009 WRI World Congress on Computer Science and Information Engineering ◽

10.1109/csie.2009.234 ◽

2009 ◽

Cited By ~ 1

Author(s):

Shu Da Wang ◽

Shuo Ning Wang ◽

Wei Ping Zhang

Keyword(s):

Reinforcement Learning ◽

Learning Algorithm ◽

Simulation System ◽

Agent Simulation ◽

Multi Agent ◽

Reinforcement Learning Algorithm
