An Improved Approach towards Multi-Agent Pursuit–Evasion Game Decision-Making Using Deep Reinforcement Learning

Kaifang Wan; Dingwei Wu; Yiwei Zhai; Bo Li; Xiaoguang Gao; Zijian Hu

doi:10.3390/e23111433

Entropy ◽

10.3390/e23111433 ◽

2021 ◽

Vol 23 (11) ◽

pp. 1433

Author(s):

Kaifang Wan ◽

Dingwei Wu ◽

Yiwei Zhai ◽

Bo Li ◽

Xiaoguang Gao ◽

...

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Superior Performance ◽

State Variables ◽

Multi Agent Systems ◽

Adversarial Learning ◽

Pursuit Evasion ◽

Evasion Game ◽

Multi Agent ◽

Adversarial Attack

A pursuit–evasion game is a classical maneuver confrontation problem in the multi-agent systems (MASs) domain. An online decision technique based on deep reinforcement learning (DRL) was developed in this paper to address the problem of environment sensing and decision-making in pursuit–evasion games. A control-oriented framework developed from the DRL-based multi-agent deep deterministic policy gradient (MADDPG) algorithm was built to implement multi-agent cooperative decision-making to overcome the limitation of the tedious state variables required for the traditionally complicated modeling process. To address the effects of errors between a model and a real scenario, this paper introduces adversarial disturbances. It also proposes a novel adversarial attack trick and adversarial learning MADDPG (A2-MADDPG) algorithm. By introducing an adversarial attack trick for the agents themselves, uncertainties of the real world are modeled, thereby optimizing robust training. During the training process, adversarial learning was incorporated into our algorithm to preprocess the actions of multiple agents, which enabled them to properly respond to uncertain dynamic changes in MASs. Experimental results verified that the proposed approach provides superior performance and effectiveness for pursuers and evaders, and both can learn the corresponding confrontational strategy during training.

A Novel Heterogeneous Swarm Reinforcement Learning Method for Sequential Decision Making Problems

Machine Learning and Knowledge Extraction ◽

10.3390/make1020035 ◽

2019 ◽

Vol 1 (2) ◽

pp. 590-610

Author(s):

Zohreh Akbari ◽

Rainer Unland

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Single Agent ◽

Sequential Decision Making ◽

Multi Agent Systems ◽

Sequential Decision ◽

Agent Systems ◽

Novel Approach ◽

Markov Decision ◽

Multi Agent

Sequential Decision Making Problems (SDMPs) that can be modeled as Markov Decision Processes can be solved using methods that combine Dynamic Programming (DP) and Reinforcement Learning (RL). Depending on the problem scenarios and the available Decision Makers (DMs), such RL algorithms may be designed for single-agent systems or for multi-agent systems. The latter either consist of agents with individual goals and decision-making capabilities, which are influenced by other agents' decisions, or behave as a swarm of agents that collaboratively learn a single objective. Many studies have been conducted in this area; however, when concentrating on available swarm RL algorithms, one obtains a clear view of the areas that still require attention. Most of the studies in this area focus on homogeneous swarms. So far, systems introduced as Heterogeneous Swarms (HetSs) include only very few (two or three) sub-swarms of homogeneous agents, which either deal with a specific sub-problem of the general problem, according to their capabilities, or exhibit different behaviors in order to reduce the risk of bias. This study introduces a novel approach that allows agents that were originally designed to solve different problems, and hence have higher degrees of heterogeneity, to behave as a swarm when addressing identical sub-problems. The affinity between two agents, which measures how compatible they are in working together towards solving a specific sub-problem, is used in designing a Heterogeneous Swarm RL (HetSRL) algorithm that allows HetSs to solve the intended SDMPs.

Multi-agent Deep Reinforcement Learning for Pursuit-Evasion Game Scalability

Lecture Notes in Electrical Engineering - Proceedings of 2019 Chinese Intelligent Systems Conference ◽

10.1007/978-981-32-9682-4_69 ◽

2019 ◽

pp. 658-669

Author(s):

Lin Xu ◽

Bin Hu ◽

Zhihong Guan ◽

Xinming Cheng ◽

Tao Li ◽

...

Keyword(s):

Reinforcement Learning ◽

Pursuit Evasion ◽

Evasion Game ◽

Multi Agent

A Novel Method Combining Leader-Following Control and Reinforcement Learning for Pursuit Evasion Games of Multi-Agent Systems

2020 16th International Conference on Control, Automation, Robotics and Vision (ICARCV) ◽

10.1109/icarcv50220.2020.9305441 ◽

2020 ◽

Author(s):

Zhe-Yang Zhu ◽

Cheng-Lin Liu

Keyword(s):

Reinforcement Learning ◽

Multi Agent Systems ◽

Agent Systems ◽

Pursuit Evasion ◽

Leader Following ◽

Multi Agent ◽

Novel Method

Output feedback reinforcement learning based optimal output synchronisation of heterogeneous discrete-time multi-agent systems

IET Control Theory and Applications ◽

10.1049/iet-cta.2018.6266 ◽

2019 ◽

Vol 13 (17) ◽

pp. 2866-2876

Author(s):

Syed Ali Asad Rizvi ◽

Zongli Lin

Keyword(s):

Reinforcement Learning ◽

Discrete Time ◽

Output Feedback ◽

Multi Agent Systems ◽

Agent Systems ◽

Optimal Output ◽

Multi Agent

A Confrontation Decision-Making Method with Deep Reinforcement Learning and Knowledge Transfer for Multi-Agent System

Symmetry ◽

10.3390/sym12040631 ◽

2020 ◽

Vol 12 (4) ◽

pp. 631

Author(s):

Chunyang Hu

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Knowledge Transfer ◽

Large Scale ◽

Effective Control ◽

Small Scale ◽

Learning Agent ◽

Multi Agent ◽

Transfer Method ◽

Parameter Sharing

In this paper, deep reinforcement learning (DRL) and knowledge transfer are used to achieve effective control of the learning agent in multi-agent confrontation. First, a multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm with parameter sharing is proposed for confrontation decision-making among multiple agents. During training, the information of other agents is fed to the critic network to improve the confrontation strategy. The parameter sharing mechanism can reduce the loss of experience storage. In the DDPG algorithm, four neural networks are used to generate the real-time action and the Q-value function, respectively, and a momentum mechanism is used to optimize training and accelerate the convergence of the neural networks. Second, this paper introduces an auxiliary controller, based on a policy-based reinforcement learning (RL) method, that assists the game agent's decision-making. In addition, an effective reward function is used to help agents balance enemy losses against losses on their own side. Furthermore, the knowledge transfer method is used to extend the learning model to more complex scenes and to improve the generalization of the proposed confrontation model. Two confrontation decision-making experiments are designed to verify the effectiveness of the proposed method. In a small-scale task scenario, the trained agent successfully learns to fight its competitors and achieves a good winning rate. In large-scale confrontation scenarios, the knowledge transfer method gradually improves the decision-making level of the learning agent.
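
For readers unfamiliar with the two mechanisms the abstract mentions, here is a minimal sketch under stated assumptions. Standard DDPG keeps an actor and a critic plus a slowly updated target copy of each (four networks in total); the abstract does not confirm that this is the paper's exact layout, and it does not give the momentum rule, so the classical heavy-ball form is assumed here. Parameters are plain lists of floats, and `soft_update` and `momentum_step` are hypothetical helper names.

```python
def soft_update(target, online, tau=0.01):
    # Polyak averaging as used in standard DDPG: each target parameter moves
    # a fraction tau of the way towards the matching online parameter.
    return [(1 - tau) * t + tau * o for t, o in zip(target, online)]

def momentum_step(params, grads, velocity, lr=0.1, mu=0.9):
    # Assumed heavy-ball momentum: the velocity accumulates past gradients,
    # which can speed up progress along consistent descent directions.
    new_velocity = [mu * v - lr * g for v, g in zip(velocity, grads)]
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity

# Two optimizer steps with the same gradient: the second step is larger.
params, velocity = [1.0], [0.0]
params, velocity = momentum_step(params, [2.0], velocity)
params, velocity = momentum_step(params, [2.0], velocity)
print(params, velocity)

# The target copy tracks the online parameters slowly.
print(soft_update([0.0], [1.0], tau=0.1))
```

In a full implementation the same two steps would be applied to every weight tensor of the actor and critic networks rather than to short lists.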

A novel optimal bipartite consensus control scheme for unknown multi-agent systems via model-free reinforcement learning

Applied Mathematics and Computation ◽

10.1016/j.amc.2019.124821 ◽

2020 ◽

Vol 369 ◽

pp. 124821 ◽

Cited By ~ 10

Author(s):

Zhinan Peng ◽

Jiangping Hu ◽

Kaibo Shi ◽

Rui Luo ◽

Rui Huang ◽

...

Keyword(s):

Reinforcement Learning ◽

Multi Agent Systems ◽

Consensus Control ◽

Agent Systems ◽

Model Free ◽

Control Scheme ◽

Multi Agent ◽

Bipartite Consensus

Formation Control using Simplified Reinforcement Learning for Multi-agent systems with State Delay

10.23919/ccc52363.2021.9549357 ◽

2021 ◽

Author(s):

Wentai Shao ◽

Yutao Chen ◽

Jie Huang

Keyword(s):

Reinforcement Learning ◽

Formation Control ◽

Multi Agent Systems ◽

State Delay ◽

Agent Systems ◽

Multi Agent

Optimized Backstepping Consensus Control Using Reinforcement Learning for a Class of Nonlinear Strict-Feedback-Dynamic Multi-Agent Systems

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2021.3105548 ◽

2021 ◽

pp. 1-13

Author(s):

Guoxing Wen ◽

C. L. Philip Chen

Keyword(s):

Reinforcement Learning ◽

Multi Agent Systems ◽

Consensus Control ◽

Agent Systems ◽

Multi Agent ◽

Strict Feedback

Optimal robust formation control for heterogeneous multi‐agent systems based on reinforcement learning

International Journal of Robust and Nonlinear Control ◽

10.1002/rnc.5828 ◽

2021 ◽

Author(s):

Bing Yan ◽

Peng Shi ◽

Cheng‐Chew Lim ◽

Zhiyuan Shi

Keyword(s):

Reinforcement Learning ◽

Formation Control ◽

Multi Agent Systems ◽

Agent Systems ◽

Multi Agent

Improvement on Supporting Machine Learning Algorithm for Solving Problem in Immediate Decision Making

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.566.572 ◽

2012 ◽

Vol 566 ◽

pp. 572-579

Author(s):

Abdolkarim Niazi ◽

Norizah Redzuan ◽

Raja Ishak Raja Hamzah ◽

Sara Esfandiari

Keyword(s):

Reinforcement Learning ◽

Learning Algorithm ◽

Multi Agent Systems ◽

Combined Model ◽

Q Learning ◽

Agent Systems ◽

Multi Agent ◽

Case Base ◽

Case Base Reasoning ◽

Robotic Tool

In this paper, a new algorithm based on case-based reasoning and reinforcement learning (RL) is proposed to increase the convergence rate of RL algorithms. RL algorithms are very useful for solving a wide variety of decision problems in which no model is available and a correct decision must be made in every state of the system, such as multi-agent systems, artificial control systems, robotics, and tool condition monitoring. The proposed method investigates how to improve action selection in RL algorithms. A new combined model, using case-based reasoning systems and a new optimized function, is proposed to select the action, which led to an improvement in algorithms based on Q-learning. The algorithm was used to solve the problem of cooperative Markov games, one of the models of Markov-based multi-agent systems. The experimental results indicated that the proposed algorithms perform better than the existing algorithms in terms of the speed and accuracy of reaching the optimal policy.
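
The abstract gives no detail on how the case base and the optimized selection function interact, so the following is only a hedged sketch of the general idea: tabular Q-learning in which action selection first consults a store of previously solved cases and falls back to epsilon-greedy otherwise. The `case_base` dictionary, the lookup rule, and the function names are assumptions made for illustration, not the authors' method.

```python
import random
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    # Standard tabular Q-learning update.
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next
                                   - q[(state, action)])

def select_action(q, case_base, state, actions, epsilon=0.1, rng=random):
    # Assumed case-based shortcut: reuse a stored action when this exact
    # state has been seen before; otherwise act epsilon-greedily on Q.
    if state in case_base:
        return case_base[state]
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])

actions = ["left", "right"]
q = defaultdict(float)            # unseen state-action pairs start at 0.0
q_update(q, "s0", "right", 1.0, "s1", actions)
print(q[("s0", "right")])
print(select_action(q, {"s0": "left"}, "s0", actions))
print(select_action(q, {}, "s0", actions, epsilon=0.0))
```

A realistic case base would retrieve similar rather than identical states and would need a policy for adding and revising cases; both are left out here.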
