scholarly journals Coordinated Learning by Model Difference Identification in Multiagent Systems with Sparse Interactions

2016 ◽  
Vol 2016 ◽  
pp. 1-17
Author(s):  
Qi Zhang ◽  
Peng Jiao ◽  
Quanjun Yin ◽  
Lin Sun

Multiagent Reinforcement Learning (MARL) is a promising technique for agents learning effective coordinated policy in Multiagent Systems (MASs). In many MASs, interactions between agents are usually sparse, and then a lot of MARL methods were devised for them. These methods divide learning process into independent learning and joint learning in coordinated states to improve traditional joint state-action space learning. However, most of those methods identify coordinated states based on assumptions about domain structure (e.g., dependencies) or agent (e.g., prior individual optimal policy and agent homogeneity). Moreover, situations that current methods cannot deal with still exist. In this paper, a modified approach is proposed to learn where and how to coordinate agents’ behaviors in more general MASs with sparse interactions. Our approach introduces sample grouping and a more accurate metric of model difference degree to identify which states of other agents should be considered in coordinated states, without strong additional assumptions. Experimental results show that the proposed approach outperforms its competitors by improving the average agent reward per step and works well in some broader scenarios.

2020 ◽  
Author(s):  
Felipe Leno Da Silva ◽  
Anna Helena Reali Costa

Reinforcement Learning (RL) is a powerful tool that has been used to solve increasingly complex tasks. RL operates through repeated interactions of the learning agent with the environment, via trial and error. However, this learning process is extremely slow, requiring many interactions. In this thesis, we leverage previous knowledge so as to accelerate learning in multiagent RL problems. We propose knowledge reuse both from previous tasks and from other agents. Several flexible methods are introduced so that each of these two types of knowledge reuse is possible. This thesis adds important steps towards more flexible and broadly applicable multiagent transfer learning methods.


2019 ◽  
Author(s):  
Jordão Memória ◽  
José Maia

In this work, a modeling and algorithm based on multiagent reinforcement learning is developed for the problem of elevator group dispatch. The main advantage is that, along with the function approximation, this multi-agent solution leads to reduction of the state space, allowing complex states to be addressed with a synthesizing evaluation function. Each elevator is considered an agent that have to decide about two actions: answer or ignore the new call. With some iterations, the agents learn the weights of an evaluation function which approximate the state-action value function. The performance of solution (average waiting time - AWT), shown varying the traffic pattern, flow of people, number of elevators and number of floors, is comparable to other current proposals reported in the literature.


2020 ◽  
Vol 34 (05) ◽  
pp. 7293-7300
Author(s):  
Weixun Wang ◽  
Tianpei Yang ◽  
Yong Liu ◽  
Jianye Hao ◽  
Xiaotian Hao ◽  
...  

A lot of efforts have been devoted to investigating how agents can learn effectively and achieve coordination in multiagent systems. However, it is still challenging in large-scale multiagent settings due to the complex dynamics between the environment and agents and the explosion of state-action space. In this paper, we design a novel Dynamic Multiagent Curriculum Learning (DyMA-CL) to solve large-scale problems by starting from learning on a multiagent scenario with a small size and progressively increasing the number of agents. We propose three transfer mechanisms across curricula to accelerate the learning process. Moreover, due to the fact that the state dimension varies across curricula, and existing network structures cannot be applied in such a transfer setting since their network input sizes are fixed. Therefore, we design a novel network structure called Dynamic Agent-number Network (DyAN) to handle the dynamic size of the network input. Experimental results show that DyMA-CL using DyAN greatly improves the performance of large-scale multiagent learning compared with state-of-the-art deep reinforcement learning approaches. We also investigate the influence of three transfer mechanisms across curricula through extensive simulations.


2021 ◽  
Vol 21 (4) ◽  
pp. 1-22
Author(s):  
Safa Otoum ◽  
Burak Kantarci ◽  
Hussein Mouftah

Volunteer computing uses Internet-connected devices (laptops, PCs, smart devices, etc.), in which their owners volunteer them as storage and computing power resources, has become an essential mechanism for resource management in numerous applications. The growth of the volume and variety of data traffic on the Internet leads to concerns on the robustness of cyberphysical systems especially for critical infrastructures. Therefore, the implementation of an efficient Intrusion Detection System for gathering such sensory data has gained vital importance. In this article, we present a comparative study of Artificial Intelligence (AI)-driven intrusion detection systems for wirelessly connected sensors that track crucial applications. Specifically, we present an in-depth analysis of the use of machine learning, deep learning and reinforcement learning solutions to recognise intrusive behavior in the collected traffic. We evaluate the proposed mechanisms by using KDD’99 as real attack dataset in our simulations. Results present the performance metrics for three different IDSs, namely the Adaptively Supervised and Clustered Hybrid IDS (ASCH-IDS), Restricted Boltzmann Machine-based Clustered IDS (RBC-IDS), and Q-learning based IDS (Q-IDS), to detect malicious behaviors. We also present the performance of different reinforcement learning techniques such as State-Action-Reward-State-Action Learning (SARSA) and the Temporal Difference learning (TD). Through simulations, we show that Q-IDS performs with detection rate while SARSA-IDS and TD-IDS perform at the order of .


2021 ◽  
Vol 35 (2) ◽  
Author(s):  
Nicolas Bougie ◽  
Ryutaro Ichise

AbstractDeep reinforcement learning methods have achieved significant successes in complex decision-making problems. In fact, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impracticable due to lack of state coverage or distribution mismatch—when the learner’s goal deviates from the demonstrated behaviors. Besides, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations to query the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the Mujoco domain. Experimental results show that our method outperforms prior imitation learning approaches in most of the tasks in terms of exploration efficiency and average scores.


Sign in / Sign up

Export Citation Format

Share Document