Learning a Swarm Foraging Behavior with Microscopic Fuzzy Controllers Using Deep Reinforcement Learning

2021 ◽  
Vol 11 (6) ◽  
pp. 2856
Author(s):  
Fidel Aznar ◽  
Mar Pujol ◽  
Ramón Rizo

This article presents a macroscopic swarm foraging behavior obtained using deep reinforcement learning. The selected behavior is a complex task in which a group of simple agents must be directed towards an object to move it to a target position without the use of special gripping mechanisms, using only their own bodies. Our system has been designed to use and combine basic fuzzy behaviors to control obstacle avoidance and the low-level rendezvous processes needed for the foraging task. We use a realistically modeled swarm based on differential robots equipped with light detection and ranging (LiDAR) sensors. It is important to highlight that the obtained macroscopic behavior, in contrast to that of end-to-end systems, combines existing microscopic tasks, which allows us to apply these learning techniques even with the dimensionality and complexity of a realistic robotic swarm system. The presented behavior correctly develops the macroscopic foraging task in a robust and scalable way, even in situations not seen during the training phase. An exhaustive analysis of the obtained behavior is carried out, covering both the movement of the swarm while performing the task and the scalability of the swarm.
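The abstract describes combining basic fuzzy behaviors (obstacle avoidance and rendezvous) into the low-level control of each differential robot. The paper's actual controllers are not reproduced here; the following is a minimal sketch, assuming triangular membership functions over the LiDAR obstacle distance and hypothetical wheel-speed commands, of how two microscopic behaviors can be fuzzily blended:

```python
def triangular(x, a, b, c):
    """Triangular membership function peaking at b, zero outside (a, c)."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def blend_behaviors(obstacle_dist, avoid_cmd, rendezvous_cmd):
    """Blend two (v_left, v_right) wheel-speed commands by fuzzy
    memberships 'near' and 'far' over the obstacle distance (meters).
    The membership ranges below are illustrative assumptions."""
    near = triangular(obstacle_dist, -1.0, 0.0, 1.0)  # dominant when close
    far = triangular(obstacle_dist, 0.0, 1.0, 2.0)    # dominant around 1 m
    total = (near + far) or 1.0                        # avoid division by zero
    return tuple((near * a + far * r) / total
                 for a, r in zip(avoid_cmd, rendezvous_cmd))
```

With an obstacle at distance 0 the avoidance command dominates entirely; at 1 m the rendezvous command takes over, with a smooth transition in between.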

2020 ◽  
Vol 25 (4) ◽  
pp. 588-595
Author(s):  
Boyin Jin ◽  
Yupeng Liang ◽  
Ziyao Han ◽  
Kazuhiro Ohkura

Author(s):  
Ming-Sheng Ying ◽  
Yuan Feng ◽  
Sheng-Gang Ying

Markov decision processes (MDPs) offer a general framework for modelling sequential decision making where outcomes are random. In particular, the MDP serves as a mathematical framework for reinforcement learning. This paper introduces an extension of the MDP, namely the quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide some useful mathematical tools for reinforcement learning techniques applied to the quantum world.
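The paper's dynamic programming algorithms operate on quantum MDPs, whose details are not given in the abstract. For orientation, here is a minimal sketch of the classical counterpart it extends: finite-horizon policy evaluation by backward dynamic programming, with illustrative data structures (`P[s][a]` as a dict of next-state probabilities, `R[s][a]` as the immediate reward, `policy[t][s]` as the action at step `t`):

```python
def finite_horizon_values(states, P, R, policy, H):
    """Evaluate a (possibly time-dependent) policy over horizon H by
    backward induction: V_H = 0, then
    V_t(s) = R(s, pi_t(s)) + sum_s' P(s'|s, pi_t(s)) * V_{t+1}(s')."""
    V = {s: 0.0 for s in states}  # value after the final step
    for t in reversed(range(H)):
        a_of = policy[t]
        V = {s: R[s][a_of[s]]
                + sum(p * V[s2] for s2, p in P[s][a_of[s]].items())
             for s in states}
    return V  # V[s] is the expected total reward from s at step 0
```

In the qMDP setting, states become quantum states and transitions become quantum operations, but the backward-induction structure of finite-horizon evaluation is analogous.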


2013 ◽  
Vol 461 ◽  
pp. 565-569 ◽  
Author(s):  
Fang Wang ◽  
Kai Xu ◽  
Qiao Sheng Zhang ◽  
Yi Wen Wang ◽  
Xiao Xiang Zheng

Brain-machine interfaces (BMIs) decode cortical neural spikes of paralyzed patients to control external devices for the purpose of movement restoration. Neuroplasticity induced by performing a relatively complex, multistep task helps improve the performance of a BMI system. Reinforcement learning (RL) allows the BMI system to interact with the environment and learn the task adaptively without a teacher signal, which is more appropriate for paralyzed patients. In this work, we proposed to apply Q(λ)-learning to multistep goal-directed tasks using the user's neural activity. Neural data were recorded from M1 of a monkey manipulating a joystick in a center-out task. Compared with a supervised learning approach, significant BMI control was achieved, with correct directional decoding in 84.2% and 81% of the trials from naïve states. The results demonstrate that the BMI system was able to complete a task by interacting with the environment, indicating that RL-based methods have the potential to develop more natural BMI systems.
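The abstract names Q(λ)-learning but does not spell out the update rule. As a hedged sketch (the paper's exact variant and hyperparameters are not given), here is one Watkins-style Q(λ) backup with eligibility traces, using illustrative tabular data structures:

```python
def q_lambda_update(Q, E, s, a, r, s2, alpha=0.1, gamma=0.9, lam=0.8):
    """One Q(lambda) backup. Q maps state -> {action: value}; E maps
    (state, action) -> eligibility trace. Accumulate the trace for (s, a),
    then propagate the one-step TD error to every traced pair. In Watkins'
    variant the caller clears E after an exploratory (non-greedy) action."""
    best_next = max(Q[s2].values()) if Q[s2] else 0.0
    delta = r + gamma * best_next - Q[s][a]       # TD error
    E[(s, a)] = E.get((s, a), 0.0) + 1.0          # accumulating trace
    for (si, ai), e in list(E.items()):
        Q[si][ai] += alpha * delta * e            # credit by eligibility
        E[(si, ai)] = gamma * lam * e             # decay the trace
    return delta
```

The trace decay `gamma * lam` is what lets a single reward at the end of a multistep trial update the values of the earlier states that led to it.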


Author(s):  
Bryan P Bednarski ◽  
Akash Deep Singh ◽  
William M Jones

Abstract. Objective: This work investigates how reinforcement learning and deep learning models can facilitate the near-optimal redistribution of medical equipment in order to bolster public health responses to future crises similar to the COVID-19 pandemic. Materials and Methods: The system presented is simulated with disease impact statistics from the Institute for Health Metrics and Evaluation (IHME), the Centers for Disease Control and Prevention, and the Census Bureau [1, 2, 3]. We present a robust pipeline for data preprocessing, future demand inference, and a redistribution algorithm that can be adopted across broad scales and applications. Results: The reinforcement learning redistribution algorithm demonstrates performance optimality ranging from 93-95%. Performance improves consistently with the number of random states participating in exchange, demonstrating average shortage reductions of 78.74% (± 30.8) in simulations with 5 states to 93.50% (± 0.003) with 50 states. Conclusion: These findings bolster confidence that reinforcement learning techniques can reliably guide resource allocation for future public health emergencies.
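The abstract evaluates an RL-driven redistribution algorithm against shortage reduction across states. The learned policy itself is not given; as a point of reference, here is a hypothetical greedy baseline for the same task, shipping surplus units from over-supplied states to those with the largest predicted shortages (all names and data are illustrative):

```python
def redistribute(supply, demand):
    """Greedy baseline: move surplus units from states with
    supply > demand to the states with the largest shortages.
    Returns a list of (donor, recipient, units) transfers."""
    surplus = {s: supply[s] - demand[s] for s in supply if supply[s] > demand[s]}
    shortage = {s: demand[s] - supply[s] for s in supply if demand[s] > supply[s]}
    transfers = []
    # Serve the worst shortages first.
    for rec in sorted(shortage, key=shortage.get, reverse=True):
        for don in list(surplus):
            if shortage[rec] <= 0:
                break
            units = min(surplus[don], shortage[rec])
            transfers.append((don, rec, units))
            surplus[don] -= units
            shortage[rec] -= units
            if surplus[don] == 0:
                del surplus[don]
    return transfers
```

An RL policy can improve on such a baseline by anticipating future demand rather than reacting only to the current shortage snapshot, which is the role the paper assigns to its demand-inference stage.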

