Robot Navigation in Crowd Based on Dual Social Attention Deep Reinforcement Learning

Mathematical Problems in Engineering ◽

10.1155/2021/7114981 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Hui Zeng ◽

Rong Hu ◽

Xiaohui Huang ◽

Zhiying Peng

Keyword(s):

Reinforcement Learning ◽

Social Attention ◽

Human Robot Interaction ◽

Human Interaction ◽

Target Point ◽

Movement Trajectory ◽

Learning Techniques ◽

Starting Point ◽

Dense Crowds ◽

Short Time

Finding a feasible, collision-free path in line with social activities is an important and challenging task for robots working in dense crowds. In recent years, many studies have used deep reinforcement learning techniques to solve this problem. In particular, it is necessary to find an efficient path in a short time, which often requires predicting the interaction with neighboring agents. However, as the crowd grows and the scene becomes more complex, researchers usually simplify the problem to a one-way human-robot interaction problem. In fact, we have to consider not only the interaction between humans and robots but also the influence of human-human interactions on the movement trajectory of the robot. Therefore, this article proposes a method based on deep reinforcement learning that enables the robot to avoid obstacles in the crowd and navigate smoothly from the starting point to the target point. We use a dual social attention mechanism to jointly model human-robot and human-human interaction. Experiments demonstrate that our model makes robots navigate dense crowds more efficiently than other algorithms.


The role and relationship of mindreading and social attunement in HRI – position statements of interdisciplinary researchers. Workshop HRI'20

10.31234/osf.io/3p6cd ◽

2020 ◽

Author(s):

Agnieszka Wykowska ◽

Jairo Pérez-Osorio ◽

Stefan Kopp

Keyword(s):

Mental States ◽

Human Robot Interaction ◽

Human Interaction ◽

Artificial Agents ◽

The Novel ◽

Robot Interaction ◽

Novel Coronavirus ◽

Relationship Of ◽

The Relationship

This booklet is a collection of the position statements accepted for the HRI’20 conference workshop “Social Cognition for HRI: Exploring the relationship between mindreading and social attunement in human-robot interaction” (Wykowska, Perez-Osorio & Kopp, 2020). Unfortunately, due to the rapid unfolding of the novel coronavirus at the beginning of the present year, the conference, and consequently our workshop, were canceled. In light of these events, we decided to put together the position statements accepted for the workshop. The contributions collected in these pages highlight the role of attribution of mental states to artificial agents in human-robot interaction, and precisely the quality and presence of social attunement mechanisms that are known to make human interaction smooth, efficient, and robust. These papers also accentuate the importance of the multidisciplinary approach to advance the understanding of the factors and the consequences of social interactions with artificial agents.


An Evaluation Methodology for Interactive Reinforcement Learning with Simulated Users

Biomimetics ◽

10.3390/biomimetics6010013 ◽

2021 ◽

Vol 6 (1) ◽

pp. 13

Author(s):

Adam Bignold ◽

Francisco Cruz ◽

Richard Dazeley ◽

Peter Vamplew ◽

Cameron Foale

Keyword(s):

Reinforcement Learning ◽

Information Source ◽

Human Interaction ◽

Evaluation Methodology ◽

External Information ◽

Preliminary Evaluation ◽

Learning Agents ◽

Learning Agent ◽

Knowledge Bias ◽

The Impact

Interactive reinforcement learning methods utilise an external information source to evaluate decisions and accelerate learning. Previous work has shown that human advice could significantly improve learning agents’ performance. When evaluating reinforcement learning algorithms, it is common to repeat experiments as parameters are altered or to gain a sufficient sample size. In this regard, to require human interaction every time an experiment is restarted is undesirable, particularly when the expense in doing so can be considerable. Additionally, reusing the same people for the experiment introduces bias, as they will learn the behaviour of the agent and the dynamics of the environment. This paper presents a methodology for evaluating interactive reinforcement learning agents by employing simulated users. Simulated users allow human knowledge, bias, and interaction to be simulated. The use of simulated users allows the development and testing of reinforcement learning agents, and can provide indicative results of agent performance under defined human constraints. While simulated users are no replacement for actual humans, they do offer an affordable and fast alternative for evaluative assisted agents. We introduce a method for performing a preliminary evaluation utilising simulated users to show how performance changes depending on the type of user assisting the agent. Moreover, we describe how human interaction may be simulated, and present an experiment illustrating the applicability of simulating users in evaluating agent performance when assisted by different types of trainers. Experimental results show that the use of this methodology allows for greater insight into the performance of interactive reinforcement learning agents when advised by different users. The use of simulated users with varying characteristics allows for evaluation of the impact of those characteristics on the behaviour of the learning agent.
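
As a rough illustration of the idea of a simulated user, the sketch below stands in for a human adviser with two tunable characteristics. The class name `SimulatedUser`, the parameters `accuracy` and `availability`, and the `optimal_action` callback are assumptions made for this example; they are not taken from the paper's implementation.

```python
import random


class SimulatedUser:
    """Hypothetical stand-in for a human adviser (illustrative only)."""

    def __init__(self, optimal_action, n_actions, accuracy=0.9,
                 availability=0.5, seed=0):
        self.optimal_action = optimal_action  # callable: state -> best action
        self.n_actions = n_actions
        self.accuracy = accuracy          # probability that given advice is correct
        self.availability = availability  # probability that advice is given at all
        self.rng = random.Random(seed)    # seeded so experiments can be repeated

    def advise(self, state):
        """Return an advised action, or None when the user stays silent."""
        if self.rng.random() >= self.availability:
            return None
        best = self.optimal_action(state)
        if self.rng.random() < self.accuracy:
            return best
        # Inaccurate advice: pick uniformly among the non-optimal actions.
        wrong = [a for a in range(self.n_actions) if a != best]
        return self.rng.choice(wrong)


# Three assumed user types for a toy task whose best action is state % 4.
expert = SimulatedUser(lambda s: s % 4, n_actions=4, accuracy=1.0, availability=1.0)
silent = SimulatedUser(lambda s: s % 4, n_actions=4, availability=0.0)
adversary = SimulatedUser(lambda s: s % 4, n_actions=4, accuracy=0.0, availability=1.0)
```

Because the user is seeded and parameterised, the same experiment can be rerun many times while varying only the adviser's characteristics, which is the kind of comparison the abstract describes.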


Optimal Policies for Quantum Markov Decision Processes

International Journal of Automation and Computing ◽

10.1007/s11633-021-1278-z ◽

2021 ◽

Author(s):

Ming-Sheng Ying ◽

Yuan Feng ◽

Sheng-Gang Ying

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Quantum Systems ◽

Sequential Decision Making ◽

Mathematical Framework ◽

Sequential Decision ◽

Learning Techniques ◽

Optimal Policies ◽

Markov Decision ◽

Programming Algorithms

Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of MDP, namely quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide some useful mathematical tools for reinforcement learning techniques applied to the quantum world.
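
For orientation, the sketch below shows finite-horizon dynamic programming (backward induction) for an ordinary, classical MDP. It is not the qMDP algorithm from the paper; the function name `backward_induction`, the array layouts, and the two-state toy problem are assumptions chosen for illustration.

```python
import numpy as np


def backward_induction(P, R, horizon):
    """Finite-horizon dynamic programming for a classical MDP.

    P[a, s, t] is the probability of moving from state s to state t
    under action a; R[s, a] is the expected immediate reward.
    Returns the optimal values at time 0 and a time-dependent policy.
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)  # value when no steps remain
    policy = np.zeros((horizon, n_states), dtype=int)
    for step in reversed(range(horizon)):
        # Q[s, a] = R[s, a] + sum_t P[a, s, t] * V[t]
        Q = R + np.einsum("ast,t->sa", P, V)
        policy[step] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return V, policy


# Toy problem: action 0 stays put, action 1 switches state;
# being in state 1 pays reward 1 regardless of the action taken.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])
V, policy = backward_induction(P, R, horizon=3)
```

In this toy problem the optimal first move from state 0 is to switch and from state 1 is to stay, so the three-step values come out as 2 and 3 respectively.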


Reinforcement Learning Approaches in Social Robotics

Sensors ◽

10.3390/s21041292 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1292

Author(s):

Neziha Akalin ◽

Amy Loutfi

Keyword(s):

Reinforcement Learning ◽

Real World ◽

Social Robotics ◽

Research Field ◽

Social Robots ◽

Learning Approaches ◽

Reward Function ◽

Optimal Behavior ◽

Learning Challenges ◽

Starting Point

This article surveys reinforcement learning approaches in social robotics. Reinforcement learning is a framework for decision-making problems in which an agent interacts through trial-and-error with its environment to discover an optimal behavior. Since interaction is a key component in both reinforcement learning and social robotics, it can be a well-suited approach for real-world interactions with physically embodied social robots. The scope of the paper is focused particularly on studies that include social physical robots and real-world human-robot interactions with users. We present a thorough analysis of reinforcement learning approaches in social robotics. In addition to a survey, we categorize existing reinforcement learning approaches based on the method used and the design of the reward mechanisms. Moreover, since communication capability is a prominent feature of social robots, we discuss and group the papers based on the communication medium used for reward formulation. Considering the importance of designing the reward function, we also provide a categorization of the papers based on the nature of the reward. This categorization includes three major themes: interactive reinforcement learning, intrinsically motivated methods, and task performance-driven methods. The paper also covers the benefits and challenges of reinforcement learning in social robotics, whether the surveyed papers use subjective and algorithmic evaluation measures, real-world reinforcement learning challenges and proposed solutions, and the points that remain to be explored, including approaches that have thus far received less attention. Thus, this paper aims to become a starting point for researchers interested in using and applying reinforcement learning methods in this particular research field.


A Survey of Applying Reinforcement Learning Techniques to Multicast Routing

2019 IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON) ◽

10.1109/uemcon47517.2019.8993014 ◽

2019 ◽

Cited By ~ 1

Author(s):

Ola Ashour ◽

Marc St-Hilaire ◽

Thomas Kunz ◽

Maoyu Wang

Keyword(s):

Reinforcement Learning ◽

Multicast Routing ◽

Learning Techniques


Optimizing time warp simulation with reinforcement learning techniques

2007 Winter Simulation Conference ◽

10.1109/wsc.2007.4419650 ◽

2007 ◽

Cited By ~ 9

Author(s):

Jun Wang ◽

Carl Tropper

Keyword(s):

Reinforcement Learning ◽

Time Warp ◽

Learning Techniques


How does the robot feel? Perception of valence and arousal in emotional body language

Paladyn Journal of Behavioral Robotics ◽

10.1515/pjbr-2018-0012 ◽

2018 ◽

Vol 9 (1) ◽

pp. 168-182 ◽

Cited By ~ 6

Author(s):

Mina Marmpena ◽

Angelica Lim ◽

Torbjørn S. Dahl

Keyword(s):

Human Perception ◽

Ground Truth ◽

Body Language ◽

Human Robot Interaction ◽

Behavioral Experiments ◽

Robot Interaction ◽

Expression Of Emotion ◽

Exploratory Approach ◽

Starting Point ◽

Valence And Arousal

Human-robot interaction in social robotics applications could be greatly enhanced by robotic behaviors that incorporate emotional body language. Using as our starting point a set of pre-designed, emotion-conveying animations that have been created by professional animators for the Pepper robot, we seek to explore how humans perceive their affect content, and to increase their usability by annotating them with reliable labels of valence and arousal, in a continuous interval space. We conducted an experiment with 20 participants who were presented with the animations and rated them in the two-dimensional affect space. An inter-rater reliability analysis was applied to support the aggregation of the ratings for deriving the final labels. The set of emotional body language animations with the labels of valence and arousal is available and can potentially be useful to other researchers as a ground truth for behavioral experiments on robotic expression of emotion, or for the automatic selection of robotic emotional behaviors with respect to valence and arousal. To further utilize the data we collected, we analyzed it with an exploratory approach, and we present some interesting trends with regard to the human perception of Pepper’s emotional body language, which might be worth further investigation.


On collaborative reinforcement learning to optimize the redistribution of critical medical supplies throughout the COVID-19 pandemic

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocaa324 ◽

2020 ◽

Author(s):

Bryan P Bednarski ◽

Akash Deep Singh ◽

William M Jones

Keyword(s):

Public Health ◽

Reinforcement Learning ◽

Medical Equipment ◽

Census Bureau ◽

Learning Models ◽

Public Health Emergencies ◽

Medical Supplies ◽

Learning Techniques ◽

Disease Impact ◽

Random States

Objective: This work investigates how reinforcement learning and deep learning models can facilitate the near-optimal redistribution of medical equipment in order to bolster public health responses to future crises similar to the COVID-19 pandemic. Materials and Methods: The system presented is simulated with disease impact statistics from the Institute of Health Metrics (IHME), Center for Disease Control, and Census Bureau [1, 2, 3]. We present a robust pipeline for data preprocessing, future demand inference, and a redistribution algorithm that can be adopted across broad scales and applications. Results: The reinforcement learning redistribution algorithm demonstrates performance optimality ranging from 93-95%. Performance improves consistently with the number of random states participating in exchange, demonstrating average shortage reductions of 78.74% (± 30.8) in simulations with 5 states to 93.50% (± 0.003) with 50 states. Conclusion: These findings bolster confidence that reinforcement learning techniques can reliably guide resource allocation for future public health emergencies.


Path Planning for Spheres in Three Dimensional Environments With Low Interference Index

20th Design Automation Conference: Volume 1 — Dynamic Mechanical Systems; Geometric Modeling and Features; Concurrent Engineering ◽

10.1115/detc1994-0041 ◽

1994 ◽

Author(s):

Duane W. Storti ◽

Debasish Dutta

Keyword(s):

Path Planning ◽

Free Path ◽

Configuration Space ◽

Local Knowledge ◽

Three Dimensional ◽

Target Point ◽

Planning Problem ◽

Spherical Object ◽

Starting Point ◽

Path Planning Problem

We consider the path planning problem for a spherical object moving through a three-dimensional environment composed of spherical obstacles. Given a starting point and a terminal or target point, we wish to determine a collision-free path from start to target for the moving sphere. We define an interference index to count the number of configuration space obstacles whose surfaces interfere simultaneously. In this paper, we present algorithms for navigating the sphere when the interference index is ≤ 2. While a global calculation is necessary to characterize the environment as a whole, only local knowledge is needed for path construction.
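
The configuration-space view mentioned in the abstract can be illustrated with a standard construction: a sphere of radius r moving among spherical obstacles behaves like a point moving among obstacles whose radii are each grown by r. The sketch below checks a single straight segment against such grown obstacles. It does not implement the paper's interference index or its navigation algorithms, and the function name `segment_is_free` and its parameters are assumptions made for this example.

```python
import numpy as np


def segment_is_free(start, target, centres, radii, robot_radius):
    """Check a straight path for a moving sphere against spherical obstacles.

    centres has shape (n, 3) and radii has shape (n,). Each obstacle is
    grown by robot_radius, so the moving sphere is treated as a point.
    """
    start = np.asarray(start, dtype=float)
    target = np.asarray(target, dtype=float)
    centres = np.asarray(centres, dtype=float)
    grown = np.asarray(radii, dtype=float) + robot_radius

    direction = target - start
    length_sq = direction @ direction
    if length_sq == 0.0:
        # Degenerate segment: start and target coincide.
        t = np.zeros(len(centres))
    else:
        # Parameter of the point on the segment closest to each centre.
        t = np.clip((centres - start) @ direction / length_sq, 0.0, 1.0)
    closest = start + t[:, None] * direction
    distance = np.linalg.norm(centres - closest, axis=1)
    return bool(np.all(distance > grown))


# One obstacle of radius 1 whose centre lies 3 units off the path.
obstacles = [[5.0, 3.0, 0.0]]
small_ok = segment_is_free([0, 0, 0], [10, 0, 0], obstacles, [1.0], robot_radius=1.0)
large_ok = segment_is_free([0, 0, 0], [10, 0, 0], obstacles, [1.0], robot_radius=2.5)
```

Here the smaller sphere clears the obstacle (3 > 1 + 1) while the larger one does not (3 < 1 + 2.5).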


Adaptive Look-ahead distance for Pure Pursuit Controller with Deep Reinforcement Learning Techniques

10.1145/3478586.3478600 ◽

2021 ◽

Author(s):

Aakarsh Goel ◽

Shubham Chauhan

Keyword(s):

Reinforcement Learning ◽

Look Ahead ◽

Learning Techniques ◽

Pure Pursuit
