Self-Partitioning State Space for Behavior Acquisition of Vision-Based Mobile Robots

2001 ◽  
Vol 13 (6) ◽  
pp. 625-636 ◽  
Author(s):  
Takayuki Nakamura ◽  
Tsukasa Ogasawara

The input generalization problem is one of the most important issues in applying reinforcement learning to real robot tasks. To cope with this problem, we propose a self-partitioning state-space algorithm that performs non-uniform quantization of the state space. To show that our algorithm has generalization capability, we apply it to two tasks in which a soccer robot shoots a ball into a goal and prevents a ball from entering its own goal. Experimental results from computer simulation and a real robot demonstrate the validity of the method.
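The abstract gives no implementation details, so the following is only a minimal sketch of the underlying idea of non-uniform quantization: a state-space cell is split when the returns observed inside it disagree, so resolution grows only where the value function varies. The splitting test, thresholds, and cell structure here are assumptions, not the authors' algorithm.

```python
# Hypothetical sketch of self-partitioning quantization: split a state
# cell when returns observed inside it are inconsistent, so the table
# becomes fine-grained only where it matters.
import numpy as np

class Cell:
    def __init__(self, low, high):
        self.low = np.asarray(low, float)
        self.high = np.asarray(high, float)
        self.q = 0.0            # value estimate for this region
        self.returns = []       # returns observed inside this region
        self.children = None

    def contains(self, s):
        return bool(np.all(s >= self.low) and np.all(s < self.high))

    def find(self, s):
        if self.children is None:
            return self
        for child in self.children:
            if child.contains(s):
                return child.find(s)
        return self

    def split(self):
        # Halve the cell along its widest dimension.
        d = int(np.argmax(self.high - self.low))
        mid = 0.5 * (self.low[d] + self.high[d])
        upper_of_lo = self.high.copy(); upper_of_lo[d] = mid
        lower_of_hi = self.low.copy();  lower_of_hi[d] = mid
        self.children = [Cell(self.low, upper_of_lo), Cell(lower_of_hi, self.high)]
        for child in self.children:
            child.q = self.q    # children inherit the parent's estimate

class SelfPartitioningTable:
    def __init__(self, low, high, split_std=0.5, min_samples=20):
        self.root = Cell(low, high)
        self.split_std = split_std        # assumed inconsistency threshold
        self.min_samples = min_samples

    def update(self, state, ret, alpha=0.1):
        cell = self.root.find(np.asarray(state, float))
        cell.q += alpha * (ret - cell.q)
        cell.returns.append(ret)
        if len(cell.returns) >= self.min_samples and np.std(cell.returns) > self.split_std:
            cell.split()                  # refine the quantization here
```

Regions where outcomes are consistent stay coarse, while regions near decision boundaries, such as the angles at which a shot succeeds or fails, are subdivided.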

2011 ◽  
Vol 216 ◽  
pp. 75-80 ◽  
Author(s):  
Chang An Liu ◽  
Fei Liu ◽  
Chun Yang Liu ◽  
Hua Wu

To address the curse-of-dimensionality problem in multi-agent reinforcement learning, this paper presents a learning method based on k-means clustering. The environmental state is represented by key state factors, and state-space explosion is avoided by grouping states into clusters with k-means. The learning rate is improved by assigning new states to existing clusters together with the corresponding strategies. Experimental results on multi-robot cooperation show that, compared with traditional Q-learning, our scheme improves the team's learning ability and enhances cooperation efficiency.
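As a rough illustration of the idea, the sketch below clusters logged state vectors with k-means and runs tabular Q-learning over cluster indices; the state dimensionality, number of clusters, and learning parameters are placeholder assumptions, not the paper's setup.

```python
# Minimal sketch: compress a continuous state space into k clusters and
# learn a Q-table over cluster indices instead of raw states.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
logged_states = rng.uniform(-1.0, 1.0, size=(5000, 4))  # assumed key state factors
k, n_actions = 32, 4

kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(logged_states)
Q = np.zeros((k, n_actions))

def discretize(state):
    # Map a raw state to the index of its nearest cluster center.
    return int(kmeans.predict(np.asarray(state, float).reshape(1, -1))[0])

def q_update(s, a, r, s_next, alpha=0.1, gamma=0.95):
    i, j = discretize(s), discretize(s_next)
    Q[i, a] += alpha * (r + gamma * Q[j].max() - Q[i, a])
```

The Q-table then has k rows no matter how many raw states the robots encounter, which is where the savings over a full joint-state table come from.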


2014 ◽  
Vol 2014 ◽  
pp. 1-8 ◽  
Author(s):  
Yong Song ◽  
Yibin Li ◽  
Xiaoli Wang ◽  
Xin Ma ◽  
Jiuhong Ruan

Reinforcement learning for multiple robots becomes very slow as the number of robots grows, since the state space increases exponentially. A sequential Q-learning algorithm based on knowledge sharing is presented. A rule repository of robot behaviors is first initialized in the reinforcement learning process. Mobile robots obtain the current environmental state through their sensors; the state is then matched against the database to determine whether a relevant behavior rule has already been stored. If a matching rule exists, an action is chosen according to the stored knowledge and the rule's matching weight is refined; otherwise the new rule is appended to the database. The robots learn in a given sequence and share the behavior database. We evaluate the algorithm on a multirobot following-surrounding task and find that the improved algorithm effectively accelerates convergence.
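The abstract describes the method only at a high level; the sketch below is one plausible reading of the rule repository, with a hypothetical Euclidean matching radius and a simple increment as the weight refinement.

```python
# Hypothetical sketch of the shared behavior rule repository: each rule
# stores a prototype state, an action, and a matching weight.
import numpy as np

class RuleRepository:
    def __init__(self, match_radius=0.2):
        self.rules = []                   # entries: [prototype, action, weight]
        self.match_radius = match_radius  # assumed matching threshold

    def match(self, state):
        state = np.asarray(state, float)
        for rule in self.rules:
            if np.linalg.norm(rule[0] - state) < self.match_radius:
                return rule
        return None

    def act(self, state, fallback_action):
        rule = self.match(state)
        if rule is not None:
            rule[2] += 1.0                # refine the matching weight
            return rule[1]                # reuse the stored behavior
        # No stored rule matches: append the new rule and explore.
        self.rules.append([np.asarray(state, float), fallback_action, 1.0])
        return fallback_action
```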


2021 ◽  
Vol 18 (1) ◽  
pp. 172988142199262
Author(s):  
Matej Dobrevski ◽  
Danijel Skočaj

Mobile robots that operate in real-world environments need to be able to safely navigate their surroundings. Obstacle avoidance and path planning are crucial capabilities for achieving autonomy of such systems. However, in new or dynamic environments, navigation methods that rely on an explicit map of the environment can be impractical or even impossible to use. We present a new local navigation method for steering the robot to global goals without relying on an explicit map of the environment. The proposed navigation model is trained in a deep reinforcement learning framework based on the Advantage Actor-Critic method and directly translates robot observations to movement commands. We evaluate and compare the proposed navigation method with standard map-based approaches on several navigation scenarios in simulation and demonstrate that our method can still navigate the robot when no map is available or when the map is corrupted, situations in which the standard approaches fail. We also show that our method can be transferred directly to a real robot.
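For orientation, here is a minimal actor-critic model in PyTorch that maps a laser scan plus a goal vector to a distribution over movement commands and a value estimate; the observation format, layer sizes, and discrete action set are assumptions, not the authors' architecture.

```python
# Minimal sketch of a map-free navigation policy trained with advantage
# actor-critic: observations in, movement command distribution out.
import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    def __init__(self, scan_dim=360, goal_dim=2, n_actions=5):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(scan_dim + goal_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
        )
        self.policy_head = nn.Linear(128, n_actions)  # actor
        self.value_head = nn.Linear(128, 1)           # critic

    def forward(self, scan, goal):
        h = self.trunk(torch.cat([scan, goal], dim=-1))
        dist = torch.distributions.Categorical(logits=self.policy_head(h))
        return dist, self.value_head(h)

model = ActorCritic()
dist, value = model(torch.zeros(1, 360), torch.zeros(1, 2))
action = dist.sample()  # index into a discrete set of movement commands
```

In advantage actor-critic, the policy gradient for each step is weighted by the advantage r + γV(s′) − V(s), so the critic head is trained on the same trunk features as the actor.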


2012 ◽  
Vol 588-589 ◽  
pp. 1515-1518
Author(s):  
Yong Song ◽  
Bing Liu ◽  
Yi Bin Li

Reinforcement learning for multi-robot systems may become very slow as the number of robots grows, since the state space increases exponentially. A sequential Q-learning algorithm based on knowledge sharing is presented. A rule repository of robot behaviors is first initialized in the reinforcement learning process. Mobile robots obtain the current environmental state through their sensors; the state is then matched against the database to determine whether a relevant behavior rule has already been stored. If a matching rule exists, an action is chosen according to the stored knowledge and the rule's matching weight is refined; otherwise the new rule is added to the database. The robots learn in a given sequence and share the behavior database. We evaluate the algorithm on a multi-robot following-surrounding task and find that the improved algorithm effectively accelerates convergence.
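This entry describes the same method as the 2014 paper above; reusing the RuleRepository sketch shown earlier, the fragment below illustrates the complementary piece, the sequential schedule: robots take turns acting against one shared database, so rules stored by earlier learners are immediately visible to later ones. The toy environment is purely an assumption for illustration.

```python
# Hypothetical sequential learning loop over a shared rule database.
import random

class ToyEnv:
    n_actions = 4
    def observe(self, robot_id):
        # Placeholder sensor reading; a real robot would use its sensors.
        return [random.uniform(-1, 1), random.uniform(-1, 1)]

def sequential_learning(robots, repository, env, episodes=100):
    for _ in range(episodes):
        for robot_id in robots:           # fixed learning order
            state = env.observe(robot_id)
            # Reuse a matching rule if one exists; otherwise explore and
            # append the new rule to the shared database.
            repository.act(state, fallback_action=random.randrange(env.n_actions))

sequential_learning(robots=[0, 1, 2], repository=RuleRepository(), env=ToyEnv())
```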


Symmetry ◽  
2018 ◽  
Vol 10 (10) ◽  
pp. 461 ◽  
Author(s):  
David Luviano-Cruz ◽  
Francesco Garcia-Luna ◽  
Luis Pérez-Domínguez ◽  
S. Gadi

A multi-agent system (MAS) can address tasks in a variety of domains without preprogrammed behaviors, which makes it well suited to problems involving mobile robots. Reinforcement learning (RL) is a successful approach for acquiring new behaviors in MASs, but most such methods store exact Q-values over small, discrete state and action spaces. This article presents a linearly fuzzified joint Q-function for a MAS with a continuous state space, which overcomes the dimensionality problem. The article also proves the existence and convergence of the solution produced by the proposed algorithm, and reports numerical simulations and experimental results that validate it.
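The sketch below shows generic fuzzy Q-learning with a linear parameterization: normalized firing strengths of triangular membership functions weight per-rule Q-parameters. The membership layout, two-dimensional state, and update rule are assumptions; they illustrate the linear fuzzification, not the paper's convergence proof.

```python
# Minimal sketch of a linearly fuzzified Q-function over a continuous
# state space (assumed 2-D, with components in [-1, 1]).
import itertools
import numpy as np

centers = np.linspace(-1.0, 1.0, 5)   # membership centers per dimension
width = centers[1] - centers[0]

def memberships(x):
    # Triangular membership of scalar x in each fuzzy set.
    return np.clip(1.0 - np.abs(x - centers) / width, 0.0, None)

def firing_strengths(state):
    # Product T-norm across dimensions, normalized to sum to 1.
    per_dim = [memberships(x) for x in state]
    w = np.array([np.prod(combo) for combo in itertools.product(*per_dim)])
    return w / w.sum()

n_rules, n_actions = len(centers) ** 2, 4
theta = np.zeros((n_rules, n_actions))  # per-rule Q-parameters

def q_values(state):
    return firing_strengths(state) @ theta  # Q(s, .), linear in theta

def update(s, a, r, s_next, alpha=0.05, gamma=0.95):
    w = firing_strengths(s)
    td = r + gamma * q_values(s_next).max() - w @ theta[:, a]
    theta[:, a] += alpha * td * w
```

Because Q(s, a) is linear in theta, the continuous state never has to be enumerated; only the n_rules-by-n_actions parameter table is stored.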


2012 ◽  
Vol 182-183 ◽  
pp. 1751-1755
Author(s):  
Xi Feng Zheng ◽  
Feng Chang

A method based on computer simulation is proposed for correcting the image of an LED display. First, the development of LED display panels is reviewed. Second, the causes of the severe non-uniformity of images on LED display panels are analyzed, existing correction techniques for reducing this non-uniformity are introduced, and their shortcomings are pointed out. Third, the principle of the simulation-based correction method is described in detail in two steps: luminance collection and luminance coupling. Fourth, the implementation of the method is described following these two steps. Finally, the method is applied to an LED display panel with a resolution of 640×480. Experimental results show that the method reduces the image non-uniformity from 11.06% to 0.98%.
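As a rough numerical illustration of the correction idea, the sketch below derives a per-pixel gain that pulls every pixel toward a common luminance target; the synthetic measurement data, the 8-bit gain precision, and the (max − min)/(max + min) definition of non-uniformity are assumptions, not the paper's procedure.

```python
# Hypothetical sketch of per-pixel uniformity correction for an LED panel.
import numpy as np

rng = np.random.default_rng(1)
# Synthetic luminance map for a 640x480 panel with random pixel-to-pixel
# variation; a real panel would use camera-based luminance capture.
measured = 100.0 * (1.0 + rng.normal(0.0, 0.05, size=(480, 640)))

target = measured.min()                    # correct downward only
gain = np.clip(target / measured, 0.0, 1.0)
gain = np.round(gain * 255) / 255          # gains stored at finite precision

def non_uniformity(lum):
    return (lum.max() - lum.min()) / (lum.max() + lum.min())

corrected = measured * gain
print(f"before: {non_uniformity(measured):.2%}  after: {non_uniformity(corrected):.2%}")
```

The residual non-uniformity after correction is bounded by the precision at which the gains are stored, which is why real correction systems never reach exactly zero.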

