Experimental study of the eligibility traces in complex valued reinforcement learning

Author(s):  
Takeshi Shibuya ◽  
Shingo Shimada ◽  
Tomoki Hamagami
Author(s):  
Masashi Sugimoto ◽  
Shunsuke Inada ◽  
Haruka Matsufuji ◽  
Shiro Urushihara ◽  
Kazunori Hosotani ◽  
...  

2021 ◽  
Author(s):  
André Quadros ◽  
Roberto Xavier Junior ◽  
Kleber Souza ◽  
Bruno Gomes ◽  
Filipe Saraiva ◽  
...  

Reinforcement learning has evolved in recent years, overcoming challenges found in this field. This area, unlike conventional machine learning, does not learn through a set of observational instances, but through interaction with an environment. The sampling efficiency of a reinforcement learning agent is a challenge: that is, how to make an agent learn within an environment with as little interaction as possible. In this work we perform an experimental study on the difficulties of integrating an intrinsic motivation strategy into an actor-critic agent to improve sampling efficiency. Our results point to the effectiveness of intrinsic motivation as an approach to improve the agent's sampling efficiency, as well as its performance. We share practical guidelines to assist in the implementation of actor-critic agents that deal with sparse reward environments while making use of intrinsic motivation feedback.
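The abstract does not specify which intrinsic motivation signal the authors integrate, so the following is only a minimal sketch of one common family: a count-based novelty bonus added to the extrinsic reward before the actor-critic update. All names and the discretization choice here are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict
import math

class CountBasedBonus:
    """Intrinsic reward r_int = beta / sqrt(N(s)): rarely visited states
    earn larger bonuses, which helps in sparse-reward environments."""

    def __init__(self, beta=0.1):
        self.beta = beta
        self.counts = defaultdict(int)  # visit counts per discretized state

    def reward(self, state):
        key = tuple(state)              # hashable key for the (discrete) state
        self.counts[key] += 1
        return self.beta / math.sqrt(self.counts[key])

# The agent would train on r_total = r_extrinsic + bonus.reward(state).
bonus = CountBasedBonus(beta=0.1)
r1 = bonus.reward([0, 0])   # first visit: bonus is largest
r2 = bonus.reward([0, 0])   # repeat visit: bonus decays
```

The bonus decays with repeat visits, so exploration pressure fades as the state space is covered; continuous observations would need a discretization or density model before counting.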


2021 ◽  
Vol 11 (21) ◽  
pp. 10337
Author(s):  
Junkai Ren ◽  
Yujun Zeng ◽  
Sihang Zhou ◽  
Yichuan Zhang

Scaling end-to-end learning to control robots from vision inputs is a challenging problem in the field of deep reinforcement learning (DRL). While achieving remarkable success in complex sequential tasks, vision-based DRL remains extremely data-inefficient, especially when dealing with high-dimensional pixel inputs. Many recent studies have tried to leverage state representation learning (SRL) to break through this barrier; some even help the agent learn from pixels as efficiently as from states. Reproducing existing work, accurately judging the improvements offered by novel methods, and applying these approaches to new tasks are vital for sustaining this progress. However, meeting the demands of these three aspects is seldom straightforward. Without clear criteria and tighter standardization of experimental reporting, it is difficult to determine whether improvements over previous methods are meaningful. For this reason, we conducted ablation studies on hyperparameters, embedding network architecture, embedding dimension, regularization methods, sample quality, and SRL methods to systematically compare and analyze their effects on representation learning and reinforcement learning. Three evaluation metrics are summarized; five baseline algorithms (both value-based and policy-based) and eight tasks are adopted to avoid the particularity of any single experimental setting. We highlight the variability in reported methods and, based on a wide range of experimental analyses, suggest guidelines to make future results in SRL more reproducible and stable. We aim to spur discussion about how to assure continued progress in the field by minimizing wasted effort stemming from results that are non-reproducible or easily misinterpreted.
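To make the SRL setup the abstract ablates concrete: an encoder maps a high-dimensional pixel observation to a low-dimensional embedding, and the RL policy consumes the embedding instead of raw pixels. The sketch below uses a fixed random linear projection purely as a stand-in for a learned encoder; the frame size, embedding dimension, and all names are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "pixel" observation: an 84x84 grayscale frame, flattened to a vector.
obs = rng.random((84, 84)).reshape(-1)      # 7056-dimensional input

# Stand-in for a learned SRL encoder: a linear projection down to the
# embedding dimension the RL policy would actually consume.
embed_dim = 50
W = rng.standard_normal((embed_dim, obs.size)) / np.sqrt(obs.size)
z = W @ obs                                  # 50-dimensional embedding

print(obs.size, z.shape)                     # prints: 7056 (50,)
```

In the studies the abstract describes, the interesting questions are precisely the ones this sketch fixes arbitrarily: the encoder architecture, the embedding dimension, and how the representation is regularized and trained.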


IEEE Access ◽  
2016 ◽  
Vol 4 ◽  
pp. 6304-6324 ◽  
Author(s):  
Aqeel Raza Syed ◽  
Kok-Lim Alvin Yau ◽  
Junaid Qadir ◽  
Hafizal Mohamad ◽  
Nordin Ramli ◽  
...  
