Reinforcement Learning Approach to AIBO Robot's Decision Making Process in Robosoccer's Goal Keeper Problem

Author(s):  
Subhasis Mukherjee ◽  
John Yearwood ◽  
Peter Vamplew ◽  
Shamsul Huda
2020 ◽  
Vol 34 (04) ◽  
pp. 6210-6218
Author(s):  
Jun Wang ◽  
Hefu Zhang ◽  
Qi Liu ◽  
Zhen Pan ◽  
Hanqing Tao

Recent years have witnessed increasing interest in research on crowdfunding mechanisms. In this area, dynamics tracking is a significant issue but is still under exploration. Existing studies either fit the fluctuations of time series or employ regularization terms to constrain learned tendencies. However, few of them take into account the inherent decision-making process between investors and crowdfunding dynamics. To address this problem, we propose a Trajectory-based Continuous Control for Crowdfunding (TC3) algorithm to predict funding progress in crowdfunding. Specifically, actor-critic frameworks are employed to model the relationship between investors and campaigns, where all of the investors are viewed as an agent that interacts with an environment derived from the real dynamics of campaigns. Then, to further explore the implications of patterns (i.e., typical characters) in funding series, we propose to subdivide them into fast-growing and slow-growing ones. Moreover, to switch between different kinds of patterns, the actor component of TC3 is extended with a structure of options, yielding TC3-Options. Finally, extensive experiments on the Indiegogo dataset not only demonstrate the effectiveness of our methods, but also validate our assumption that the entire pattern learned by TC3-Options is indeed U-shaped.


Energies ◽  
2019 ◽  
Vol 12 (8) ◽  
pp. 1556 ◽  
Author(s):  
Cao ◽  
Zhang ◽  
Xiao ◽  
Hua

The existence of a high proportion of distributed energy resources in energy Internet (EI) scenarios has a strong impact on the power supply-demand balance of the EI system. Decision-making optimization research that focuses on transient voltage stability is of great significance for maintaining effective and safe operation of the EI. Within a typical EI scenario, this paper conducts a study of transient voltage stability analysis based on convolutional neural networks. Based on the judgment of transient voltage stability, a reactive power compensation decision optimization algorithm using a deep reinforcement learning approach is proposed. As a result, the following targets are achieved: the efficiency of decision-making is greatly improved, risks are identified in advance, and decisions are made in time. Simulations show the effectiveness of our proposed method.


2013 ◽  
Vol 92 (1) ◽  
pp. 5-39 ◽  
Author(s):  
Markus Peters ◽  
Wolfgang Ketter ◽  
Maytal Saar-Tsechansky ◽  
John Collins
