A Pre-Trained Fuzzy Reinforcement Learning Method for the Pursuing Satellite in a One-to-One Game in Space

Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2253
Author(s):  
Xiao Wang ◽  
Peng Shi ◽  
Yushan Zhao ◽  
Yue Sun

To help the pursuer find an advantageous control policy in a one-to-one game in space, this paper proposes a pre-trained fuzzy reinforcement learning algorithm that is run separately in the x, y, and z channels. Unlike previous algorithms, which were applied to ground games, this is the first time reinforcement learning has been used to help a pursuer in space optimize its control policy. The known part of the environment is used to pre-train the pursuer's consequent set before learning. An actor-critic framework is built in each motion channel of the pursuer, and the consequent set is updated by the gradient descent method within the fuzzy inference systems. Numerical experiments validate the effectiveness of the proposed algorithm in improving the game ability of the pursuer.
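As a rough illustration of updating a fuzzy consequent set by gradient descent, the sketch below uses a single-input, zero-order Takagi-Sugeno system. It is not the paper's algorithm: the Gaussian membership functions, the rule centers, the learning rate, and the supervised target that stands in for the actor-critic learning signal are all assumptions made for this example.

```python
import math

def gaussian(x, center, width):
    """Gaussian membership degree of x for one fuzzy set (assumed shape)."""
    return math.exp(-((x - center) / width) ** 2)

def firing_strengths(x, centers, width):
    """Normalized firing strength of each rule for a scalar input x."""
    raw = [gaussian(x, c, width) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]

def infer(x, centers, width, consequents):
    """Zero-order TSK output: firing-strength-weighted sum of consequents."""
    phi = firing_strengths(x, centers, width)
    return sum(p * k for p, k in zip(phi, consequents))

def update_consequents(x, target, centers, width, consequents, lr=0.5):
    """One gradient descent step on 0.5 * (target - output) ** 2.

    The derivative of the output with respect to consequent k_l is the
    normalized firing strength phi_l, so k_l moves by lr * error * phi_l.
    """
    phi = firing_strengths(x, centers, width)
    error = target - infer(x, centers, width, consequents)
    return [k + lr * error * p for k, p in zip(consequents, phi)]

centers = [-1.0, 0.0, 1.0]        # hypothetical rule centers for one channel
width = 0.5                       # hypothetical membership width
consequents = [0.0, 0.0, 0.0]     # consequent set to be learned
# Hypothetical (state, desired control) pairs standing in for a learning signal.
samples = [(-1.0, 2.0), (0.0, 0.0), (1.0, -2.0)]

for _ in range(200):
    for x, target in samples:
        consequents = update_consequents(x, target, centers, width, consequents)
```

In an actor-critic setting, the supervised error used here would be replaced by a signal derived from the critic, and one such system would be kept per motion channel.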

2019 ◽  
Vol 9 (21) ◽  
pp. 4568
Author(s):  
Hyeyoung Park ◽  
Kwanyong Lee

The gradient descent method is an essential algorithm for training neural networks. Among the many variants of gradient descent developed to accelerate learning, natural gradient learning is based on the theory of information geometry on the stochastic neuromanifold and is known to have ideal convergence properties. Despite its theoretical advantages, the pure natural gradient has limitations that prevent its practical use. Obtaining the explicit value of the natural gradient requires knowing the true probability distribution of the input variables and inverting a matrix whose size is the square of the number of parameters. Although an adaptive estimation of the natural gradient has been proposed as a solution, it was originally developed for the online learning mode, which is computationally inefficient for learning from large data sets. In this paper, we propose a novel adaptive natural gradient estimation for the mini-batch learning mode, which is commonly adopted for big data analysis. For two representative stochastic neural network models, we present explicit parameter update rules and the learning algorithm. Through experiments on three benchmark problems, we confirm that the proposed method has convergence properties superior to those of conventional methods.
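To make the natural gradient update concrete, the sketch below applies theta <- theta - lr * F^{-1} * grad in mini-batch mode to a toy model, a one-dimensional Gaussian with parameters (mu, sigma), whose Fisher information matrix is known in closed form as diag(1/sigma^2, 2/sigma^2). This is not the adaptive estimator proposed in the paper, which avoids computing the Fisher matrix and its inverse explicitly; the model, data, batch size, and learning rate are assumptions chosen for illustration.

```python
import random

def batch_gradient(batch, mu, sigma):
    """Mean gradient of the negative log-likelihood of N(mu, sigma^2)."""
    n = len(batch)
    g_mu = sum(-(x - mu) / sigma**2 for x in batch) / n
    g_sigma = sum(1.0 / sigma - (x - mu) ** 2 / sigma**3 for x in batch) / n
    return g_mu, g_sigma

def natural_gradient_step(batch, mu, sigma, lr=0.5):
    """theta <- theta - lr * F^{-1} * grad, with F = diag(1/sigma^2, 2/sigma^2)."""
    g_mu, g_sigma = batch_gradient(batch, mu, sigma)
    nat_mu = sigma**2 * g_mu               # F^{-1} applied to the mu component
    nat_sigma = 0.5 * sigma**2 * g_sigma   # F^{-1} applied to the sigma component
    return mu - lr * nat_mu, sigma - lr * nat_sigma

random.seed(0)
# Synthetic data for the example: samples from N(3, 2^2).
data = [random.gauss(3.0, 2.0) for _ in range(2000)]

mu, sigma = 0.0, 1.0
for epoch in range(20):
    random.shuffle(data)
    for start in range(0, len(data), 100):   # mini-batches of 100 samples
        mu, sigma = natural_gradient_step(data[start:start + 100], mu, sigma)
```

For this model the preconditioned step for mu reduces to moving mu toward the batch mean, independent of sigma, which shows how multiplying by the inverse Fisher matrix removes the dependence on parameter scaling.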


Author(s):  
Naoyoshi Yubazaki ◽  
Jianqiang Yi ◽  
Kaoru Hirota ◽  

A new fuzzy inference model, the SIRMs (Single Input Rule Modules) Connected Fuzzy Inference Model, is proposed for multi-input fuzzy control. For each input item, an importance degree is defined and a single-input fuzzy rule module is constructed. The importance degrees control the roles of the input items in the system. The model output is obtained by summing the products of the importance degree and the fuzzy inference result of each SIRM. The proposed model needs very few rules and parameters, and the rules can be designed much more easily. The new model is first applied to typical second-order lag systems. The simulation results show that the proposed model can largely improve control performance compared with the conventional fuzzy inference model. A tuning algorithm based on the gradient descent method is then given and used to adjust the parameters of the proposed model for identifying 4-input 1-output nonlinear functions. The identification results indicate that the proposed model is also able to identify nonlinear systems.
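The output rule described above, a sum over inputs of importance degree times the inference result of that input's rule module, can be sketched as follows. The triangular membership functions, the singleton consequents, the weighted-average inference inside each module, and all numeric values are assumptions for illustration and are not taken from the paper.

```python
def triangle(x, left, peak, right):
    """Triangular membership degree of x (assumed membership shape)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def sirm_output(x, rules):
    """Inference result of one single-input rule module.

    rules is a list of ((left, peak, right), consequent) pairs; the result
    is the membership-weighted average of the singleton consequents.
    """
    degrees = [triangle(x, *mf) for mf, _ in rules]
    total = sum(degrees)
    if total == 0.0:
        return 0.0  # input lies outside every fuzzy set
    return sum(d * c for d, (_, c) in zip(degrees, rules)) / total

def sirms_model(inputs, modules, importance):
    """Model output: sum over inputs of importance degree times SIRM result."""
    return sum(w * sirm_output(x, rules)
               for x, rules, w in zip(inputs, modules, importance))

# Hypothetical three-rule module (negative, zero, positive) on [-1, 1].
rules = [((-2.0, -1.0, 0.0), -1.0),
         ((-1.0, 0.0, 1.0), 0.0),
         ((0.0, 1.0, 2.0), 1.0)]

# Two inputs sharing the same module, with hypothetical importance degrees.
output = sirms_model([0.5, -0.25], [rules, rules], [0.8, 0.2])
```

With n inputs and three rules per module, this structure needs 3n rules in total, whereas a conventional rule table over all input combinations would need 3^n, which is the saving in rule count that the abstract refers to.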


2009 ◽  
Vol 2009 ◽  
pp. 1-11 ◽  
Author(s):  
Jun Namikawa ◽  
Jun Tani

The present paper proposes a recurrent neural network model and learning algorithm that can acquire the ability to generate desired multiple sequences. The network model is a dynamical system in which the transition function is a contraction mapping, and the learning algorithm is based on the gradient descent method. We show a numerical simulation in which a recurrent neural network obtains a multiple periodic attractor consisting of five Lissajous curves, or a Van der Pol oscillator with twelve different parameters. The present analysis clarifies that the model contains many stable regions as attractors, and multiple time series can be embedded into these regions by using the present learning method.

