A LEARNING ALGORITHM FOR COMMUNICATING MARKOV DECISION PROCESSES WITH UNKNOWN TRANSITION MATRICES

DOI: 10.5109/16771
2007, Vol 39, pp. 11-24
Author(s): Tetsuichiro Iki, Masayuki Horiguchi, Masami Yasuda, Masami Kurano
1987, Vol 24 (01), pp. 270-276
Author(s): Masami Kurano

This study is concerned with finite Markov decision processes whose dynamics and reward structure are unknown but whose state is exactly observable. We establish a learning algorithm that yields an optimal policy, and we construct an adaptive policy that is optimal under the average expected reward criterion.
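
To make the setting concrete, the following is a minimal Python sketch of a generic certainty-equivalence approach: estimate the transition matrices from observed counts, then solve the estimated model under the average reward criterion by relative value iteration. It is not the algorithm of the papers listed above. The function names, the uniform fallback for unvisited state-action pairs, and the assumption that the reward table `r` is already known are all illustrative choices, and the iteration is only expected to converge when the estimated model is communicating and aperiodic.

```python
import numpy as np


def estimate_transitions(counts):
    """Empirical transition estimate from counts[s, a, s_next].

    Assumption for illustration: unvisited (s, a) pairs fall back to a
    uniform distribution over next states.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=2, keepdims=True)
    n_states = counts.shape[2]
    uniform = np.full_like(counts, 1.0 / n_states)
    return np.where(totals > 0, counts / np.maximum(totals, 1.0), uniform)


def relative_value_iteration(P, r, ref_state=0, tol=1e-10, max_iter=10_000):
    """Approximate the optimal gain, a bias vector, and a greedy policy.

    P has shape (S, A, S) and r has shape (S, A). Convergence is assumed,
    not checked; it can fail for periodic models.
    """
    h = np.zeros(P.shape[0])
    gain = 0.0
    for _ in range(max_iter):
        q = r + P @ h            # one-step lookahead, shape (S, A)
        v = q.max(axis=1)
        gain = v[ref_state]      # normalise at the reference state
        h_new = v - gain
        converged = np.max(np.abs(h_new - h)) < tol
        h = h_new
        if converged:
            break
    policy = (r + P @ h).argmax(axis=1)
    return gain, h, policy
```

In an adaptive scheme of this kind, the counts would be updated after each observed transition and the policy recomputed from the refreshed estimate; how often to recompute and how to keep exploring are design choices this sketch leaves open.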

