Estimate and approximate policy iteration algorithm for discounted Markov decision models with bounded costs and Borel spaces. 2014, Vol 50, pp. 763-803.
The policy iteration algorithm for average reward Markov decision processes with general state space. 1997, Vol 42 (12), pp. 1663-1680.
2016, Vol 133 (10), pp. 28-33.
2009, Vol 34, pp. 89-132.
2014, Vol 287 (04), pp. 103-124.