Policy gradient stochastic approximation algorithms for adaptive control of constrained time-varying Markov decision processes
2004 ◽ Vol 49 (4) ◽ pp. 592-598