Value iteration in countable state average cost Markov decision processes with unbounded costs

Because many real applications involve a large number of states, classical methods become intractable for solving large Markov decision processes. Decomposition techniques based on the topology of the states in the associated graph, together with parallelization, are useful for coping with this problem. In this paper, the authors propose a Modified Value Iteration algorithm that incorporates parallelism. They test their implementation on artificial data using OpenMP and report a significant speed-up.
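For orientation, here is a minimal sketch of plain value iteration, not the authors' Modified Value Iteration algorithm. It assumes a finite state space, a discounted cost criterion, and NumPy; the function name `value_iteration`, the array layout, and the toy two-state model are all hypothetical. The backup for each state depends only on the previous iterate, which is the property a parallel implementation (for example with OpenMP) could exploit by splitting states across threads.

```python
import numpy as np

def value_iteration(P, c, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Synchronous value iteration for a finite discounted-cost MDP.

    P: array of shape (A, S, S), P[a, s, t] = probability of s -> t under action a.
    c: array of shape (S, A), c[s, a] = one-step cost (assumed layout).
    """
    v = np.zeros(c.shape[0])
    for _ in range(max_iter):
        # q[s, a] = c[s, a] + gamma * sum_t P[a, s, t] * v[t]
        # Each row of q uses only the old v, so rows can be computed independently.
        q = c + gamma * np.einsum("ast,t->sa", P, v)
        v_new = q.min(axis=1)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    # Greedy (cost-minimizing) policy with respect to the final value estimate.
    q = c + gamma * np.einsum("ast,t->sa", P, v)
    return v, q.argmin(axis=1)

# Hypothetical toy model: state 1 is absorbing and free; in state 0,
# action 0 stays (cost 1.0) and action 1 jumps to state 1 (cost 1.5).
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [0.0, 1.0]]])
c = np.array([[1.0, 1.5],
              [0.0, 0.0]])
v, policy = value_iteration(P, c, gamma=0.5)
```

The sweep is written Jacobi-style (all states updated from the same old iterate) rather than Gauss-Seidel-style, because that is the variant whose per-state updates are trivially independent.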


New discount and average optimality conditions for continuous-time Markov decision processes

Advances in Applied Probability ◽

10.1017/s000186780000447x ◽

2010 ◽

Vol 42 (04) ◽

pp. 953-985 ◽

Cited By ~ 2

Author(s):

Xianping Guo ◽

Liuer Ye

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Average Cost ◽

Nonnegative Solution ◽

Decision Processes ◽

Stationary Policy ◽

Discounted Cost ◽

Markov Decision ◽

Optimal Stationary Policy ◽

Bounded Below

This paper deals with continuous-time Markov decision processes in Polish spaces, under the discounted and average cost criteria. All underlying Markov processes are determined by given transition rates which are allowed to be unbounded, and the costs are assumed to be bounded below. By introducing an occupation measure of a randomized Markov policy and analyzing properties of occupation measures, we first show that the family of all randomized stationary policies is ‘sufficient’ within the class of all randomized Markov policies. Then, under the semicontinuity and compactness conditions, we prove the existence of a discounted cost optimal stationary policy by providing a value iteration technique. Moreover, by developing a new average cost, minimum nonnegative solution method, we prove the existence of an average cost optimal stationary policy under some reasonably mild conditions. Finally, we use some examples to illustrate applications of our results. Except that the costs are assumed to be bounded below, the conditions for the existence of discounted cost (or average cost) optimal policies are much weaker than those in the previous literature, and the minimum nonnegative solution approach is new.
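As a rough illustration of the value iteration technique mentioned in the abstract, the discounted-cost optimality equation and one commonly used iteration scheme can be written as follows. The notation is assumed here rather than taken from the paper: $\alpha > 0$ is the discount rate, $c(x,a)$ the cost rate, $q(\mathrm{d}y \mid x,a)$ the transition rates, $A(x)$ the admissible actions, and $m(x)$ any bound with $m(x) \ge -q(\{x\} \mid x,a)$ for all $a \in A(x)$. The paper's exact formulation and conditions may differ.

```latex
% Discounted-cost optimality equation (assumed notation):
\alpha\, V(x) \;=\; \inf_{a \in A(x)} \Big\{ c(x,a) + \int V(y)\, q(\mathrm{d}y \mid x,a) \Big\}.

% Uniformized transition kernel built from the rates:
P(\mathrm{d}y \mid x,a) \;=\; \frac{q(\mathrm{d}y \mid x,a)}{m(x)} + \delta_x(\mathrm{d}y).

% Value iteration scheme; a fixed point of this recursion satisfies the equation above:
u_{n+1}(x) \;=\; \inf_{a \in A(x)} \Big\{ \frac{c(x,a)}{\alpha + m(x)}
  + \frac{m(x)}{\alpha + m(x)} \int u_n(y)\, P(\mathrm{d}y \mid x,a) \Big\}.
```

Multiplying the recursion at a fixed point $u$ by $\alpha + m(x)$ and substituting the definition of $P$ gives $(\alpha + m(x))\,u(x) = \inf_a \{ c(x,a) + \int u(y)\, q(\mathrm{d}y \mid x,a) \} + m(x)\,u(x)$, which reduces to the optimality equation.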


Constrained Optimization for Average Cost Continuous-Time Markov Decision Processes

IEEE Transactions on Automatic Control ◽

10.1109/tac.2007.899040 ◽

2007 ◽

Vol 52 (6) ◽

pp. 1139-1143 ◽

Cited By ~ 20

Author(s):

Xianping Guo

Keyword(s):

Constrained Optimization ◽

Markov Decision Processes ◽

Continuous Time ◽

Average Cost ◽

Decision Processes ◽

Markov Decision


New discount and average optimality conditions for continuous-time Markov decision processes

Advances in Applied Probability ◽

10.1239/aap/1293113146 ◽

2010 ◽

Vol 42 (4) ◽

pp. 953-985 ◽

Cited By ~ 9

Author(s):

Xianping Guo ◽

Liuer Ye

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Average Cost ◽

Nonnegative Solution ◽

Decision Processes ◽

Stationary Policy ◽

Discounted Cost ◽

Markov Decision ◽

Optimal Stationary Policy ◽

Bounded Below

