A new condition for the existence of optimum stationary policies in average cost Markov decision processes - Unbounded cost case

Denumerable state continuous time Markov decision processes with unbounded cost and transition rates under average criterion

The ANZIAM Journal ◽

10.1017/s144618110001213x ◽

2002 ◽

Vol 43 (4) ◽

pp. 541-557 ◽

Cited By ~ 10

Author(s):

Xianping Guo ◽

Weiping Zhu

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Decision Processes ◽

Transition Rates ◽

Birth And Death Processes ◽

Optimality Equation ◽

Average Criterion ◽

Markov Decision ◽

Unbounded Cost ◽

Queue Model

Abstract: In this paper, we consider denumerable state continuous time Markov decision processes with (possibly unbounded) transition and cost rates under average criterion. We present a set of conditions and prove the existence of both average cost optimal stationary policies and a solution of the average optimality equation under the conditions. The results in this paper are applied to an admission control queue model and controlled birth and death processes.


New discount and average optimality conditions for continuous-time Markov decision processes

Advances in Applied Probability ◽

10.1017/s000186780000447x ◽

2010 ◽

Vol 42 (04) ◽

pp. 953-985 ◽

Cited By ~ 2

Author(s):

Xianping Guo ◽

Liuer Ye

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Average Cost ◽

Nonnegative Solution ◽

Decision Processes ◽

Stationary Policy ◽

Discounted Cost ◽

Markov Decision ◽

Optimal Stationary Policy ◽

Bounded Below

This paper deals with continuous-time Markov decision processes in Polish spaces, under the discounted and average cost criteria. All underlying Markov processes are determined by given transition rates which are allowed to be unbounded, and the costs are assumed to be bounded below. By introducing an occupation measure of a randomized Markov policy and analyzing properties of occupation measures, we first show that the family of all randomized stationary policies is ‘sufficient’ within the class of all randomized Markov policies. Then, under the semicontinuity and compactness conditions, we prove the existence of a discounted cost optimal stationary policy by providing a value iteration technique. Moreover, by developing a new average cost, minimum nonnegative solution method, we prove the existence of an average cost optimal stationary policy under some reasonably mild conditions. Finally, we use some examples to illustrate applications of our results. Except that the costs are assumed to be bounded below, the conditions for the existence of discounted cost (or average cost) optimal policies are much weaker than those in the previous literature, and the minimum nonnegative solution approach is new.


Constrained Optimization for Average Cost Continuous-Time Markov Decision Processes

IEEE Transactions on Automatic Control ◽

10.1109/tac.2007.899040 ◽

2007 ◽

Vol 52 (6) ◽

pp. 1139-1143 ◽

Cited By ~ 20

Author(s):

Xianping Guo

Keyword(s):

Constrained Optimization ◽

Markov Decision Processes ◽

Continuous Time ◽

Average Cost ◽

Decision Processes ◽

Markov Decision


Unbounded cost Markov decision processes with limsup and liminf average criteria: new conditions

Mathematical Methods of Operations Research ◽

10.1007/s001860400408 ◽

2005 ◽

Vol 61 (3) ◽

pp. 469-482 ◽

Cited By ~ 9

Author(s):

Quanxin Zhu ◽

Xianping Guo ◽

Yonglong Dai

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Markov Decision ◽

Unbounded Cost


New discount and average optimality conditions for continuous-time Markov decision processes

Advances in Applied Probability ◽

10.1239/aap/1293113146 ◽

2010 ◽

Vol 42 (4) ◽

pp. 953-985 ◽

Cited By ~ 9

Author(s):

Xianping Guo ◽

Liuer Ye

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Average Cost ◽

Nonnegative Solution ◽

Decision Processes ◽

Stationary Policy ◽

Discounted Cost ◽

Markov Decision ◽

Optimal Stationary Policy ◽

Bounded Below



Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes

Discrete Event Dynamic Systems ◽

10.1007/s10626-006-0003-y ◽

2007 ◽

Vol 17 (1) ◽

pp. 23-52 ◽

Cited By ~ 9

Author(s):

Mohammed Shahid Abdulla ◽

Shalabh Bhatnagar

Keyword(s):

Reinforcement Learning ◽

Markov Decision Processes ◽

Average Cost ◽

Decision Processes ◽

Markov Decision


Solution to the risk-sensitive average cost optimality equation in a class of Markov decision processes with finite state space

Mathematical Methods of Operations Research ◽

10.1007/s001860200256 ◽

2003 ◽

Vol 57 (2) ◽

pp. 263-285 ◽

Cited By ~ 10

Author(s):

Rolando Cavazos-Cadena

Keyword(s):

Markov Decision Processes ◽

Average Cost ◽

Decision Processes ◽

Optimality Equation ◽

Risk Sensitive ◽

Finite State ◽

Markov Decision ◽

Average Cost Optimality Equation ◽

Cost Optimality ◽

Finite State Space


Approximation of average cost optimal policies for general Markov decision processes with unbounded costs

Mathematical Methods of Operations Research ◽

10.1007/bf01193864 ◽

1997 ◽

Vol 45 (2) ◽

pp. 245-263

Author(s):

Evgueni Gordienko ◽

Raúl Montes-De-Oca ◽

Adolfo Minjárez-Sosa

Keyword(s):

Markov Decision Processes ◽

Average Cost ◽

Decision Processes ◽

Optimal Policies ◽

Markov Decision


Optimality Inequalities for Average Cost Markov Decision Processes and the Stochastic Cash Balance Problem

Mathematics of Operations Research ◽

10.1287/moor.1070.0269 ◽

2007 ◽

Vol 32 (4) ◽

pp. 769-783 ◽

Cited By ~ 35

Author(s):

Eugene A. Feinberg ◽

Mark E. Lewis

Keyword(s):

Markov Decision Processes ◽

Average Cost ◽

Decision Processes ◽

Cash Balance ◽

Balance Problem ◽

Markov Decision


Average Cost Semi-Markov Decision Processes and the Control of Queueing Systems

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964800001121 ◽

1989 ◽

Vol 3 (2) ◽

pp. 247-272 ◽

Cited By ~ 47

Author(s):

Linn I. Sennott

Keyword(s):

Markov Decision Processes ◽

Average Cost ◽

Queueing Systems ◽

Decision Processes ◽

Single Server ◽

Stationary Policy ◽

Markov Decision ◽

Optimal Stationary Policy ◽

Poisson Arrivals ◽

Action Spaces

Semi-Markov decision processes underlie the control of many queueing systems. In this paper, we deal with infinite state semi-Markov decision processes with nonnegative, unbounded costs and finite action sets. Axioms for the existence of an expected average cost optimal stationary policy are presented. These conditions generalize the work in Sennott [22] for Markov decision processes. Verifiable conditions for the axioms to hold are obtained. The theory is applied to control of the M/G/1 queue with variable service parameter, with on-off server, and with batch processing, and to control of the G/M/m queue with variable arrival parameter and customer rejection. It is applied to a timesharing network of queues with a single server and finally to optimal routing of Poisson arrivals to parallel exponential servers. The final section extends the existence result to compact action spaces.
