A mathematical programming approach to a problem in variance penalised Markov decision processes

OR Spectrum, 1994, Vol. 15 (4), pp. 225–230
Author(s): D. J. White

2013, Vol. 45 (3), pp. 837–859
Author(s): François Dufour, A. B. Piunovskiy

In this work, we study discrete-time Markov decision processes (MDPs) with constraints, where all the objectives take the form of expected total cost over the infinite time horizon. We analyze this problem using the linear programming approach. Under some technical hypotheses, it is shown that if there exists an optimal solution for the associated linear program, then there exists a randomized stationary policy which is optimal for the MDP, and that the optimal value of the linear program coincides with the optimal value of the constrained control problem. A second important result states that the set of randomized stationary policies is a sufficient class for solving this MDP. It is important to note that, in contrast with the classical results in the literature, we do not assume the MDP to be transient or absorbing. More importantly, we do not require the cost functions to be nonnegative or bounded below. Several examples are presented to illustrate our results.
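The occupation-measure linear program behind this abstract can be sketched on a toy finite MDP. All data below is hypothetical, and a discounted total-cost variant is used so the toy program is bounded; the paper's actual setting is more general (undiscounted total cost, possibly unbounded-below costs, no transience assumption). The sketch shows the two results from the abstract in miniature: the LP optimum equals the constrained control optimum, and a randomized stationary policy is recovered by normalizing the optimal occupation measure.

```python
import numpy as np
from scipy.optimize import linprog

# Toy constrained MDP (hypothetical data): 2 states, 2 actions.
gamma = 0.9                               # discount, for a bounded toy LP
S, A = 2, 2
P = np.array([[[0.8, 0.2], [0.3, 0.7]],   # P[s, a, s'] transition kernel
              [[0.5, 0.5], [0.1, 0.9]]])
c = np.array([[1.0, 2.0], [0.5, 1.5]])    # main cost c(s, a)
d = np.array([[0.0, 1.0], [1.0, 0.0]])    # constraint cost d(s, a)
budget = 3.0                              # bound on expected total d-cost
mu0 = np.array([1.0, 0.0])                # initial distribution

# Decision variables: occupation measure x(s, a) >= 0, flattened to length S*A.
n = S * A
# Flow constraints: sum_a x(s',a) - gamma * sum_{s,a} P[s,a,s'] x(s,a) = mu0(s')
A_eq = np.zeros((S, n))
for sp in range(S):
    for s in range(S):
        for a in range(A):
            A_eq[sp, s * A + a] = (1.0 if s == sp else 0.0) - gamma * P[s, a, sp]
b_eq = mu0
# Constraint objective as an LP inequality: sum_{s,a} d(s,a) x(s,a) <= budget
A_ub = d.reshape(1, n)
b_ub = [budget]

res = linprog(c.reshape(n), A_ub=A_ub, b_ub=b_ub,
              A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
x = res.x.reshape(S, A)
# Randomized stationary policy: pi(a|s) proportional to x(s, a)
pi = x / x.sum(axis=1, keepdims=True)
```

The flow (balance) equations force `x` to be a valid discounted occupation measure for some policy, so minimizing `c·x` over this polytope is exactly the constrained control problem in this finite setting.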


2011, Vol. 52 (7), pp. 1000–1017
Author(s): Karina Valdivia Delgado, Leliane Nunes de Barros, Fabio Gagliardi Cozman, Scott Sanner

Author(s): Huizhen Yu

We consider the linear programming approach for constrained and unconstrained Markov decision processes (MDPs) under the long-run average-cost criterion, where the MDPs in our study have Borel state spaces and countable action spaces. Under a strict unboundedness condition on the one-stage costs and a recently introduced majorization condition on the state transition stochastic kernel, we study infinite-dimensional linear programs for the average-cost MDPs and prove the absence of a duality gap, among other optimality results. Our results do not require a lower-semicontinuous MDP model; thus, they can be applied to countable-action-space MDPs where the dynamics and one-stage costs are discontinuous in the state variable. Our proofs make use of the continuity property of Borel measurable functions asserted by Lusin's theorem.
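A finite-state analogue of the infinite-dimensional average-cost linear program described above can be written down directly (all data here is hypothetical; the paper's setting is Borel state spaces with unbounded costs, which this toy cannot capture). One minimizes the one-stage cost averaged over a stationary state-action distribution, subject to invariance and normalization constraints:

```python
import numpy as np
from scipy.optimize import linprog

# Finite-state sketch of the average-cost LP (hypothetical data):
# minimize sum_{s,a} c(s,a) x(s,a) over stationary state-action
# distributions x, i.e. x >= 0, invariant under the dynamics, mass 1.
S, A = 3, 2
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a, :] transition law
c = rng.uniform(0.0, 5.0, size=(S, A))       # one-stage cost c(s, a)

n = S * A
A_eq = np.zeros((S + 1, n))
# Invariance: sum_a x(s', a) = sum_{s,a} P[s, a, s'] x(s, a) for every s'
for sp in range(S):
    for s in range(S):
        for a in range(A):
            A_eq[sp, s * A + a] = (1.0 if s == sp else 0.0) - P[s, a, sp]
A_eq[S, :] = 1.0                              # normalization: total mass 1
b_eq = np.zeros(S + 1)
b_eq[S] = 1.0

res = linprog(c.reshape(n), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
x = res.x.reshape(S, A)
avg_cost = (c * x).sum()                      # optimal long-run average cost
```

In the paper's Borel setting the same program runs over measures rather than vectors, and the duality-gap result concerns this infinite-dimensional pair; the finite sketch only illustrates the shape of the primal.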


2017, Vol. 60, pp. 263–285
Author(s): Nikolaos Kariotoglou, Maryam Kamgarpour, Tyler H. Summers, John Lygeros

One of the most fundamental problems in Markov decision processes is analysis and control synthesis for safety and reachability specifications. We consider the stochastic reach-avoid problem, in which the objective is to synthesize a control policy that maximizes the probability of reaching a target set at a given time while staying in a safe set at all prior times. We characterize the solution to this problem through an infinite-dimensional linear program. We then develop a tractable approximation to this program through finite-dimensional approximations of the decision space and constraints. For a large class of Markov decision processes modeled by Gaussian mixture kernels, we show that a proper selection of the finite-dimensional space further reduces the computational complexity of the resulting linear program. We validate the proposed method and analyze its potential with numerical case studies.
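The value function that the infinite-dimensional linear program characterizes satisfies a reach-avoid dynamic-programming recursion, which is easy to illustrate on a finite state space (all data below is hypothetical; the paper works with continuous spaces and basis-function approximations of this recursion, not a finite grid):

```python
import numpy as np

# Finite-state sketch of the reach-avoid recursion (hypothetical data):
# maximize the probability of hitting the target set K' by time N while
# remaining in the safe set K at all prior times.
S, A, N = 5, 2, 10
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a, :] transition kernel
target = np.zeros(S)                         # indicator of K' (target set)
target[4] = 1.0
safe = np.ones(S)                            # indicator of K (safe set)
safe[0] = 0.0                                # state 0 is unsafe

V = target.copy()                            # terminal condition: V_N = 1_{K'}
policy = np.zeros((N, S), dtype=int)
for k in range(N - 1, -1, -1):
    Q = P @ V                                # Q[s, a] = E[V_{k+1}(s') | s, a]
    policy[k] = Q.argmax(axis=1)
    # reached K': value 1; in K \ K': continue; outside K: value 0
    V = target + (1.0 - target) * safe * Q.max(axis=1)

reach_avoid_prob = V                         # V[s] = max reach-avoid prob from s
```

The paper's LP formulation replaces this backward recursion with linear inequality constraints on candidate value functions, which is what makes the finite-dimensional (e.g. Gaussian-mixture-based) approximation tractable.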



