Nonuniqueness versus Uniqueness of Optimal Policies in Convex Discounted Markov Decision Processes

Journal of Applied Mathematics ◽

10.1155/2013/271279 ◽

2013 ◽

Vol 2013 ◽

pp. 1-5

Author(s):

Raúl Montes-de-Oca ◽

Enrique Lemus-Rodríguez ◽

Francisco Sergio Salem-Silva

Keyword(s):

Markov Decision Processes ◽

Optimization Problems ◽

Decision Processes ◽

Point Of View ◽

Value Functions ◽

Optimal Policies ◽

Markov Decision ◽

The Stability ◽

The Cost ◽

Classical Point

From the classical point of view, it is important to determine if in a Markov decision process (MDP), besides their existence, the uniqueness of the optimal policies is guaranteed. It is well known that uniqueness does not always hold in optimization problems (for instance, in linear programming). On the other hand, in such problems it is possible for a slight perturbation of the functional cost to restore the uniqueness. In this paper, it is proved that the value functions of an MDP and its cost perturbed version stay close, under adequate conditions, which in some sense is a priority. We are interested in the stability of Markov decision processes with respect to the perturbations of the cost-as-you-go function.

Download Full-text

Stochastic Comparative Statics in Markov Decision Processes

Mathematics of Operations Research ◽

10.1287/moor.2020.1086 ◽

2021 ◽

Author(s):

Bar Light

Keyword(s):

Markov Decision Processes ◽

Dynamic Pricing ◽

Optimization Problems ◽

Transition Probability ◽

Random Variable ◽

Comparative Statics ◽

Decision Processes ◽

Optimal Decision ◽

Initial State ◽

Markov Decision

In multiperiod stochastic optimization problems, the future optimal decision is a random variable whose distribution depends on the parameters of the optimization problem. I analyze how the expected value of this random variable changes as a function of the dynamic optimization parameters in the context of Markov decision processes. I call this analysis stochastic comparative statics. I derive both comparative statics results and stochastic comparative statics results showing how the current and future optimal decisions change in response to changes in the single-period payoff function, the discount factor, the initial state of the system, and the transition probability function. I apply my results to various models from the economics and operations research literature, including investment theory, dynamic pricing models, controlled random walks, and comparisons of stationary distributions.

Download Full-text

Overview of Decision Support Systems Applied to Construction

Decision Support for Construction Cost Control in Developing Countries ◽

10.4018/978-1-4666-9873-4.ch004 ◽

2016 ◽

pp. 77-94

Keyword(s):

Decision Support ◽

Decision Support Systems ◽

Markov Decision Processes ◽

Support Systems ◽

Decision Processes ◽

Rule Based ◽

Markov Decision ◽

Knowledge Intensive ◽

The Cost ◽

Rule Based Systems

The domain of construction is a very knowledge-intensive domain with so many factors involved. This implies undertaking any action requires an understanding of the different factors and how best to combine them to achieve a favourable and optimal outcome. Thus decision-making has been extensively used in the domain of construction. The aim of this chapter is to undertake a review of various decision support systems and to provide insights into their applications in the domain of construction. Specifically, the principle of cost index, sub-work chaining diagram method, linear regression and cost over-runs in time-overrun context (CCOTOV) model and Markov decision processes (MDP), ontology and rule-based systems have been reviewed. Based on the review the Markov decision processes (MDP), ontology and rule-based systems were chosen as the more suitable for the cost control case considered in this study.

Download Full-text

Convergence of Value Functions for Finite Horizon Markov Decision Processes with Constraints

Applied Mathematics & Optimization ◽

10.1007/s00245-020-09707-x ◽

2020 ◽

Author(s):

Naoyuki Ichihara

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Finite Horizon ◽

Value Functions ◽

Markov Decision

Download Full-text

Average optimal policies in Markov decision drift processes with applications to a queueing and a replacement model

Advances in Applied Probability ◽

10.2307/1426437 ◽

1983 ◽

Vol 15 (2) ◽

pp. 274-303 ◽

Cited By ~ 28

Author(s):

Arie Hordijk ◽

Frank A. Van Der Duyn Schouten

Keyword(s):

Markov Decision Processes ◽

Optimal Policy ◽

Continuous Time ◽

Sufficient Conditions ◽

Decision Processes ◽

Time Parameter ◽

Queueing Model ◽

Replacement Model ◽

Optimal Policies ◽

Markov Decision

Recently the authors introduced the concept of Markov decision drift processes. A Markov decision drift process can be seen as a straightforward generalization of a Markov decision process with continuous time parameter. In this paper we investigate the existence of stationary average optimal policies for Markov decision drift processes. Using a well-known Abelian theorem we derive sufficient conditions, which guarantee that a ‘limit point' of a sequence of discounted optimal policies with the discounting factor approaching 1 is an average optimal policy. An alternative set of sufficient conditions is obtained for the case in which the discounted optimal policies generate regenerative stochastic processes. The latter set of conditions is easier to verify in several applications. The results of this paper are also applicable to Markov decision processes with discrete or continuous time parameter and to semi-Markov decision processes. In this sense they generalize some well-known results for Markov decision processes with finite or compact action space. Applications to an M/M/1 queueing model and a maintenance replacement model are given. It is shown that under certain conditions on the model parameters the average optimal policy for the M/M/1 queueing model is monotone non-decreasing (as a function of the number of waiting customers) with respect to the service intensity and monotone non-increasing with respect to the arrival intensity. For the maintenance replacement model we prove the average optimality of a bang-bang type policy. Special attention is paid to the computation of the optimal control parameters.

Download Full-text

Conditions for the uniqueness of optimal policies of discounted Markov decision processes

Mathematical Methods of Operations Research ◽

10.1007/s001860400372 ◽

2004 ◽

Vol 60 (3) ◽

pp. 415-436 ◽

Cited By ~ 12

Author(s):

Daniel Cruz-Su�rez ◽

Ra�l Montes-de-Oca ◽

Francisco Salem-Silva

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Optimal Policies ◽

Markov Decision

Download Full-text

Approximation of average cost optimal policies for general Markov decision processes with unbounded costs

Mathematical Methods of Operations Research ◽

10.1007/bf01193864 ◽

1997 ◽

Vol 45 (2) ◽

pp. 245-263

Author(s):

Evgueni Gordienko ◽

Ra�l Montes-De-Oca ◽

Adolfo Minj�rez-Sosa

Keyword(s):

Markov Decision Processes ◽

Average Cost ◽

Decision Processes ◽

Optimal Policies ◽

Markov Decision

Download Full-text

The Determination of Approximately Optimal Policies in Markov Decision Processes by the Use of Bounds

Journal of the Operational Research Society ◽

10.2307/2581490 ◽

1982 ◽

Vol 33 (3) ◽

pp. 253

Author(s):

D. J. White

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Optimal Policies ◽

Markov Decision

Download Full-text

Cost rate heuristics for semi-Markov decision processes

Journal of Applied Probability ◽

10.1017/s002190020004345x ◽

1992 ◽

Vol 29 (03) ◽

pp. 633-644

Author(s):

K. D. Glazebrook ◽

Michael P. Bailey ◽

Lyn R. Whitaker

Keyword(s):

Dynamic Programming ◽

Markov Decision Processes ◽

Preventive Maintenance ◽

Decision Processes ◽

Cost Rate ◽

Backwards Induction ◽

Optimal Policies ◽

Markov Decision ◽

Speed Of Evolution

In response to the computational complexity of the dynamic programming/backwards induction approach to the development of optimal policies for semi-Markov decision processes, we propose a class of heuristics resulting from an inductive process which proceeds forwards in time. These heuristics always choose actions in such a way as to minimize some measure of the current cost rate. We describe a procedure for calculating such cost rate heuristics. The quality of the performance of such policies is related to the speed of evolution (in a cost sense) of the process. A simple model of preventive maintenance is described in detail. Cost rate heuristics for this problem are calculated and assessed computationally.

Download Full-text

Determining the Best Trade-Off Between Expected Economic Return and the Risk of Undesirable Events When Managing a Randomly Varying Population

Journal of the Fisheries Research Board of Canada ◽

10.1139/f79-131 ◽

1979 ◽

Vol 36 (8) ◽

pp. 939-947 ◽

Cited By ~ 10

Author(s):

Roy Mendelssohn

Keyword(s):

Markov Decision Processes ◽

Population Based ◽

Decision Processes ◽

Expected Return ◽

Trade Off ◽

Long Run ◽

Optimal Policies ◽

Markov Decision ◽

Long Run Risk ◽

Varying Population

Conditions are given that imply there exist policies that "minimize risk" of undesirable events for stochastic harvesting models. It is shown that for many problems, either such a policy will not exist, or else it is an "extreme" policy that is equally undesirable. Techniques are given to systematically trade-off decreases in the long-run expected return with decreases in the long-run risk. Several numerical examples are given for models of salmon runs, when both population-based risks and harvest-based risks are considered. Key words: Markov decision processes, risk, salmon management, Pareto optimal policies, trade-off curves, linear programing

Download Full-text

Weighted difference approximation of value functions for slow-discounting Markov Decision Processes

53rd IEEE Conference on Decision and Control ◽

10.1109/cdc.2014.7039526 ◽

2014 ◽

Author(s):

Yin-Lam Chow ◽

Junjie Qin

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Difference Approximation ◽

Value Functions ◽

Markov Decision

Download Full-text