On a reduction principle in dynamic programming

1988 ◽  
Vol 20 (4) ◽  
pp. 836-851 ◽  
Author(s):  
K. D. Glazebrook

Whittle enunciated an important reduction principle in dynamic programming when he showed that under certain conditions optimal strategies for Markov decision processes (MDPs) placed in parallel to one another take actions in a way which is consistent with the optimal strategies for the individual MDPs. However, the necessary and sufficient conditions given by Whittle are by no means always satisfied. We explore the status of this computationally attractive reduction principle when these conditions fail.
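A minimal finite-state sketch of the benign case (all numbers invented, not from the paper): when the parallel MDPs are fully decoupled, with separate action components, additive rewards, and product transitions, the joint optimal value is the sum of the component optima, so the joint optimal policy acts per component. Whittle's conditions address the harder settings in which this decoupling is not automatic.

```python
import itertools
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-10):
    """Optimal discounted values for a finite MDP.
    P[a] is the transition matrix under action a, R[a] the reward vector."""
    V = np.zeros(P[0].shape[0])
    while True:
        V_new = np.max([R[a] + gamma * P[a] @ V for a in range(len(P))], axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

rng = np.random.default_rng(0)
def random_mdp(n_states, n_actions):
    P = [rng.dirichlet(np.ones(n_states), size=n_states) for _ in range(n_actions)]
    R = [rng.random(n_states) for _ in range(n_actions)]
    return P, R

P1, R1 = random_mdp(3, 2)
P2, R2 = random_mdp(2, 2)

# Product MDP: states and actions are pairs, rewards add, and the two
# components evolve independently of one another.
states = list(itertools.product(range(3), range(2)))
Pj, Rj = [], []
for a1, a2 in itertools.product(range(2), range(2)):
    Pa = np.zeros((len(states), len(states)))
    Ra = np.zeros(len(states))
    for i, (s1, s2) in enumerate(states):
        Ra[i] = R1[a1][s1] + R2[a2][s2]
        for j, (t1, t2) in enumerate(states):
            Pa[i, j] = P1[a1][s1, t1] * P2[a2][s2, t2]
    Pj.append(Pa)
    Rj.append(Ra)

V1, V2, Vj = value_iteration(P1, R1), value_iteration(P2, R2), value_iteration(Pj, Rj)
# Joint optimum equals the sum of component optima in this decoupled case.
assert all(abs(Vj[i] - (V1[s1] + V2[s2])) < 1e-6 for i, (s1, s2) in enumerate(states))
```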


1979 ◽  
Vol 16 (3) ◽  
pp. 618-630 ◽  
Author(s):  
Bharat T. Doshi

Various authors have derived the necessary and sufficient conditions for optimality in semi-Markov decision processes in which the state remains constant between jumps. In this paper similar results are presented for a generalized semi-Markov decision process in which the state varies between jumps according to a Markov process with continuous sample paths. These results are specialized to a general storage model and an application to the service rate control in a GI/G/1 queue is indicated.
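For context, a small Python sketch of the classical setting the paper generalizes: a semi-Markov decision process whose state stays constant between jumps, with exponential sojourn times so that the expected discount per jump is λ/(λ+α). All numbers are illustrative.

```python
import numpy as np

# Discounted optimality equation for a classical SMDP (state constant
# between jumps): V(s) = max_a [ r(s,a) + E[e^(-alpha*tau)] * sum_s' P(s,a,s') V(s') ].
# With exponential sojourn times, E[e^(-alpha*tau)] = lam / (lam + alpha).
alpha = 0.1                                  # continuous-time discount rate
r = np.array([[1.0, 0.5], [0.2, 0.8]])       # r[s, a]: lump reward per jump
lam = np.array([[1.0, 2.0], [0.5, 1.5]])     # lam[s, a]: sojourn-time rates
P = np.array([[[0.7, 0.3], [0.4, 0.6]],      # P[s, a, s']: jump transitions
              [[0.5, 0.5], [0.9, 0.1]]])

disc = lam / (lam + alpha)                   # effective per-jump discount, < 1
V = np.zeros(2)
for _ in range(500):
    V = (r + disc * np.einsum('saj,j->sa', P, V)).max(axis=1)
# V is now (numerically) the fixed point of the SMDP optimality equation.
```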


2006 ◽  
Vol 2006 ◽  
pp. 1-8 ◽  
Author(s):  
Quanxin Zhu ◽  
Xianping Guo

This paper deals with discrete-time Markov decision processes with Borel state and action spaces. The criterion to be minimized is the average expected cost, and the costs may have neither upper nor lower bounds. In our former paper (to appear in Journal of Applied Probability), weaker conditions were proposed to ensure the existence of average optimal stationary policies. In this paper, we further study some properties of optimal policies. Under these weaker conditions, we not only obtain two necessary and sufficient conditions for optimal policies, but also give a “semimartingale characterization” of an average optimal stationary policy.
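A crude finite-state analogue of the average-cost criterion in Python (the paper's Borel-space, unbounded-cost setting is far more general; all data invented): relative value iteration finds the optimal average cost g and a relative value function h solving h + g = min_a [C + P h].

```python
import numpy as np

C = np.array([[2.0, 0.5], [1.0, 3.0]])        # C[s, a]: one-step costs
P = np.array([[[0.9, 0.1], [0.2, 0.8]],       # P[s, a, s']: transitions
              [[0.6, 0.4], [0.3, 0.7]]])

h = np.zeros(2)                                # relative value function
for _ in range(2000):
    Q = C + np.einsum('saj,j->sa', P, h)
    h_new = Q.min(axis=1)
    g = h_new[0]                               # average-cost estimate
    h = h_new - g                              # renormalize so iterates stay bounded
policy = Q.argmin(axis=1)                      # candidate average-optimal stationary policy
```

At convergence h and g satisfy the average-cost optimality equation, and the argmin yields a stationary policy, in line with the kind of stationary average-optimal policies the paper studies.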


1983 ◽  
Vol 20 (4) ◽  
pp. 835-842 ◽  
Author(s):  
David Assaf

The paper presents sufficient conditions for certain functions to be convex. Functions of this type often appear in Markov decision processes, where their maximum is the solution of the problem. Since a convex function takes its maximum at an extreme point, the conditions may greatly simplify a problem. In some cases a full solution may be obtained after the reduction is made. Some illustrative examples are discussed.
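The reduction the paper exploits can be shown in a few lines of Python (toy function and polytope, not from the paper): a convex function on a compact convex set attains its maximum at an extreme point, so maximizing over the unit cube reduces to checking its 8 vertices.

```python
import itertools
import numpy as np

# Convex quadratic on the unit cube [0, 1]^3.
f = lambda x: x @ x + np.array([1.0, -2.0, 0.5]) @ x

# Checking the 8 vertices suffices for the maximum.
vertices = [np.array(v, dtype=float) for v in itertools.product([0, 1], repeat=3)]
best = max(vertices, key=f)

# No sampled interior point does better than the best vertex.
rng = np.random.default_rng(1)
samples = rng.random((10_000, 3))
assert all(f(x) <= f(best) for x in samples)
```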


2015 ◽  
Vol 47 (1) ◽  
pp. 106-127 ◽  
Author(s):  
François Dufour ◽  
Alexei B. Piunovskiy

In this paper our objective is to study continuous-time Markov decision processes on a general Borel state space with both impulsive and continuous controls for the infinite-horizon discounted cost. The continuous-time controlled process is shown to be nonexplosive under appropriate hypotheses. The so-called Bellman equation associated with this control problem is studied. Sufficient conditions ensuring the existence and uniqueness of a bounded measurable solution to this optimality equation are provided. Moreover, it is shown that the value function of the optimization problem under consideration satisfies this optimality equation. Sufficient conditions are also presented to ensure, on the one hand, the existence of an optimal control strategy and, on the other hand, the existence of an ε-optimal control strategy. The state space is decomposed into two disjoint subsets on which, roughly speaking, one should apply a gradual action or an impulsive action, respectively, to obtain an optimal or ε-optimal strategy. An interesting consequence of our previous results is as follows: the set of strategies that allow interventions at time t = 0 and only immediately after natural jumps is a sufficient set for the control problem under consideration.
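A finite, discrete-time toy analogue of the gradual/impulse decomposition in Python (all data invented for illustration; the paper works on Borel spaces in continuous time): iterate the quasi-variational inequality V = max(gradual Bellman operator, impulse operator), then read off the states where intervening beats acting gradually.

```python
import numpy as np

gamma = 0.9
R = np.array([[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]])   # R[s, a]: gradual rewards
P = np.array([[[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]],
              [[0.3, 0.6, 0.1], [0.1, 0.1, 0.8]],
              [[0.4, 0.3, 0.3], [0.2, 0.5, 0.3]]])   # P[s, a, s']
K = 0.7                                              # fixed impulse cost

# Quasi-variational inequality: V = max( max_a [R + gamma P V],  max_j V(j) - K ).
V = np.zeros(3)
for _ in range(1000):
    gradual = (R + gamma * np.einsum('saj,j->sa', P, V)).max(axis=1)
    impulse = V.max() - K        # jump instantly to the best state, paying K
    V = np.maximum(gradual, impulse)

# Intervention region: states where the impulse strictly beats gradual control.
gradual = (R + gamma * np.einsum('saj,j->sa', P, V)).max(axis=1)
intervene = np.where(V.max() - K > gradual + 1e-9)[0]
```

With these numbers the state space splits into an intervention region and a continuation region, mirroring the decomposition exhibited in the paper.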


2017 ◽  
Vol 29 (1) ◽  
pp. 3-16 ◽  
Author(s):  
Seidali Kurtmollaiev

Despite its immense popularity, the dynamic capabilities framework faces fierce criticism because of the ambiguous and contradictory interpretations of dynamic capabilities. Especially challenging are the aspects related to the nature of dynamic capabilities and the issue of agency. In an attempt to avoid circular and overlapping definitions, I explicate dynamic capabilities as the regular actions of creating, extending, and modifying an organizational resource base. This implies that the individual’s intention to change the status quo in the organization and the individual’s high level of influence in the organization are necessary and sufficient conditions for dynamic capabilities. This approach overcomes challenges associated with current interpretations of dynamic capabilities, necessarily focusing on the actions and interactions of individuals in organizations. Following the micro-foundations movement, I present a multilevel approach for studying the individual-level causes and the firm-level effects of dynamic capabilities.


1983 ◽  
Vol 15 (2) ◽  
pp. 274-303 ◽  
Author(s):  
Arie Hordijk ◽  
Frank A. Van Der Duyn Schouten

Recently the authors introduced the concept of Markov decision drift processes. A Markov decision drift process can be seen as a straightforward generalization of a Markov decision process with continuous time parameter. In this paper we investigate the existence of stationary average optimal policies for Markov decision drift processes. Using a well-known Abelian theorem we derive sufficient conditions, which guarantee that a ‘limit point’ of a sequence of discounted optimal policies with the discounting factor approaching 1 is an average optimal policy. An alternative set of sufficient conditions is obtained for the case in which the discounted optimal policies generate regenerative stochastic processes. The latter set of conditions is easier to verify in several applications. The results of this paper are also applicable to Markov decision processes with discrete or continuous time parameter and to semi-Markov decision processes. In this sense they generalize some well-known results for Markov decision processes with finite or compact action space. Applications to an M/M/1 queueing model and a maintenance replacement model are given. It is shown that under certain conditions on the model parameters the average optimal policy for the M/M/1 queueing model is monotone non-decreasing (as a function of the number of waiting customers) with respect to the service intensity and monotone non-increasing with respect to the arrival intensity. For the maintenance replacement model we prove the average optimality of a bang-bang type policy. Special attention is paid to the computation of the optimal control parameters.
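The vanishing-discount idea behind the Abelian-theorem argument can be sketched in Python on a crude uniformized M/M/1-like service-rate control (all rates, costs, and the truncation are invented): compute discounted-optimal policies for discount factors approaching 1 and take a limit point as the candidate average-optimal policy.

```python
import numpy as np

N = 10                        # queue capacity (truncation)
arrive = 0.4                  # uniformized arrival probability
serve = [0.2, 0.6]            # service intensities for the two actions
serve_cost = [0.0, 0.5]       # extra cost rate of the faster server

def transition(s, a):
    """One-step distribution of the uniformized queue-length chain."""
    p = np.zeros(N + 1)
    up = arrive if s < N else 0.0
    down = serve[a] if s > 0 else 0.0
    p[min(s + 1, N)] += up
    p[max(s - 1, 0)] += down
    p[s] += 1.0 - up - down
    return p

P = np.array([[transition(s, a) for a in range(2)] for s in range(N + 1)])
C = np.array([[s + serve_cost[a] for a in range(2)]   # holding + service cost
              for s in range(N + 1)])

def discounted_policy(beta, iters=5000):
    V = np.zeros(N + 1)
    for _ in range(iters):
        Q = C + beta * np.einsum('saj,j->sa', P, V)
        V = Q.min(axis=1)
    return Q.argmin(axis=1)

# Discounted-optimal policies for beta -> 1; a limit point of this sequence
# is the candidate average-optimal policy, here monotone in queue length.
policies = [discounted_policy(b) for b in (0.9, 0.99, 0.999)]
```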

