monotone policies Latest Research Papers

Computing monotone policies for Markov decision processes: a nearly-isotonic penalty approach

This work was partially supported by the Swedish Research Council under contract 2016-06079 and the Linnaeus Center ACCESS at KTH.

IFAC-PapersOnLine ◽

10.1016/j.ifacol.2017.08.1575 ◽

2017 ◽

Vol 50 (1) ◽

pp. 8429-8434

Author(s):

Robert Mattila ◽

Cristian R. Rojas ◽

Vikram Krishnamurthy ◽

Bo Wahlberg

Keyword(s):

Markov Decision Processes ◽

Research Council ◽

Decision Processes ◽

Penalty Approach ◽

Swedish Research Council ◽

Markov Decision ◽

Monotone Policies

Optimally Replacing Multiple Systems in a Shared Environment

Probability in the Engineering and Informational Sciences ◽

10.1017/s026996481700016x ◽

2017 ◽

Vol 32 (2) ◽

pp. 179-206

Author(s):

David T. Abdul-Malak ◽

Jeffrey P. Kharoufeh

Keyword(s):

Computational Study ◽

Infinite Horizon ◽

Approximate Model ◽

Shared Environment ◽

Multiple Systems ◽

Current State ◽

Monotone Policies ◽

The Cost ◽

State Of The Environment ◽

Monotonicity Results

We consider the problem of optimally replacing multiple stochastically degrading systems using condition-based maintenance. Each system degrades continuously at a rate that is governed by the current state of the environment, and each fails once its own cumulative degradation threshold is reached. The objective is to minimize the sum of the expected total discounted setup, preventive replacement, reactive replacement, and downtime costs over an infinite horizon. For each environment state, we prove that the cost function is monotone nondecreasing in the cumulative degradation level. Additionally, under mild conditions, these monotonicity results are extended to the entire state space. In the case of a single system, we establish that monotone policies are optimal. The monotonicity results help facilitate a tractable, approximate model with state- and action-space transformations and a basis-function approximation of the action-value function. Our computational study demonstrates that high-quality, near-optimal policies are attainable and significantly outperform heuristic policies.
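The single-system result in this abstract (a cost function that is nondecreasing in the degradation level, and an optimal policy that is monotone) can be illustrated on a much simpler toy model. The sketch below is not the authors' model: it assumes a discrete degradation level, no environment process, and hypothetical cost and transition parameters (`p_degrade`, `preventive_cost`, `reactive_cost`, `operating_cost_slope`) chosen only for illustration. It runs plain value iteration and reports the resulting policy.

```python
import numpy as np


def solve_replacement_mdp(n_levels=10, p_degrade=0.5, discount=0.95,
                          preventive_cost=5.0, reactive_cost=20.0,
                          operating_cost_slope=0.5, tol=1e-10,
                          max_iter=100_000):
    """Value iteration for a toy single-system replacement problem.

    States 0..n_levels are degradation levels; n_levels means "failed".
    Action 0 = keep operating, action 1 = replace (system restarts at 0).
    All parameter values are illustrative assumptions.
    """
    states = np.arange(n_levels + 1)
    op_cost = operating_cost_slope * states            # nondecreasing in state
    replace_cost = np.full(n_levels + 1, preventive_cost)
    replace_cost[-1] = reactive_cost                    # failed: reactive cost

    V = np.zeros(n_levels + 1)
    for _ in range(max_iter):
        # Value at the next-higher level (the top level is handled below).
        next_up = np.append(V[1:], V[-1])
        q_continue = op_cost + discount * (
            p_degrade * next_up + (1.0 - p_degrade) * V)
        q_continue[-1] = np.inf                         # a failed system must be replaced
        q_replace = replace_cost + discount * V[0]
        V_new = np.minimum(q_continue, q_replace)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    # Replace only when it is strictly cheaper than continuing.
    policy = (q_replace < q_continue).astype(int)
    return V, policy


V, policy = solve_replacement_mdp()
print("value function:", np.round(V, 3))
print("policy (0 = continue, 1 = replace):", policy)
```

In this toy setting the printed policy switches from 0 to 1 at a single degradation level and never switches back, which is what "monotone policy" means here. A monotone policy can be stored as a single threshold instead of a full state-to-action table, which is one reason such structural results help with tractable approximations.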

Computing monotone policies for Markov decision processes by exploiting sparsity

2013 Australian Control Conference ◽

10.1109/aucc.2013.6697239 ◽

2013 ◽

Author(s):

Vikram Krishnamurthy ◽

Cristian R. Rojas ◽

Bo Wahlberg

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Markov Decision ◽

Monotone Policies

Monotone Policies and Indexability for Bidirectional Restless Bandits

Advances in Applied Probability ◽

10.1239/aap/1363354103 ◽

2013 ◽

Vol 45 (1) ◽

pp. 51-85 ◽

Cited By ~ 4

Author(s):

K. D. Glazebrook ◽

D. J. Hodge ◽

C. Kirkbride

Keyword(s):

Lagrangian Relaxation ◽

Resource Constraints ◽

State Transitions ◽

Structural Requirement ◽

Cost Rate ◽

State Dependent ◽

Wide Range ◽

Decision Epoch ◽

Monotone Policies ◽

System Project

Motivated by a wide range of applications, we consider a development of Whittle's restless bandit model in which project activation requires a state-dependent amount of a key resource, which is assumed to be available at a constant rate. As many projects may be activated at each decision epoch as resource availability allows. We seek a policy for project activation within resource constraints which minimises an aggregate cost rate for the system. Project indices derived from a Lagrangian relaxation of the original problem exist provided the structural requirement of indexability is met. Verification of this property and derivation of the related indices is greatly simplified when the solution of the Lagrangian relaxation has a state monotone structure for each constituent project. We demonstrate that this is indeed the case for a wide range of bidirectional projects in which the project state tends to move in a different direction when it is activated from that in which it moves when passive. This is natural in many application domains in which activation of a project ameliorates its condition, which otherwise tends to deteriorate or deplete. In some cases the state monotonicity required is related to the structure of state transitions, while in others it is also related to the nature of costs. Two numerical studies demonstrate the value of the ideas for the construction of policies for dynamic resource allocation, most especially in contexts which involve a large number of projects.

Monotone Policies and Indexability for Bidirectional Restless Bandits

Advances in Applied Probability ◽

10.1017/s0001867800006194 ◽

2013 ◽

Vol 45 (01) ◽

pp. 51-85

Author(s):

K. D. Glazebrook ◽

D. J. Hodge ◽

C. Kirkbride

Keyword(s):

Lagrangian Relaxation ◽

Resource Constraints ◽

State Transitions ◽

Structural Requirement ◽

Cost Rate ◽

State Dependent ◽

Wide Range ◽

Decision Epoch ◽

Monotone Policies ◽

System Project

Q-Learning Algorithms for Constrained Markov Decision Processes With Randomized Monotone Policies: Application to MIMO Transmission Control

IEEE Transactions on Signal Processing ◽

10.1109/tsp.2007.893228 ◽

2007 ◽

Vol 55 (5) ◽

pp. 2170-2181 ◽

Cited By ~ 54

Author(s):

Dejan V. Djonin ◽

Vikram Krishnamurthy

Keyword(s):

Markov Decision Processes ◽

Learning Algorithms ◽

Decision Processes ◽

Transmission Control ◽

Q Learning ◽

Constrained Markov Decision Processes ◽

Markov Decision ◽

Monotone Policies ◽

MIMO Transmission

Optimality of Monotone Policies for Transmission Control with Switching Costs

2007 46th IEEE Conference on Decision and Control ◽

10.1109/cdc.2007.4434998 ◽

2007 ◽

Cited By ~ 1

Author(s):

Arsalan Farrokh ◽

Vikram Krishnamurthy

Keyword(s):

Switching Costs ◽

Transmission Control ◽

Monotone Policies
