Optimal stopping in a partially observable binary-valued Markov chain with costly perfect information

1982 ◽  
Vol 19 (1) ◽  
pp. 72-81 ◽  
Author(s):  
George E. Monahan

The problem of optimal stopping in a Markov chain when there is imperfect state information is formulated as a partially observable Markov decision process. Properties of the optimal value function are developed. It is shown that under mild conditions the optimal policy is well structured. An efficient algorithm, which uses the structural information in the computation of the optimal policy, is presented.
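The structured-policy result can be illustrated with a small belief-space value iteration over stop, continue, and costly-inspect actions; the chain, rewards, and inspection cost below are invented for the sketch and are not the paper's model or algorithm:

```python
import numpy as np

# Belief-space value iteration for optimal stopping of a two-state hidden
# Markov chain with a costly "inspect" action that reveals the state
# perfectly. All parameters are illustrative, not taken from the paper.
P = np.array([[0.9, 0.1],            # transition matrix of the hidden chain
              [0.2, 0.8]])
stop_reward = np.array([10.0, 0.0])  # reward for stopping in state 0 / 1
inspect_cost = 1.0
beta = 0.95                          # discount factor

grid = np.linspace(0.0, 1.0, 201)    # belief = P(hidden state = 0)

def propagate(p):
    """One-step belief update when no observation is made."""
    return p * P[0, 0] + (1 - p) * P[1, 0]

V = np.zeros_like(grid)
for _ in range(1000):
    V_at = lambda p: np.interp(p, grid, V)
    stop = grid * stop_reward[0] + (1 - grid) * stop_reward[1]
    cont = beta * V_at(propagate(grid))
    # Inspecting jumps the belief to 0 or 1, which then propagates one step.
    inspect = -inspect_cost + beta * (
        grid * V_at(propagate(1.0)) + (1 - grid) * V_at(propagate(0.0)))
    V_new = np.maximum.reduce([stop, cont, inspect])
    if np.max(np.abs(V_new - V)) < 1e-9:
        V = V_new
        break
    V = V_new

# With these monotone rewards the stopping set is an upper interval of
# beliefs, i.e. the policy has a control-limit (threshold) structure.
stop_opt = stop >= V - 1e-9
```

In this toy instance the optimal value is monotone in the belief and stopping is optimal at high beliefs, mirroring the kind of structure the paper exploits computationally.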


2020 ◽  
Vol 40 (1) ◽  
pp. 117-137
Author(s):  
R. Israel Ortega-Gutiérrez ◽  
H. Cruz-Suárez

This paper addresses a class of sequential optimization problems known as Markov decision processes, considered on Euclidean state and action spaces with the total expected discounted cost as the objective function. The main goal of the paper is to provide conditions that guarantee an adequate Moreau-Yosida regularization of a Markov decision process (named the original process). In this way, a new Markov decision process is established that shares the Markov control model of the original process except for the cost function, which is induced via the Moreau-Yosida regularization. Compared to the original process, this new discounted Markov decision process has richer properties: differentiability and strict convexity of its optimal value function, uniqueness of the optimal policy, and, moreover, the optimal value function and the optimal policy of both processes coincide. An example is provided to complement the theory presented.
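As a concrete illustration of the regularization itself (not the paper's construction for MDPs), the Moreau-Yosida envelope of a cost can be computed by grid minimization; for c(x) = |x| the envelope is the Huber function, which is differentiable yet has the same minimum and minimizer:

```python
import numpy as np

# Moreau-Yosida regularization of a cost function c on a grid:
#   c_lam(x) = min_y [ c(y) + (1 / (2 * lam)) * (x - y)**2 ].
# Illustrative example with c(x) = |x|; the envelope is the Huber function:
# C^1-smooth, convex, with the same minimum value and minimizer as c.
lam = 0.5
ys = np.linspace(-3.0, 3.0, 2001)
c = np.abs(ys)

def moreau(x):
    """Grid approximation of the Moreau envelope at x."""
    return np.min(c + (ys - x) ** 2 / (2 * lam))

xs = np.linspace(-2.0, 2.0, 401)
env = np.array([moreau(x) for x in xs])

# Closed form of the Huber envelope, for comparison.
huber = np.where(np.abs(xs) <= lam,
                 xs ** 2 / (2 * lam),
                 np.abs(xs) - lam / 2)
```

The envelope never exceeds the original cost and smooths out its kink at the origin, which is the mechanism behind the differentiability gained by the regularized process.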


2006 ◽  
Vol 2006 ◽  
pp. 1-12
Author(s):  
E. G. Kyriakidis

This paper is concerned with the problem of controlling a truncated general immigration process, which represents a population of harmful individuals, by the introduction of a predator. If the parameters of the model satisfy some mild conditions, the existence of a control-limit policy that is average-cost optimal is proved. The proof is based on the uniformization technique and on the variation of a fictitious parameter over the entire real line. Furthermore, an efficient Markov decision algorithm is developed that generates a sequence of improving control-limit policies converging to the optimal policy.
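A minimal numerical sketch of the control-limit idea, with an invented discrete-time birth-death model of the harmful population (not the paper's truncated general immigration process or its algorithm): for each control limit k, the predator is introduced in states i ≥ k, the stationary distribution is computed, and long-run average costs are compared:

```python
import numpy as np

# Average cost of control-limit policies on states {0, ..., N}: introduce
# a predator whenever the population is at least k. Illustrative numbers.
N = 20
birth = 0.4            # immigration probability per step
death0 = 0.1           # death probability without the predator
death1 = 0.5           # death probability with the predator present
hold_cost = 1.0        # cost per harmful individual per step
predator_cost = 2.0    # cost per step the predator is deployed

def average_cost(k):
    """Long-run average cost of the control-limit-k policy."""
    P = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        d = death1 if i >= k else death0
        up = birth if i < N else 0.0
        down = d if i > 0 else 0.0
        P[i, min(i + 1, N)] += up
        P[i, max(i - 1, 0)] += down
        P[i, i] += 1.0 - up - down
    # Stationary distribution: solve pi P = pi with sum(pi) = 1.
    A = np.vstack([P.T - np.eye(N + 1), np.ones(N + 1)])
    b = np.append(np.zeros(N + 1), 1.0)
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    cost = np.array([hold_cost * i + (predator_cost if i >= k else 0.0)
                     for i in range(N + 1)])
    return pi @ cost

costs = [average_cost(k) for k in range(N + 2)]  # k = N + 1 means never act
best_k = int(np.argmin(costs))
```

Evaluating every control limit is of course brute force; the paper's Markov decision algorithm instead generates a sequence of improving control-limit policies converging to the optimum.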


2006 ◽  
Vol 43 (3) ◽  
pp. 603-621 ◽  
Author(s):  
Huw W. James ◽  
E. J. Collins

This paper is concerned with the analysis of Markov decision processes in which a natural form of termination ensures that the expected future costs are bounded, at least under some policies. Whereas most previous analyses have restricted attention to the case where the set of states is finite, this paper analyses the case where the set of states is not necessarily finite or even countable. It is shown that all the existence, uniqueness, and convergence results of the finite-state case hold when the set of states is a general Borel space, provided we make the additional assumption that the optimal value function is bounded below. We give a sufficient condition for the optimal value function to be bounded below which holds, in particular, if the set of states is countable.
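The countable-state setting can be illustrated with undiscounted value iteration on a small terminating MDP, where termination (an absorbing, cost-free state reached under every policy) keeps expected future costs bounded; the model below is invented for the sketch:

```python
import numpy as np

# Undiscounted value iteration for a terminating MDP: from state i in
# {1, ..., N} choose action 0 (cost 2, move to i-1 surely) or action 1
# (cost 1, move to i-1 w.p. 0.6, stay w.p. 0.4). State 0 is absorbing
# and cost-free, so all policies terminate and costs stay bounded.
N = 10
costs = np.array([2.0, 1.0])

V = np.zeros(N + 1)          # V[0] = 0 at the terminal state
for _ in range(10_000):
    q0 = costs[0] + V[:-1]                        # states 1..N, action 0
    q1 = costs[1] + 0.6 * V[:-1] + 0.4 * V[1:]    # states 1..N, action 1
    V_new = np.append(0.0, np.minimum(q0, q1))
    if np.max(np.abs(V_new - V)) < 1e-12:
        V = V_new
        break
    V = V_new

policy = (costs[1] + 0.6 * V[:-1] + 0.4 * V[1:]
          <= costs[0] + V[:-1]).astype(int)       # 1 where action 1 is optimal
```

Here the fixed point is V(i) = 5i/3 with the cheaper random action optimal everywhere, and iteration converges without any discounting, exactly the kind of behaviour the paper establishes in far greater generality.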


2021 ◽  
Author(s):  
Akram Khaleghei ◽  
Michael Jong Kim

In “Optimal Control of Partially Observable Semi-Markovian Failing Systems: An Analysis Using a Phase Methodology,” Khaleghei and Kim study a maintenance control problem formulated as a partially observable semi-Markov decision process (POSMDP), a problem class that is typically computationally intractable and not amenable to structural analysis. The authors develop a new approach based on a phase methodology, in which the intractable POSMDP is viewed as the limiting problem of a sequence of tractable POMDPs. They show that the optimal control policy can be represented as a control-limit policy that monitors the estimated conditional reliability at each decision epoch and, by exploiting this structure, develop an efficient computational approach for the optimal control limit and the corresponding optimal value.
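A toy version of a control-limit rule on conditional reliability (the phase structure and numbers are invented, and this sketch skips the POSMDP machinery entirely): update a belief over hidden degradation phases given continued survival, and intervene once the one-step conditional reliability falls below the limit:

```python
import numpy as np

# Hidden degradation phases (healthy -> worn -> failed). The phase is not
# observed; only survival is. Intervene (maintain) when the estimated
# one-step conditional reliability drops below the control limit.
P = np.array([[0.85, 0.12, 0.03],   # healthy -> healthy / worn / failed
              [0.00, 0.80, 0.20],   # worn    -> worn / failed
              [0.00, 0.00, 1.00]])  # failed is absorbing
control_limit = 0.90

belief = np.array([1.0, 0.0])       # over working phases {healthy, worn}
t = 0
while True:
    # One-step conditional reliability under the current phase belief.
    reliability = belief @ (1.0 - P[:2, 2])
    if reliability < control_limit:
        break                        # control limit reached: maintain now
    # Bayes update of the phase belief given survival of one more step.
    unnorm = belief @ P[:2, :2]
    belief = unnorm / unnorm.sum()
    t += 1
```

As survival accumulates, the belief drifts toward the worn phase, the estimated reliability decays, and the rule triggers at a data-driven epoch t rather than a fixed age.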


2016 ◽  
Vol 2016 ◽  
pp. 1-5 ◽  
Author(s):  
Yang Sun ◽  
Xiaohui Ai

This paper examines an optimal stopping problem for the stochastic (Wiener-Poisson) jump diffusion logistic population model, formulated as the maximization of an expected reward. By applying the smooth pasting technique (Dayanik and Karatzas, 2003; Dixit, 1993), we obtain an explicit solution: both the critical state of the optimal stopping region and the optimal value function are expressed in closed form.
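For intuition about smooth pasting (using a plain geometric Brownian motion with reward x − K rather than the paper's jump diffusion logistic model), the free boundary is pinned down by requiring the value and its derivative to match the reward at the boundary:

```python
import math

# Smooth pasting in a simple diffusion stopping problem: stop a geometric
# Brownian motion dX = mu*X dt + sigma*X dW to collect g(x) = x - K,
# discounted at rate r (r > mu). In the continuation region the value is
# V(x) = A * x**gamma, gamma the positive root of
#   0.5*sigma^2*g*(g - 1) + mu*g - r = 0.
# Value matching V(x*) = x* - K and smooth pasting V'(x*) = 1 give x*, A.
mu, sigma, r, K = 0.02, 0.3, 0.05, 1.0
a = 0.5 * sigma ** 2
b = mu - a
gamma = (-b + math.sqrt(b ** 2 + 4 * a * r)) / (2 * a)

x_star = K * gamma / (gamma - 1)          # critical state (free boundary)
A = (x_star - K) / x_star ** gamma        # continuation-region coefficient

# Residuals of the two pasting conditions (should vanish).
value_match = A * x_star ** gamma - (x_star - K)
slope_match = A * gamma * x_star ** (gamma - 1) - 1.0
```

The same two matching conditions drive the explicit construction in the jump-diffusion setting, with the quadratic above replaced by the integro-differential characteristic equation of the jump process.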


2020 ◽  
Vol 34 (10) ◽  
pp. 13845-13846
Author(s):  
Nishanth Kumar ◽  
Michael Fishman ◽  
Natasha Danas ◽  
Stefanie Tellex ◽  
Michael Littman ◽  
...  

We propose an abstraction method for open-world environments expressed as Factored Markov Decision Processes (FMDPs) with very large state and action spaces. Our method prunes state and action variables that are irrelevant to the optimal value function on the state subspace the agent would visit when following any optimal policy from the initial state. This method thus enables tractable fast planning within large open-world FMDPs.
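The pruning idea can be sketched as a backward reachability computation over the factored dependency graph; the variables and dependencies below are invented, and real FMDP abstraction must also account for action effects and the reachable state subspace:

```python
# Pruning irrelevant state variables in a factored MDP: a variable matters
# only if the reward depends on it or it feeds (transitively) into one that
# does. A minimal sketch of that backward closure on an invented model.
deps = {                     # parents of each variable's transition model
    "robot_pos":  {"robot_pos"},
    "door_open":  {"door_open", "robot_pos"},
    "cup_clean":  {"cup_clean", "robot_pos"},
    "tv_channel": {"tv_channel"},            # never influences the goal
    "weather":    {"weather"},
}
reward_vars = {"cup_clean"}

relevant = set(reward_vars)
changed = True
while changed:
    changed = False
    for v in list(relevant):
        for parent in deps[v]:
            if parent not in relevant:
                relevant.add(parent)
                changed = True

pruned = set(deps) - relevant    # variables a planner can safely ignore
```

Planning then proceeds on the relevant variables only, which is what makes very large open-world FMDPs tractable.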


