Optimal stopping in a partially observable binary-valued Markov chain with costly perfect information

1982 ◽  
Vol 19 (1) ◽  
pp. 72-81 ◽  
Author(s):  
George E. Monahan

The problem of optimal stopping in a Markov chain when there is imperfect state information is formulated as a partially observable Markov decision process. Properties of the optimal value function are developed. It is shown that under mild conditions the optimal policy is well structured. An efficient algorithm, which uses the structural information in the computation of the optimal policy, is presented.
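The structured-policy result can be illustrated with a small belief-space value iteration over stop, continue, and costly-inspect actions; the chain, rewards, and inspection cost below are invented for the sketch and are not the paper's model or algorithm:

```python
import numpy as np

# Belief-space value iteration for optimal stopping of a two-state hidden
# Markov chain with a costly "inspect" action that reveals the state
# perfectly. All parameters are illustrative, not taken from the paper.
P = np.array([[0.9, 0.1],            # transition matrix of the hidden chain
              [0.2, 0.8]])
stop_reward = np.array([10.0, 0.0])  # reward for stopping in state 0 / 1
inspect_cost = 1.0
beta = 0.95                          # discount factor

grid = np.linspace(0.0, 1.0, 201)    # belief = P(hidden state = 0)

def propagate(p):
    """One-step belief update when no observation is made."""
    return p * P[0, 0] + (1 - p) * P[1, 0]

V = np.zeros_like(grid)
for _ in range(1000):
    V_at = lambda p: np.interp(p, grid, V)
    stop = grid * stop_reward[0] + (1 - grid) * stop_reward[1]
    cont = beta * V_at(propagate(grid))
    # Inspecting jumps the belief to 0 or 1, which then propagates one step.
    inspect = -inspect_cost + beta * (
        grid * V_at(propagate(1.0)) + (1 - grid) * V_at(propagate(0.0)))
    V_new = np.maximum.reduce([stop, cont, inspect])
    if np.max(np.abs(V_new - V)) < 1e-9:
        V = V_new
        break
    V = V_new

# With these monotone rewards the stopping set is an upper interval of
# beliefs, i.e. the policy has a control-limit (threshold) structure.
stop_opt = stop >= V - 1e-9
```

In this toy instance the optimal value is monotone in the belief and stopping is optimal at high beliefs, mirroring the kind of structure the paper exploits computationally.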


2020 ◽  
Vol 40 (1) ◽  
pp. 117-137
Author(s):  
R. Israel Ortega-Gutiérrez ◽  
H. Cruz-Suárez

This paper addresses a class of sequential optimization problems known as Markov decision processes, considered on Euclidean state and action spaces with the total expected discounted cost as the objective function. The main goal of the paper is to provide conditions that guarantee an adequate Moreau-Yosida regularization of a Markov decision process (named the original process). In this way, a new Markov decision process is established that shares the Markov control model of the original process except for the cost function, which is induced via the Moreau-Yosida regularization. Compared to the original process, this new discounted Markov decision process has richer properties: differentiability and strict convexity of its optimal value function, uniqueness of the optimal policy, and, moreover, the optimal value function and the optimal policy of both processes coincide. An example is provided to complement the theory presented.
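As a concrete illustration of the regularization itself (not the paper's construction for MDPs), the Moreau-Yosida envelope of a cost can be computed by grid minimization; for c(x) = |x| the envelope is the Huber function, which is differentiable yet has the same minimum and minimizer:

```python
import numpy as np

# Moreau-Yosida regularization of a cost function c on a grid:
#   c_lam(x) = min_y [ c(y) + (1 / (2 * lam)) * (x - y)**2 ].
# Illustrative example with c(x) = |x|; the envelope is the Huber function:
# C^1-smooth, convex, with the same minimum value and minimizer as c.
lam = 0.5
ys = np.linspace(-3.0, 3.0, 2001)
c = np.abs(ys)

def moreau(x):
    """Grid approximation of the Moreau envelope at x."""
    return np.min(c + (ys - x) ** 2 / (2 * lam))

xs = np.linspace(-2.0, 2.0, 401)
env = np.array([moreau(x) for x in xs])

# Closed form of the Huber envelope, for comparison.
huber = np.where(np.abs(xs) <= lam,
                 xs ** 2 / (2 * lam),
                 np.abs(xs) - lam / 2)
```

The envelope never exceeds the original cost and smooths out its kink at the origin, which is the mechanism behind the differentiability gained by the regularized process.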


2006 ◽  
Vol 2006 ◽  
pp. 1-12
Author(s):  
E. G. Kyriakidis

This paper is concerned with the problem of controlling a truncated general immigration process, which represents a population of harmful individuals, by the introduction of a predator. If the parameters of the model satisfy some mild conditions, the existence of a control-limit policy that is average-cost optimal is proved. The proof is based on the uniformization technique and on the variation of a fictitious parameter over the entire real line. Furthermore, an efficient Markov decision algorithm is developed that generates a sequence of improving control-limit policies converging to the optimal policy.
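A minimal numerical sketch of the control-limit idea, with an invented discrete-time birth-death model of the harmful population (not the paper's truncated general immigration process or its algorithm): for each control limit k, the predator is introduced in states i ≥ k, the stationary distribution is computed, and long-run average costs are compared:

```python
import numpy as np

# Average cost of control-limit policies on states {0, ..., N}: introduce
# a predator whenever the population is at least k. Illustrative numbers.
N = 20
birth = 0.4            # immigration probability per step
death0 = 0.1           # death probability without the predator
death1 = 0.5           # death probability with the predator present
hold_cost = 1.0        # cost per harmful individual per step
predator_cost = 2.0    # cost per step the predator is deployed

def average_cost(k):
    """Long-run average cost of the control-limit-k policy."""
    P = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        d = death1 if i >= k else death0
        up = birth if i < N else 0.0
        down = d if i > 0 else 0.0
        P[i, min(i + 1, N)] += up
        P[i, max(i - 1, 0)] += down
        P[i, i] += 1.0 - up - down
    # Stationary distribution: solve pi P = pi with sum(pi) = 1.
    A = np.vstack([P.T - np.eye(N + 1), np.ones(N + 1)])
    b = np.append(np.zeros(N + 1), 1.0)
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    cost = np.array([hold_cost * i + (predator_cost if i >= k else 0.0)
                     for i in range(N + 1)])
    return pi @ cost

costs = [average_cost(k) for k in range(N + 2)]  # k = N + 1 means never act
best_k = int(np.argmin(costs))
```

Evaluating every control limit is of course brute force; the paper's Markov decision algorithm instead generates a sequence of improving control-limit policies converging to the optimum.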


2006 ◽  
Vol 43 (3) ◽  
pp. 603-621 ◽  
Author(s):  
Huw W. James ◽  
E. J. Collins

This paper is concerned with the analysis of Markov decision processes in which a natural form of termination ensures that the expected future costs are bounded, at least under some policies. Whereas most previous analyses have restricted attention to the case where the set of states is finite, this paper analyses the case where the set of states is not necessarily finite or even countable. It is shown that all the existence, uniqueness, and convergence results of the finite-state case hold when the set of states is a general Borel space, provided we make the additional assumption that the optimal value function is bounded below. We give a sufficient condition for the optimal value function to be bounded below which holds, in particular, if the set of states is countable.
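The countable-state setting can be illustrated with undiscounted value iteration on a small terminating MDP, where termination (an absorbing, cost-free state reached under every policy) keeps expected future costs bounded; the model below is invented for the sketch:

```python
import numpy as np

# Undiscounted value iteration for a terminating MDP: from state i in
# {1, ..., N} choose action 0 (cost 2, move to i-1 surely) or action 1
# (cost 1, move to i-1 w.p. 0.6, stay w.p. 0.4). State 0 is absorbing
# and cost-free, so all policies terminate and costs stay bounded.
N = 10
costs = np.array([2.0, 1.0])

V = np.zeros(N + 1)          # V[0] = 0 at the terminal state
for _ in range(10_000):
    q0 = costs[0] + V[:-1]                        # states 1..N, action 0
    q1 = costs[1] + 0.6 * V[:-1] + 0.4 * V[1:]    # states 1..N, action 1
    V_new = np.append(0.0, np.minimum(q0, q1))
    if np.max(np.abs(V_new - V)) < 1e-12:
        V = V_new
        break
    V = V_new

policy = (costs[1] + 0.6 * V[:-1] + 0.4 * V[1:]
          <= costs[0] + V[:-1]).astype(int)       # 1 where action 1 is optimal
```

Here the fixed point is V(i) = 5i/3 with the cheaper random action optimal everywhere, and iteration converges without any discounting, exactly the kind of behaviour the paper establishes in far greater generality.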


2021 ◽  
Author(s):  
Akram Khaleghei ◽  
Michael Jong Kim

In “Optimal Control of Partially Observable Semi-Markovian Failing Systems: An Analysis Using a Phase Methodology,” Khaleghei and Kim study a maintenance control problem formulated as a partially observable semi-Markov decision process (POSMDP), a problem class that is typically computationally intractable and not amenable to structural analysis. The authors develop a new approach based on a phase methodology, in which the intractable POSMDP is viewed as the limiting problem of a sequence of tractable POMDPs. They show that the optimal control policy can be represented as a control-limit policy that monitors the estimated conditional reliability at each decision epoch and, by exploiting this structure, develop an efficient computational approach for the optimal control limit and the corresponding optimal value.
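A toy version of a control-limit rule on conditional reliability (the phase structure and numbers are invented, and this sketch skips the POSMDP machinery entirely): update a belief over hidden degradation phases given continued survival, and intervene once the one-step conditional reliability falls below the limit:

```python
import numpy as np

# Hidden degradation phases (healthy -> worn -> failed). The phase is not
# observed; only survival is. Intervene (maintain) when the estimated
# one-step conditional reliability drops below the control limit.
P = np.array([[0.85, 0.12, 0.03],   # healthy -> healthy / worn / failed
              [0.00, 0.80, 0.20],   # worn    -> worn / failed
              [0.00, 0.00, 1.00]])  # failed is absorbing
control_limit = 0.90

belief = np.array([1.0, 0.0])       # over working phases {healthy, worn}
t = 0
while True:
    # One-step conditional reliability under the current phase belief.
    reliability = belief @ (1.0 - P[:2, 2])
    if reliability < control_limit:
        break                        # control limit reached: maintain now
    # Bayes update of the phase belief given survival of one more step.
    unnorm = belief @ P[:2, :2]
    belief = unnorm / unnorm.sum()
    t += 1
```

As survival accumulates, the belief drifts toward the worn phase, the estimated reliability decays, and the rule triggers at a data-driven epoch t rather than a fixed age.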


2016 ◽  
Vol 2016 ◽  
pp. 1-5 ◽  
Author(s):  
Yang Sun ◽  
Xiaohui Ai

This paper examines an optimal stopping problem for the stochastic (Wiener-Poisson) jump diffusion logistic population model, formulated as the maximization of an expected reward. By applying the smooth pasting technique (Dayanik and Karatzas, 2003; Dixit, 1993), we obtain an explicit solution: both the critical state of the optimal stopping region and the optimal value function are expressed in closed form.
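For intuition about smooth pasting (using a plain geometric Brownian motion with reward x − K rather than the paper's jump diffusion logistic model), the free boundary is pinned down by requiring the value and its derivative to match the reward at the boundary:

```python
import math

# Smooth pasting in a simple diffusion stopping problem: stop a geometric
# Brownian motion dX = mu*X dt + sigma*X dW to collect g(x) = x - K,
# discounted at rate r (r > mu). In the continuation region the value is
# V(x) = A * x**gamma, gamma the positive root of
#   0.5*sigma^2*g*(g - 1) + mu*g - r = 0.
# Value matching V(x*) = x* - K and smooth pasting V'(x*) = 1 give x*, A.
mu, sigma, r, K = 0.02, 0.3, 0.05, 1.0
a = 0.5 * sigma ** 2
b = mu - a
gamma = (-b + math.sqrt(b ** 2 + 4 * a * r)) / (2 * a)

x_star = K * gamma / (gamma - 1)          # critical state (free boundary)
A = (x_star - K) / x_star ** gamma        # continuation-region coefficient

# Residuals of the two pasting conditions (should vanish).
value_match = A * x_star ** gamma - (x_star - K)
slope_match = A * gamma * x_star ** (gamma - 1) - 1.0
```

The same two matching conditions drive the explicit construction in the jump-diffusion setting, with the quadratic above replaced by the integro-differential characteristic equation of the jump process.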


2020 ◽  
Vol 34 (10) ◽  
pp. 13845-13846
Author(s):  
Nishanth Kumar ◽  
Michael Fishman ◽  
Natasha Danas ◽  
Stefanie Tellex ◽  
Michael Littman ◽  
...  

We propose an abstraction method for open-world environments expressed as Factored Markov Decision Processes (FMDPs) with very large state and action spaces. Our method prunes state and action variables that are irrelevant to the optimal value function on the state subspace the agent would visit when following any optimal policy from the initial state. This method thus enables tractable fast planning within large open-world FMDPs.
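The pruning idea can be sketched as a backward reachability computation over the factored dependency graph; the variables and dependencies below are invented, and real FMDP abstraction must also account for action effects and the reachable state subspace:

```python
# Pruning irrelevant state variables in a factored MDP: a variable matters
# only if the reward depends on it or it feeds (transitively) into one that
# does. A minimal sketch of that backward closure on an invented model.
deps = {                     # parents of each variable's transition model
    "robot_pos":  {"robot_pos"},
    "door_open":  {"door_open", "robot_pos"},
    "cup_clean":  {"cup_clean", "robot_pos"},
    "tv_channel": {"tv_channel"},            # never influences the goal
    "weather":    {"weather"},
}
reward_vars = {"cup_clean"}

relevant = set(reward_vars)
changed = True
while changed:
    changed = False
    for v in list(relevant):
        for parent in deps[v]:
            if parent not in relevant:
                relevant.add(parent)
                changed = True

pruned = set(deps) - relevant    # variables a planner can safely ignore
```

Planning then proceeds on the relevant variables only, which is what makes very large open-world FMDPs tractable.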


