Optimal decision procedures for finite Markov chains. Part II: Communicating systems

1973 ◽  
Vol 5 (3) ◽  
pp. 521-540 ◽  
Author(s):  
John Bather

A Markov process in discrete time with a finite state space is controlled by choosing the transition probabilities from a given convex family of distributions depending on the present state. The immediate cost is prescribed for each choice and it is required to minimise the average expected cost over an infinite future. The paper considers a special case of this general problem and provides the foundation for a general solution. The main result is that an optimal policy exists if each state of the system can be reached with positive probability from any other state by choosing a suitable policy.
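
A minimal numerical sketch of the average-cost criterion described above, with a finite action set standing in for the paper's convex families of distributions; the transition matrices, costs and the helper name relative_value_iteration are illustrative assumptions, not taken from the paper. Relative value iteration converges here because the example chain is communicating and aperiodic.

    import numpy as np

    # Illustrative data: 3 states, 2 candidate distributions (actions) per state.
    # P[a][i] is the transition distribution chosen in state i under action a,
    # c[a][i] the corresponding immediate cost.  All numbers are made up.
    P = np.array([
        [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]],   # action 0
        [[0.2, 0.5, 0.3], [0.4, 0.1, 0.5], [0.1, 0.1, 0.8]],   # action 1
    ])
    c = np.array([
        [2.0, 1.0, 3.0],   # costs of action 0 in states 0, 1, 2
        [1.5, 2.5, 0.5],   # costs of action 1 in states 0, 1, 2
    ])

    def relative_value_iteration(P, c, n_iter=2000):
        """Undiscounted average-cost dynamic programming by relative value iteration."""
        h = np.zeros(P.shape[2])         # relative value function
        for _ in range(n_iter):
            q = c + P @ h                # Bellman operator, shape (actions, states)
            Th = q.min(axis=0)
            g = Th[0]                    # state 0 as reference state
            h = Th - g                   # keep the iterates bounded
        return g, h, q.argmin(axis=0)

    g, h, policy = relative_value_iteration(P, c)
    print("minimal average cost:", g, "optimal actions:", policy)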


1973 ◽  
Vol 5 (02) ◽  
pp. 328-339 ◽  
Author(s):  
John Bather

A Markov process in discrete time with a finite state space is controlled by choosing the transition probabilities from a prescribed set depending on the state occupied at any time. Given the immediate cost for each choice, it is required to minimise the expected cost over an infinite future, without discounting. Various techniques are reviewed for the case when there is a finite set of possible transition matrices and an example is given to illustrate the unpredictable behaviour of policy sequences derived by backward induction. Further examples show that the existing methods may break down when there is an infinite family of transition matrices. A new approach is suggested, based on the idea of classifying the states according to their accessibility from one another.
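
The backward-induction behaviour mentioned above can be reproduced on a toy problem; the two transition matrices and costs below are illustrative assumptions, not the example from the paper. The sketch iterates the undiscounted finite-horizon recursion and records the minimising policy at each horizon, producing the kind of policy sequence whose behaviour the paper examines.

    import numpy as np

    # Illustrative 2-state problem with a finite set of two transition matrices.
    P = np.array([
        [[0.9, 0.1], [0.2, 0.8]],   # matrix chosen under action 0
        [[0.5, 0.5], [0.6, 0.4]],   # matrix chosen under action 1
    ])
    c = np.array([
        [1.0, 2.0],                 # cost of action 0 in states 0, 1
        [1.2, 1.7],                 # cost of action 1 in states 0, 1
    ])

    # Backward induction without discounting: v is the minimal expected cost
    # over n remaining stages; record the minimising policy at each horizon n.
    v = np.zeros(2)
    policies = []
    for n in range(1, 16):
        q = c + P @ v               # shape (actions, states)
        policies.append(tuple(q.argmin(axis=0)))
        v = q.min(axis=0)

    print(policies)                 # the policy sequence produced by backward induction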


2018 ◽  
Vol 55 (4) ◽  
pp. 1025-1036 ◽  
Author(s):  
Dario Bini ◽  
Jeffrey J. Hunter ◽  
Guy Latouche ◽  
Beatrice Meini ◽  
Peter Taylor

In their 1960 book on finite Markov chains, Kemeny and Snell established that a certain sum is invariant. The value of this sum has become known as Kemeny's constant. Various proofs have been given over time, some more technical than others. We give here a very simple physical justification, which extends without a hitch to continuous-time Markov chains on a finite state space. For Markov chains with denumerably infinite state space, the constant may be infinite and even if it is finite, there is no guarantee that the physical argument will hold. We show that the physical interpretation does go through for the special case of a birth-and-death process with a finite value of Kemeny's constant.
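
The invariance can be verified directly: for an irreducible chain with transition matrix P, stationary distribution π and mean first-passage times m_ij (with m_ii = 0), the sum Σ_j π_j m_ij takes the same value for every starting state i, namely trace(Z) - 1 with Z = (I - P + 1π)^{-1} the fundamental matrix. The 3-state chain below is a made-up example used only to illustrate the computation.

    import numpy as np

    # Made-up irreducible transition matrix.
    P = np.array([[0.5, 0.3, 0.2],
                  [0.1, 0.6, 0.3],
                  [0.4, 0.2, 0.4]])
    n = P.shape[0]

    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1))])
    pi = pi / pi.sum()

    # Fundamental matrix Z = (I - P + 1π)^{-1} and mean first-passage times
    # m_ij = (z_jj - z_ij) / π_j (this gives m_ii = 0 on the diagonal).
    Z = np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))
    M = (np.diag(Z)[None, :] - Z) / pi[None, :]

    # Kemeny's constant: Σ_j π_j m_ij is independent of the starting state i.
    print(M @ pi)                # all entries coincide
    print(np.trace(Z) - 1)       # ... and equal trace(Z) - 1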


1967 ◽  
Vol 4 (1) ◽  
pp. 192-196 ◽  
Author(s):  
J. N. Darroch ◽  
E. Seneta

In a recent paper, the authors have discussed the concept of quasi-stationary distributions for absorbing Markov chains having a finite state space, with the further restriction of discrete time. The purpose of the present note is to summarize the analogous results when the time parameter is continuous.
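
For the continuous-time case summarised here, the quasi-stationary distribution of a finite absorbing chain whose transient states communicate is the normalised left eigenvector of the generator restricted to the transient states, taken at the eigenvalue with largest real part. A minimal sketch with a made-up generator (two transient states, one absorbing state):

    import numpy as np

    # Made-up generator: states 0, 1 are transient, state 2 is absorbing.
    Q = np.array([[-2.0,  1.5,  0.5],
                  [ 1.0, -3.0,  2.0],
                  [ 0.0,  0.0,  0.0]])

    # Restrict the generator to the transient states.
    QT = Q[:2, :2]

    # Left eigenvector of QT at the eigenvalue of maximal real part,
    # normalised to a probability vector, gives the quasi-stationary law.
    w, v = np.linalg.eig(QT.T)
    k = np.argmax(w.real)
    qsd = np.real(v[:, k])
    qsd = qsd / qsd.sum()
    print("decay rate:", -w[k].real, "quasi-stationary distribution:", qsd)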


1980 ◽  
Vol 17 (3) ◽  
pp. 726-734 ◽  
Author(s):  
Bharat Doshi ◽  
Steven E. Shreve

A controlled Markov chain with finite state space has transition probabilities which depend on an unknown parameter α lying in a known finite set A. For each α, a stationary control law ϕ_α is given. This paper develops a control scheme whereby at each stage t a parameter α_t is chosen at random from among those parameters which nearly maximize the log-likelihood function, and the control u_t is chosen according to the control law ϕ_{α_t}. It is proved that this algorithm leads to identification of the true α under conditions weaker than any previously considered.
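
The scheme itself is short enough to sketch; everything below (the two-state, two-control model, the parameter set A = {0, 1}, the control laws phi and the slack delta defining "nearly maximal") is an illustrative assumption rather than the paper's example.

    import numpy as np

    rng = np.random.default_rng(0)

    # P[alpha][u][i] is the transition distribution out of state i under
    # control u when the parameter is alpha.  All numbers are made up.
    P = np.array([
        [[[0.8, 0.2], [0.3, 0.7]], [[0.6, 0.4], [0.5, 0.5]]],   # alpha = 0
        [[[0.4, 0.6], [0.7, 0.3]], [[0.2, 0.8], [0.9, 0.1]]],   # alpha = 1
    ])
    phi = np.array([[0, 1], [1, 0]])     # stationary control laws phi_alpha(state)
    true_alpha, delta = 1, 0.5           # unknown true parameter, likelihood slack

    loglik = np.zeros(2)                 # log-likelihood of each candidate alpha
    x = 0
    for t in range(2000):
        # Pick alpha_t at random among the nearly maximising parameters.
        candidates = np.flatnonzero(loglik >= loglik.max() - delta)
        alpha_t = rng.choice(candidates)
        u = phi[alpha_t, x]              # control prescribed by phi_{alpha_t}
        x_next = rng.choice(2, p=P[true_alpha, u, x])
        loglik += np.log(P[:, u, x, x_next])   # update every candidate's likelihood
        x = x_next

    print("identified parameter:", int(np.argmax(loglik)))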


1976 ◽  
Vol 13 (4) ◽  
pp. 696-706 ◽  
Author(s):  
David Burman

Particles enter a finite-state system and move according to independent sample paths from a semi-Markov process. Strong limit theorems are developed for the ratio of the flow of particles from state i to state j and the flow out of state i. When the cumulative arrival of particles into the system up to time t satisfies A(t) ∼ λt^α, the ratio converges to ρ_ij almost surely. When A(t) ∼ λe^{kt}, the flow between states must be normalized by the Laplace–Stieltjes transform of the conditional holding-time distribution in order to make the ratio an unbiased estimator of ρ_ij.
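
A small simulation sketch of the polynomially growing arrival case; the routing matrix rho, the holding times and the arrival stream below are made-up assumptions. Counting the flow of particles from i to j and the flow out of i up to time T, the ratio approximates ρ_ij.

    import numpy as np

    rng = np.random.default_rng(1)

    # Made-up semi-Markov routing matrix and mean holding times.
    rho = np.array([[0.0, 0.7, 0.3],
                    [0.5, 0.0, 0.5],
                    [0.6, 0.4, 0.0]])
    mean_hold = np.array([1.0, 2.0, 0.5])

    T = 200.0
    arrivals = np.arange(0.0, T, 0.1)     # steadily arriving particles
    N_out = np.zeros(3)                   # flow out of each state up to time T
    N_flow = np.zeros((3, 3))             # flow from state i to state j up to time T

    for a in arrivals:
        t, i = a, 0                       # each particle enters the system in state 0
        while True:
            t += rng.exponential(mean_hold[i])
            if t > T:
                break
            j = rng.choice(3, p=rho[i])
            N_out[i] += 1
            N_flow[i, j] += 1
            i = j

    print(N_flow / N_out[:, None])        # ratios approximate rho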


1973 ◽  
Vol 5 (3) ◽  
pp. 541-553 ◽  
Author(s):  
John Bather

This paper is concerned with the general problem of finding an optimal transition matrix for a finite Markov chain, where the probabilities for each transition must be chosen from a given convex family of distributions. The immediate cost is determined by this choice, but it is required to minimise the average expected cost in the long run. The problem is investigated by classifying the states according to the accessibility relations between them. If an optimal policy exists, it can be found by considering the convex subsystems associated with the states at different levels in the classification scheme.
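
The classification idea can be illustrated with a small graph computation; this is not the paper's construction, and the accessibility structure below is a made-up assumption. Take as edges the transitions that can be given positive probability under some admissible choice, group states into communicating classes through the transitive closure, and rank the classes by how many other classes can reach them.

    import numpy as np

    # adj[i, j] = True if state j can be given positive transition probability
    # from state i under some admissible choice (made-up 5-state example).
    adj = np.array([
        [1, 1, 0, 0, 0],
        [1, 1, 1, 0, 0],
        [0, 0, 1, 1, 0],
        [0, 0, 0, 1, 1],
        [0, 0, 0, 1, 1],
    ], dtype=bool)
    n = adj.shape[0]

    # Transitive closure: reach[i, j] = True if j is accessible from i.
    reach = adj.copy()
    for k in range(n):
        reach |= np.outer(reach[:, k], reach[k, :])

    # States communicate when each is accessible from the other.
    classes = []
    for i in range(n):
        cls = frozenset(j for j in range(n) if reach[i, j] and reach[j, i])
        if cls not in classes:
            classes.append(cls)

    # Level of a class: number of other classes that reach it but are not reachable from it.
    def level(cls):
        i = next(iter(cls))
        return sum(reach[next(iter(c)), i] and not reach[i, next(iter(c))] for c in classes)

    for cls in sorted(classes, key=level):
        print("level", level(cls), "states", sorted(cls))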


1994 ◽  
Vol 8 (1) ◽  
pp. 1-19 ◽  
Author(s):  
Madhav Desai ◽  
Sunil Kumar ◽  
P. R. Kumar

We consider time-inhomogeneous Markov chains on a finite state space whose transition probabilities p_ij(t) = c_ij ε(t)^{V_ij} are proportional to powers of a vanishingly small parameter ε(t). We determine the precise relationship between this chain and the corresponding time-homogeneous chains p_ij(ε) = c_ij ε^{V_ij}, as ε ↘ 0. Let {π_i(ε)} be the steady-state distribution of the time-homogeneous chain. We characterize the orders {η_i} in π_i(ε) = Θ(ε^{η_i}). We show that if ε(t) ↘ 0 slowly enough, then the timewise occupation measures β_i := sup{q > 0 | Σ_t ε(t)^q Prob(x(t) = i) = +∞}, called the recurrence orders, satisfy β_i − β_j = η_j − η_i. Moreover, if G := {i | η_i = min_j η_j} is the set of ground states of the time-homogeneous chain, then x(t) → G in an appropriate sense whenever ε(t) is "cooled" slowly enough. We also show that there exists a critical rate ρ* such that x(t) → G if and only if Σ_t ε(t)^{ρ*} = +∞, and we give an explicit max–min characterization of ρ*. Finally, we provide a graph algorithm for determining the orders {η_i}, {β_i} and the critical rate ρ*.
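
A small numerical sketch of the time-homogeneous side of the result; the exponents V_ij and constants c_ij below are made-up assumptions. It builds p_ij(ε) = c_ij ε^{V_ij}, computes the stationary distribution for two small values of ε, and estimates the orders η_i from the slope of log π_i(ε) against log ε.

    import numpy as np

    # Made-up exponents V_ij and constants c_ij; off-diagonal transition
    # probabilities are c_ij * eps**V_ij, the diagonal absorbs the remainder.
    V = np.array([[0, 1, 2],
                  [0, 0, 1],
                  [1, 0, 0]])
    C = np.array([[0.0, 0.5, 0.5],
                  [0.5, 0.0, 0.5],
                  [0.5, 0.5, 0.0]])

    def stationary(eps):
        P = C * eps ** V
        np.fill_diagonal(P, 0.0)
        np.fill_diagonal(P, 1.0 - P.sum(axis=1))
        w, v = np.linalg.eig(P.T)
        pi = np.real(v[:, np.argmin(np.abs(w - 1))])
        return pi / pi.sum()

    # Slope of log pi_i(eps) against log eps estimates the order eta_i.
    e1, e2 = 1e-3, 1e-4
    eta = (np.log(stationary(e2)) - np.log(stationary(e1))) / (np.log(e2) - np.log(e1))
    print(np.round(eta, 2))      # pi_i(eps) = Theta(eps**eta_i)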

