Exponential Convergence and Stability of Howard's Policy Improvement Algorithm for Controlled Diffusions

Bekzhan Kerimkulov; David Šiška; Lukasz Szpruch

doi:10.1137/19m1236758


SIAM Journal on Control and Optimization ◽

10.1137/19m1236758 ◽

2020 ◽

Vol 58 (3) ◽

pp. 1314-1340

Author(s):

Bekzhan Kerimkulov ◽

David Šiška ◽

Lukasz Szpruch

Keyword(s):

Exponential Convergence ◽

Policy Improvement ◽

Convergence And Stability ◽

Improvement Algorithm ◽

Controlled Diffusions


On the policy improvement algorithm for ergodic risk-sensitive control

Proceedings of the Royal Society of Edinburgh Section A Mathematics ◽

10.1017/prm.2020.61 ◽

2020 ◽

pp. 1-26

Author(s):

Ari Arapostathis ◽

Anup Biswas ◽

Somnath Pradhan

Keyword(s):

Control Problem ◽

General Result ◽

Region Of Attraction ◽

Policy Improvement ◽

Improvement Algorithm ◽

Risk Sensitive ◽

Risk Sensitive Control ◽

Whole Space ◽

Controlled Diffusions ◽

Running Cost

In this article we consider the ergodic risk-sensitive control problem for a large class of multidimensional controlled diffusions on the whole space. We study the minimization and maximization problems under either a blanket stability hypothesis, or a near-monotone assumption on the running cost. We establish the convergence of the policy improvement algorithm for these models. We also present a more general result concerning the region of attraction of the equilibrium of the algorithm.


A Class of Decision Processes Showing Policy-Improvement/Newton–Raphson Equivalence

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964800001261 ◽

1989 ◽

Vol 3 (3) ◽

pp. 397-403 ◽

Cited By ~ 1

Author(s):

P. Whittle

Keyword(s):

Optimality Condition ◽

Regularity Condition ◽

Decision Processes ◽

Policy Improvement ◽

Improvement Algorithm ◽

Newton Raphson ◽

Raphson Algorithm

A condition expressed in Eq. (7) is given which, with one simplifying regularity condition, ensures that the policy-improvement algorithm is equivalent to application of the Newton–Raphson algorithm to an optimality condition. It is shown that this condition covers the two known cases of such equivalence, and another example is noted. The condition is believed to be necessary to within transformations of the problem, but this has not been proved.
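
The equivalence described in this abstract can be illustrated in the simplest standard setting, a finite discounted Markov decision process, where one Newton-Raphson step on the Bellman residual B(v) = max_a (r_a + GAMMA P_a v) - v lands exactly on the value of the greedy policy. The sketch below is not Whittle's condition (7), which the listing does not reproduce; the two-state model, its numbers, and every name in the code (P, R, GAMMA, solve, newton_step, and so on) are assumptions made up for illustration.

```python
# Illustrative sketch only: a tiny discounted MDP with made-up numbers.
# All names and values here are hypothetical, not taken from the cited paper.

GAMMA = 0.9
# P[a][s][t]: probability of moving from state s to state t under action a.
P = [
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.1, 0.9], [0.6, 0.4]],
]
# R[a][s]: expected one-step reward in state s under action a.
R = [
    [1.0, 0.0],
    [0.5, 2.0],
]
N_STATES, N_ACTIONS = 2, 2


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def q_value(v, s, a):
    """One-step lookahead value of taking action a in state s."""
    return R[a][s] + GAMMA * sum(P[a][s][t] * v[t] for t in range(N_STATES))


def greedy(v):
    """Policy improvement: pick a maximising action in every state."""
    return [max(range(N_ACTIONS), key=lambda a: q_value(v, s, a))
            for s in range(N_STATES)]


def evaluate(policy):
    """Policy evaluation: solve (I - GAMMA * P_pi) v = r_pi."""
    a_mat = [[(1.0 if s == t else 0.0) - GAMMA * P[policy[s]][s][t]
              for t in range(N_STATES)] for s in range(N_STATES)]
    return solve(a_mat, [R[policy[s]][s] for s in range(N_STATES)])


def newton_step(v):
    """One Newton-Raphson step on B(v) = max_a (r_a + GAMMA P_a v) - v."""
    pi = greedy(v)
    residual = [q_value(v, s, pi[s]) - v[s] for s in range(N_STATES)]
    # Where the maximiser is unique, the Jacobian of B is GAMMA * P_pi - I.
    jac = [[GAMMA * P[pi[s]][s][t] - (1.0 if s == t else 0.0)
            for t in range(N_STATES)] for s in range(N_STATES)]
    delta = solve(jac, residual)
    return [v[s] - delta[s] for s in range(N_STATES)]


def policy_iteration(policy):
    """Howard's loop: evaluate, improve, stop when the policy is stable."""
    while True:
        v = evaluate(policy)
        improved = greedy(v)
        if improved == policy:
            return policy, v
        policy = improved
```

Starting from the value of any policy, newton_step(v) and evaluate(greedy(v)) agree up to rounding, which is the sense in which the two iterations coincide in this toy case.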


On the Complexity of the Policy Improvement Algorithm for Markov Decision Processes

INFORMS Journal on Computing ◽

10.1287/ijoc.6.2.188 ◽

1994 ◽

Vol 6 (2) ◽

pp. 188-192 ◽

Cited By ~ 16

Author(s):

Mary Melekopoglou ◽

Anne Condon

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Policy Improvement ◽

Improvement Algorithm ◽

Markov Decision


How good is Howard's policy improvement algorithm?

Mathematical Methods of Operations Research ◽

10.1007/bf01918764 ◽

1985 ◽

Vol 29 (7) ◽

pp. 315-316 ◽

Cited By ~ 1

Author(s):

N. Schmitz

Keyword(s):

Policy Improvement ◽

Improvement Algorithm


Global exponential convergence and stability of Wang neural network for solving online linear equations

Electronics Letters ◽

10.1049/el:20081928 ◽

2008 ◽

Vol 44 (2) ◽

pp. 145 ◽

Cited By ~ 26

Author(s):

Y. Zhang ◽

K. Chen

Keyword(s):

Neural Network ◽

Exponential Convergence ◽

Linear Equations ◽

Convergence And Stability ◽

Global Exponential Convergence


A Policy Improvement Algorithm for Solving a Mixture Class of Perfect Information and AR-AT Semi-Markov Games

International Game Theory Review ◽

10.1142/s0219198920400083 ◽

2020 ◽

Vol 22 (02) ◽

pp. 2040008

Author(s):

P. Mondal ◽

S. K. Neogy ◽

A. Gupta ◽

D. Ghorui

Keyword(s):

Perfect Information ◽

Policy Improvement ◽

Markov Games ◽

Markov Game ◽

Improvement Algorithm ◽

Finite State ◽

Markov Decision ◽

Mixture Class ◽

Zero Sum ◽

Action Spaces

Zero-sum two-person discounted semi-Markov games with finite state and action spaces are studied, where a collection of states having the Perfect Information (PI) property is mixed with another collection of states having the Additive Reward–Additive Transition and Action Independent Transition Time (AR-AT-AITT) property. For such a PI/AR-AT-AITT mixture class of games, we prove the existence of an optimal pure stationary strategy for each player. We develop a policy improvement algorithm for solving discounted semi-Markov decision processes (the one-player version of semi-Markov games) and, using it, obtain a policy-improvement type algorithm for computing an optimal strategy pair of a PI/AR-AT-AITT mixture semi-Markov game. Finally, we extend our results to the case where the states having the PI property are replaced by a subclass of Switching Control (SC) states.


Policy improvement algorithm for continuous time Markov decision processes with switching costs

Stochastic Control Theory and Stochastic Differential Systems - Lecture Notes in Control and Information Sciences ◽

10.1007/bfb0009393 ◽

2005 ◽

pp. 320-331 ◽

Cited By ~ 1

Author(s):

Bharat Doshi

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Switching Costs ◽

Decision Processes ◽

Policy Improvement ◽

Improvement Algorithm ◽

Markov Decision


On the policy improvement algorithm in continuous time

Stochastics ◽

10.1080/17442508.2016.1187609 ◽

2016 ◽

Vol 89 (1) ◽

pp. 348-359 ◽

Cited By ~ 4

Author(s):

Saul D. Jacka ◽

Aleksandar Mijatović

Keyword(s):

Continuous Time ◽

Policy Improvement ◽

Improvement Algorithm


Global exponential convergence and stability of gradient-based neural network for online matrix inversion

Applied Mathematics and Computation ◽

10.1016/j.amc.2009.06.048 ◽

2009 ◽

Vol 215 (3) ◽

pp. 1301-1306 ◽

Cited By ~ 32

Author(s):

Yunong Zhang ◽

Yanyan Shi ◽

Ke Chen ◽

Chaoli Wang

Keyword(s):

Neural Network ◽

Exponential Convergence ◽

Matrix Inversion ◽

Convergence And Stability ◽

Global Exponential Convergence ◽

Gradient Based


Exponential convergence and stability of delayed fuzzy cellular neural networks with time-varying coefficients

Journal of Control Theory and Applications ◽

10.1007/s11768-011-8146-2 ◽

2011 ◽

Vol 9 (4) ◽

pp. 500-504 ◽

Cited By ~ 2

Author(s):

Manchun Tan

Keyword(s):

Neural Networks ◽

Exponential Convergence ◽

Cellular Neural Networks ◽

Time Varying ◽

Fuzzy Cellular Neural Networks ◽

Convergence And Stability ◽

Varying Coefficients ◽

Time Varying Coefficients
