Simulation‐based Uniform Value Function Estimates of Markov Decision Processes

This paper is concerned with the analysis of Markov decision processes in which a natural form of termination ensures that the expected future costs are bounded, at least under some policies. Whereas most previous analyses have restricted attention to the case where the set of states is finite, this paper analyses the case where the set of states is not necessarily finite or even countable. It is shown that all the existence, uniqueness, and convergence results of the finite-state case hold when the set of states is a general Borel space, provided we make the additional assumption that the optimal value function is bounded below. We give a sufficient condition for the optimal value function to be bounded below which holds, in particular, if the set of states is countable.
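
For orientation, the finite-state case that the abstract takes as its baseline can be sketched with plain value iteration on a tiny problem with termination. This is a minimal illustration under stated assumptions, not the paper's construction: the function name `ssp_value_iteration`, the cost table, and the transition table are all hypothetical, and termination is modeled as the probability mass missing from each transition row.

```python
def ssp_value_iteration(cost, P, tol=1e-12, max_iter=100_000):
    """Value iteration for a finite-state problem with termination.

    cost[a][s]  : one-step cost of action a in state s (hypothetical data)
    P[a][s][s2] : probability of moving from s to non-terminal state s2;
                  the mass missing from a row is the termination probability
    """
    n_actions = len(cost)
    n_states = len(cost[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman update: cheapest expected cost-to-go over all actions.
        V_new = [
            min(
                cost[a][s] + sum(P[a][s][s2] * V[s2] for s2 in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            return V_new
        V = V_new
    return V


# Two states, two actions (illustrative numbers only).
# Action 0: cost 1; state 0 stays put w.p. 0.5 (else terminates), state 1 moves to state 0.
# Action 1: terminates immediately at cost 3 (state 0) or 5 (state 1).
cost = [[1.0, 1.0], [3.0, 5.0]]
P = [
    [[0.5, 0.0], [1.0, 0.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
V = ssp_value_iteration(cost, P)
print(V)  # close to [2.0, 3.0]
```

Here the fixed point satisfies V(0) = 1 + 0.5 V(0) = 2 and V(1) = 1 + V(0) = 3, so the iterates converge to a unique bounded value function, which is the kind of finite-state behavior the abstract says carries over to general Borel state spaces.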

Task Scoping for Efficient Planning in Open Worlds (Student Abstract)

Proceedings of the AAAI Conference on Artificial Intelligence, Vol 34 (10), pp. 13845-13846, 2020. DOI: 10.1609/aaai.v34i10.7195

Author(s): Nishanth Kumar, Michael Fishman, Natasha Danas, Stefanie Tellex, Michael Littman, ...

Keyword(s): Markov Decision Processes, Value Function, Decision Processes, Initial State, Open World, Optimal Value, Markov Decision, Efficient Planning, Action Spaces, Action Variables

We propose an abstraction method for open-world environments expressed as Factored Markov Decision Processes (FMDPs) with very large state and action spaces. Our method prunes state and action variables that are irrelevant to the optimal value function on the state subspace the agent would visit when following any optimal policy from the initial state. This method thus enables tractable fast planning within large open-world FMDPs.
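
The idea of pruning variables that cannot influence the value function can be illustrated with a backward reachability pass over a variable dependency graph. This is a deliberately simplified sketch and not the authors' algorithm: it assumes a hypothetical `parents` map giving, for each state variable, the variables its transition depends on, and it ignores the restriction to states reachable from the initial state that the abstract describes.

```python
from collections import deque


def relevant_variables(parents, reward_vars):
    """Return the variables that can influence the reward, directly or indirectly.

    parents     : dict mapping each variable to the set of variables its
                  next value depends on (hypothetical factored model)
    reward_vars : variables that the reward function reads
    """
    relevant = set(reward_vars)
    queue = deque(reward_vars)
    while queue:
        v = queue.popleft()
        for p in parents.get(v, ()):
            if p not in relevant:
                relevant.add(p)
                queue.append(p)
    return relevant


# Illustrative model: the weather never affects anything the reward depends on.
parents = {
    "door_open": {"door_open", "has_key"},
    "has_key": {"has_key", "at_door"},
    "at_door": {"at_door"},
    "weather": {"weather"},
}
kept = relevant_variables(parents, {"door_open"})
pruned = set(parents) - kept
print(sorted(kept), sorted(pruned))
```

In this toy model the pass keeps `at_door`, `door_open`, and `has_key` and prunes `weather`, so a planner would only need to reason over the smaller factored state.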

A survey of some simulation-based algorithms for Markov decision processes

Communications in Information and Systems, Vol 7 (1), pp. 59-92, 2007. DOI: 10.4310/cis.2007.v7.n1.a4. Cited By ~ 8

Author(s): Hyeong Soo Chang, Michael C. Fu, Jiaqiao Hu, Steven I. Marcus

Keyword(s): Markov Decision Processes, Decision Processes, Simulation Based, Markov Decision

Kernel Taylor-Based Value Function Approximation for Continuous-State Markov Decision Processes

Robotics: Science and Systems XVI, 2020. DOI: 10.15607/rss.2020.xvi.050

Author(s): Junhong Xu, Kai Yin, Lantao Liu

Keyword(s): Markov Decision Processes, Function Approximation, Value Function, Decision Processes, Value Function Approximation, Continuous State, Markov Decision

A Two-Timescale Simulation-Based Gradient Algorithm for Weighted Cost Markov Decision Processes

Proceedings of the 44th IEEE Conference on Decision and Control, 2006. DOI: 10.1109/cdc.2005.1583460. Cited By ~ 1

Author(s): Ying He, M.C. Fu, S.I. Marcus

Keyword(s): Markov Decision Processes, Decision Processes, Gradient Algorithm, Simulation Based, Markov Decision

Simulation-Based Algorithms for Markov Decision Processes

2013. DOI: 10.1007/978-1-4471-5022-0. Cited By ~ 15

Author(s): Hyeong Soo Chang, Jiaqiao Hu, Michael C. Fu, Steven I. Marcus

Keyword(s): Markov Decision Processes, Decision Processes, Simulation Based, Markov Decision

Simulation-based policy generation using large-scale Markov decision processes

IEEE Transactions on Systems Man and Cybernetics - Part A Systems and Humans, Vol 31 (6), pp. 609-622, 2001. DOI: 10.1109/3468.983417. Cited By ~ 2

Author(s): C.W. Zobel, W.T. Scherer

Keyword(s): Markov Decision Processes, Large Scale, Decision Processes, Simulation Based, Markov Decision

Technical Note—Cyclic Variables and Markov Decision Processes

Operations Research, Vol 68 (4), pp. 1231-1237, 2020. DOI: 10.1287/opre.2019.1913

Author(s): Avery Haviv

Keyword(s): Markov Decision Processes, Value Function, Computational Cost, Technical Note, Decision Processes, Computational Burden, Value Function Iteration, Markov Decision, Specific Order, Standard Value

Markov decision processes are commonly used to model forward-looking behavior. However, cyclic terms, including seasonality, are often omitted from these models because of the increase in computational burden. This paper develops a cyclic value function iteration (CVFI), an adjustment to the standard value function iteration. By updating states in a specific order, CVFI allows cyclic variables to be included in the state space with no increase in the computational cost. This result is proved theoretically and shown to hold closely in Monte Carlo simulations.
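
One plausible reading of "updating states in a specific order" is a backward sweep over the cyclic variable, so that each period's update reuses the value just computed for the following period. The sketch below illustrates that ordering only; it is not taken from the paper, and the function name `cyclic_value_iteration`, the array layout, and the deterministic advance of the period index are assumptions made for the example.

```python
def cyclic_value_iteration(R, P, beta, tol=1e-10, max_sweeps=10_000):
    """Value iteration with a cyclic period t = 0..T-1 swept in backward order.

    R[t][a][x]    : reward for action a in state x during period t (hypothetical)
    P[t][a][x][y] : transition probability x -> y for the non-cyclic part
    The period is assumed to advance deterministically: t -> (t + 1) % T.
    """
    T, A, n = len(R), len(R[0]), len(R[0][0])
    V = [[0.0] * n for _ in range(T)]
    for sweep in range(1, max_sweeps + 1):
        change = 0.0
        # Backward order: period t reads the freshly updated values of period t + 1.
        for t in range(T - 1, -1, -1):
            nxt = V[(t + 1) % T]
            for x in range(n):
                best = max(
                    R[t][a][x] + beta * sum(P[t][a][x][y] * nxt[y] for y in range(n))
                    for a in range(A)
                )
                change = max(change, abs(best - V[t][x]))
                V[t][x] = best
        if change < tol:
            return V, sweep
    return V, max_sweeps


# Two seasons, one action, one state: reward 1 in season 0 and 0 in season 1.
R = [[[1.0]], [[0.0]]]
P = [[[[1.0]]], [[[1.0]]]]
V, sweeps = cyclic_value_iteration(R, P, beta=0.5)
print(V, sweeps)  # V is close to [[4/3], [2/3]]
```

Each sweep touches every (period, state) pair once, the same work as one sweep of standard value function iteration over the enlarged state space, while information propagates through the whole cycle within a single sweep.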
