FluCaP: A Heuristic Search Planner for First-Order MDPs

2006 ◽  
Vol 27 ◽  
pp. 419-439 ◽  
Author(s):  
S. Hoelldobler ◽  
E. Karabaev ◽  
O. Skvortsova

We present a heuristic search algorithm for solving first-order Markov Decision Processes (FOMDPs). Our approach combines first-order state abstraction, which avoids evaluating states individually, with heuristic search, which avoids evaluating all states. First, in contrast to existing systems, which propositionalize the FOMDP and then perform state abstraction on the propositionalized version, we apply state abstraction directly to the FOMDP, avoiding propositionalization altogether. This kind of abstraction is referred to as first-order state abstraction. Second, guided by an admissible heuristic, the search is restricted to those states that are reachable from the initial state. We demonstrate the usefulness of these techniques in a system called FluCaP (formerly FCPlanner), which entered the probabilistic track of the 2004 International Planning Competition (IPC'2004) and demonstrated an advantage over other planners on problems represented in first-order terms.
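The first-order abstraction itself operates on relational representations, but the second ingredient, admissible-heuristic search restricted to states reachable from the initial state, can be illustrated on a small propositional MDP with an RTDP-style trial loop. This is a hedged sketch: the MDP, its costs, and the heuristic below are invented for illustration and are not FluCaP's machinery.

```python
import math
import random

# Invented toy goal-directed MDP: states 0..4, goal state 4.
# Action "fast" (cost 1) advances two steps with prob 0.5, else stays;
# action "slow" (cost 2) advances one step for sure.
GOAL = 4
T = {s: {"fast": [(0.5, min(s + 2, GOAL)), (0.5, s)],
         "slow": [(1.0, s + 1)]} for s in range(GOAL)}
C = {"fast": 1.0, "slow": 2.0}

def h(s):
    # Admissible heuristic: at least ceil((GOAL - s) / 2) steps of cost >= 1 remain.
    return math.ceil((GOAL - s) / 2)

V = {s: float(h(s)) for s in range(GOAL + 1)}  # initialized to a lower bound

def q(s, a):
    return C[a] + sum(p * V[t] for p, t in T[s][a])

def rtdp_trial(s0, rng):
    # One trial: follow the greedy policy from s0, backing up only the
    # states actually visited -- i.e. states reachable from the initial state.
    s = s0
    while s != GOAL:
        a = min(T[s], key=lambda act: q(s, act))
        V[s] = q(s, a)                  # Bellman backup on the visited state
        r = rng.random()                # sample a successor
        for p, t in T[s][a]:
            r -= p
            if r <= 0:
                s = t
                break

rng = random.Random(0)
for _ in range(200):
    rtdp_trial(0, rng)
print(round(V[0], 2))  # converges toward the optimal expected cost from state 0
```

Because the heuristic is a lower bound, the value estimates increase monotonically toward the optimum while states unreachable from state 0 are never touched.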

2018 ◽  
Author(s):  
Elthon Manhas De Freitas ◽  
Karina Valdivia Delgado ◽  
Valdinei Freire

The Markov Decision Process (MDP) framework has been used very effectively to solve sequential decision-making problems. However, in some problems, accounting for the risks of the environment in order to obtain a reliable result matters more than minimizing the total expected cost. MDPs that deal with this type of problem are called risk-sensitive Markov decision processes (RSMDPs). In this paper we propose an efficient heuristic search algorithm that obtains a solution by evaluating only the states relevant to reaching the goal states from an initial state.
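A common way to formalize risk sensitivity in the RSMDP literature is the exponential-utility criterion; the abstract does not fix a criterion, so the following is only an assumed illustration of why the risk-sensitive objective differs from the expected-cost one. Both invented actions reach the goal with the same expected cost, yet a risk-averse evaluation strictly prefers the deterministic one.

```python
import math

# Invented one-step example: two actions with equal expected cost
# but different risk profiles.
actions = {
    "safe":  [(1.0, 10.0)],              # cost 10 for sure
    "risky": [(0.5, 0.0), (0.5, 20.0)],  # cost 0 or 20, each w.p. 0.5
}

def expected_cost(a):
    return sum(p * c for p, c in actions[a])

def certainty_equivalent(a, lam):
    # Exponential-utility evaluation: (1/lam) * ln E[exp(lam * cost)].
    # lam > 0 encodes risk aversion; lam -> 0 recovers the expected cost.
    return math.log(sum(p * math.exp(lam * c) for p, c in actions[a])) / lam

print(expected_cost("safe"), expected_cost("risky"))  # equal: 10.0 10.0
lam = 0.1
print(certainty_equivalent("risky", lam) > certainty_equivalent("safe", lam))  # True
```

A risk-neutral planner is indifferent between the two actions; under the exponential criterion the risky action's certainty equivalent exceeds 10, so it is avoided.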


2005 ◽  
Vol 24 ◽  
pp. 933-944 ◽  
Author(s):  
B. Bonet ◽  
H. Geffner

We describe the version of the GPT planner used in the probabilistic track of the 4th International Planning Competition (IPC-4). This version, called mGPT, solves Markov Decision Processes specified in the PPDDL language by extracting and using different classes of lower bounds along with various heuristic-search algorithms. The lower bounds are extracted from deterministic relaxations in which the alternative probabilistic effects of an action are mapped into different, independent deterministic actions. The heuristic-search algorithms use these lower bounds to focus the updates and deliver a value function that is consistent over all states reachable from the initial state under the greedy policy.
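The relaxation described above can be sketched concretely: every probabilistic outcome of an action becomes its own deterministic action (an "all-outcomes" determinization), and shortest-path costs in the determinized graph are admissible lower bounds on the expected cost-to-go. A minimal illustration on an invented toy MDP, not mGPT's PPDDL machinery:

```python
import heapq

# Invented toy MDP: MDP[s] = {action: (cost, [(prob, next_state), ...])}.
MDP = {
    "s0": {"a": (1.0, [(0.7, "s1"), (0.3, "s2")])},
    "s1": {"b": (2.0, [(1.0, "goal")])},
    "s2": {"c": (5.0, [(0.6, "goal"), (0.4, "s0")])},
    "goal": {},
}

def determinize(mdp):
    # All-outcomes determinization: each probabilistic outcome becomes
    # an independent deterministic edge with the action's cost.
    edges = {s: [] for s in mdp}
    for s, acts in mdp.items():
        for a, (cost, outcomes) in acts.items():
            for _, t in outcomes:
                edges[s].append((cost, t))
    return edges

def hmin(edges, goal):
    # Backward Dijkstra from the goal: dist[s] is a lower bound on the
    # expected cost-to-go from s in the original MDP.
    rev = {s: [] for s in edges}
    for s, outs in edges.items():
        for c, t in outs:
            rev[t].append((c, s))
    dist = {s: float("inf") for s in edges}
    dist[goal] = 0.0
    pq = [(0.0, goal)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for c, v in rev[u]:
            if d + c < dist[v]:
                dist[v] = d + c
                heapq.heappush(pq, (d + c, v))
    return dist

h = hmin(determinize(MDP), "goal")
print(h["s0"], h["s1"], h["s2"])  # 3.0 2.0 5.0
```

The bound is admissible because the relaxation lets the planner pick the most convenient outcome of each action, so the relaxed plan can never cost more than the true expected cost.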


Author(s):  
Felipe Trevizan ◽  
Sylvie Thiebaux ◽  
Pedro Santana ◽  
Brian Williams

We consider the problem of generating optimal stochastic policies for Constrained Stochastic Shortest Path (C-SSP) problems, which are a natural model for planning under uncertainty for resource-bounded agents with multiple competing objectives. While unconstrained SSPs enjoy a multitude of efficient heuristic search solution methods able to focus on promising areas reachable from the initial state, the state of the art for constrained SSPs revolves around linear and dynamic programming algorithms that explore the entire state space. In this paper, we present i-dual, the first heuristic search algorithm for constrained SSPs. To concisely represent constraints and efficiently detect their violation, i-dual operates in the space of dual variables describing the policy occupation measures, while retaining the ability to use standard value function heuristics computed by well-known methods. Our experiments show that these features enable i-dual to achieve up to two orders of magnitude improvement in run time and memory over linear programming algorithms.
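For background, the occupation-measure formulation referenced above can be written as the standard dual LP for a C-SSP; this is the common textbook form in assumed notation (cost c, secondary costs d_j with bounds u_j, goal set G), not i-dual itself, which explores this LP incrementally via heuristic search rather than over the full state space:

```latex
% x_{s,a}: expected number of times action a is executed in state s.
\begin{align*}
\min_{x \ge 0} \quad & \sum_{s,a} x_{s,a}\, c(s,a) \\
\text{s.t.} \quad & \sum_{a} x_{s,a} \;-\; \sum_{s',a} P(s \mid s',a)\, x_{s',a}
    \;=\; \mathbb{1}[s = s_0] \qquad \forall s \notin G \\
& \sum_{s,a} x_{s,a}\, d_j(s,a) \;\le\; u_j \qquad \forall j
\end{align*}
```

A feasible solution yields a stochastic policy directly, via $\pi(a \mid s) = x_{s,a} / \sum_{a'} x_{s,a'}$, which is why constraint violation can be checked concisely in this space.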


Author(s):  
Bar Light

In multiperiod stochastic optimization problems, the future optimal decision is a random variable whose distribution depends on the parameters of the optimization problem. I analyze how the expected value of this random variable changes as a function of the dynamic optimization parameters in the context of Markov decision processes. I call this analysis stochastic comparative statics. I derive both comparative statics results and stochastic comparative statics results showing how the current and future optimal decisions change in response to changes in the single-period payoff function, the discount factor, the initial state of the system, and the transition probability function. I apply my results to various models from the economics and operations research literature, including investment theory, dynamic pricing models, controlled random walks, and comparisons of stationary distributions.
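As a minimal numerical illustration of a comparative statics result in the discount factor, consider an invented two-period savings problem (not one of the paper's models): with log utility, the optimal current consumption is $w/(1+\beta)$, so the current optimal decision is monotone decreasing in the discount factor.

```python
import math

# Invented example: choose consumption c in (0, w) to maximize
# log(c) + beta * log((w - c) * R); closed form c* = w / (1 + beta).
def best_consumption(w, beta, R, grid=10_000):
    cs = [w * (i + 1) / (grid + 2) for i in range(grid)]
    return max(cs, key=lambda c: math.log(c) + beta * math.log((w - c) * R))

w, R = 10.0, 1.05
c_impatient = best_consumption(w, beta=0.5, R=R)
c_patient = best_consumption(w, beta=0.95, R=R)
print(c_impatient > c_patient)  # True: a more patient agent consumes less today
```

The grid search recovers the closed-form optimum to within the grid spacing, confirming the monotone response of the decision to the parameter.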

