Heuristic Search for Planning with Different Forced Goal-Ordering Constraints

2013 ◽  
Vol 2013 ◽  
pp. 1-11
Author(s):  
Jiangfeng Luo ◽  
Weiming Zhang ◽  
Jing Cui ◽  
Cheng Zhu ◽  
Jincai Huang ◽  
...  

Planning with forced goal-ordering (FGO) constraints has been proposed many times over the years, but realizing these FGOs during plan generation remains difficult. In certain planning domains, all the FGOs exist in the initial state: no matter which approach is adopted to achieve a subgoal, all the subgoals must be achieved in a given sequence from the initial state, or the planner may reach a deadlock. In other planning domains, there is no FGO in the initial state, but an FGO may arise during planning if a certain subgoal is achieved by an inappropriate approach. This paper shows that it is the excludable constraints among the goal achievement operations (GAOs) of different subgoals that introduce the FGOs into the planning problem, and that planning with FGOs is still a challenge for heuristic-search-based planners. A novel multistep forward search algorithm is then proposed that can solve planning problems with different FGOs efficiently.

2015 ◽  
Vol 54 ◽  
pp. 123-158
Author(s):  
Scott Kiesel ◽  
Ethan Burns ◽  
Wheeler Ruml

In real-time domains such as video games, planning happens concurrently with execution and the planning algorithm has a strictly bounded amount of time before it must return the next action for the agent to execute. We explore the use of real-time heuristic search in two benchmark domains inspired by video games. Unlike classic benchmarks such as grid pathfinding and the sliding tile puzzle, these new domains feature exogenous change and directed state space graphs. We consider the setting in which planning and acting are concurrent and we use the natural objective of minimizing goal achievement time. Using both the classic benchmarks and the new domains, we investigate several enhancements to a leading real-time search algorithm, LSS-LRTA*. We show experimentally that 1) it is better to plan after each action or to use a dynamically sized lookahead, 2) A*-based lookahead can cause undesirable actions to be selected, and 3) on-line de-biasing of the heuristic can lead to improved performance. We hope this work encourages future research on applying real-time search in dynamic domains.


This is a review of the descriptive research that has a supportive role (alongside the literature review) in describing the initial state of the strategic planning problem within the normative research model. There is a description of the two surveys concerning the planning process and plan usability and descriptions of the documentation and executive interviews that have been examined. There are also details of the findings from this research, with some data analysis and the important conclusions that relate to the main aim of the research.


2018 ◽  
Author(s):  
Elthon Manhas De Freitas ◽  
Karina Valdivia Delgado ◽  
Valdinei Freire

Markov decision processes (MDPs) have been used very efficiently to solve sequential decision-making problems. However, there are problems in which dealing with the risks of the environment to obtain a reliable result is more important than minimizing the total expected cost. MDPs that deal with this type of problem are called risk-sensitive Markov decision processes (RSMDPs). In this paper we propose an efficient heuristic search algorithm that obtains a solution by evaluating only the states relevant to reaching the goal states from an initial state.
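
As a rough illustration of the general idea of evaluating only relevant states, the sketch below runs plain A* on a small deterministic grid and records which states it expands. This is not the risk-sensitive algorithm from the abstract: the grid, the `astar` function, the `neighbors` helper, and the Manhattan-distance heuristic are all assumptions chosen for the example.

```python
import heapq

def astar(start, goal, neighbors, heuristic):
    """Plain A* search. Returns (cost, expanded), where expanded is the
    set of states that were actually popped and evaluated."""
    frontier = [(heuristic(start), 0, start)]  # entries are (f, g, state)
    best = {start: 0}                          # cheapest known cost per state
    expanded = set()
    while frontier:
        _, g, state = heapq.heappop(frontier)
        if state in expanded:
            continue                           # stale queue entry
        expanded.add(state)
        if state == goal:
            return g, expanded
        for nxt, step in neighbors(state):
            ng = g + step
            if ng < best.get(nxt, float("inf")):
                best[nxt] = ng
                heapq.heappush(frontier, (ng + heuristic(nxt), ng, nxt))
    return None, expanded

SIZE = 10            # hypothetical 10 x 10 grid with unit-cost moves
GOAL = (0, 9)

def neighbors(state):
    """4-connected moves that stay inside the grid."""
    r, c = state
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < SIZE and 0 <= nc < SIZE:
            yield (nr, nc), 1

def manhattan(state):
    """Admissible and consistent heuristic for unit-cost grid moves."""
    return abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])

cost, expanded = astar((0, 0), GOAL, neighbors, manhattan)
print(cost, len(expanded), "of", SIZE * SIZE, "states expanded")
```

In this toy setting the heuristic confines the search to the single row between the start and the goal, so most of the state space is never evaluated. Handling stochastic transitions and risk-sensitive objectives requires considerably more machinery than this sketch shows.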


Author(s):  
Felipe Trevizan ◽  
Sylvie Thiebaux ◽  
Pedro Santana ◽  
Brian Williams

We consider the problem of generating optimal stochastic policies for Constrained Stochastic Shortest Path problems, which are a natural model for planning under uncertainty for resource-bounded agents with multiple competing objectives. While unconstrained SSPs enjoy a multitude of efficient heuristic search solution methods with the ability to focus on promising areas reachable from the initial state, the state of the art for constrained SSPs revolves around linear and dynamic programming algorithms which explore the entire state space. In this paper, we present i-dual, the first heuristic search algorithm for constrained SSPs. To concisely represent constraints and efficiently decide their violation, i-dual operates in the space of dual variables describing the policy occupation measures. It does so while retaining the ability to use standard value function heuristics computed by well-known methods. Our experiments show that these features enable i-dual to achieve up to two orders of magnitude improvement in run-time and memory over linear programming algorithms.


2006 ◽  
Vol 27 ◽  
pp. 419-439 ◽  
Author(s):  
S. Hoelldobler ◽  
E. Karabaev ◽  
O. Skvortsova

We present a heuristic search algorithm for solving first-order Markov Decision Processes (FOMDPs). Our approach combines first-order state abstraction, which avoids evaluating states individually, with heuristic search, which avoids evaluating all states. Firstly, in contrast to existing systems, which start by propositionalizing the FOMDP and then perform state abstraction on its propositionalized version, we apply state abstraction directly to the FOMDP, avoiding propositionalization. This kind of abstraction is referred to as first-order state abstraction. Secondly, guided by an admissible heuristic, the search is restricted to those states that are reachable from the initial state. We demonstrate the usefulness of the above techniques for solving FOMDPs with a system, referred to as FluCaP (formerly, FCPlanner), that entered the probabilistic track of the 2004 International Planning Competition (IPC'2004) and demonstrated an advantage over other planners on the problems represented in first-order terms.


Author(s):  
Xi Chen ◽  
Yining Wang ◽  
Yuan Zhou

We study the dynamic assortment planning problem, where for each arriving customer, the seller offers an assortment of substitutable products and the customer chooses among the offered products according to an uncapacitated multinomial logit (MNL) model. Because all the utility parameters of the MNL model are unknown, the seller needs to simultaneously learn customers’ choice behavior and make dynamic decisions on assortments based on the current knowledge. The goal of the seller is to maximize the expected revenue, or, equivalently, to minimize the expected regret. Although the dynamic assortment planning problem has received increasing attention in revenue management, most existing policies require the estimation of mean utility for each product, and the final regret usually involves the number of products [Formula: see text]. The optimal regret of the dynamic assortment planning problem under the most basic and popular choice model, the MNL model, is still open. By carefully analyzing a revenue potential function, we develop a trisection-based policy combined with adaptive confidence bound construction, which achieves an item-independent regret bound of [Formula: see text], where [Formula: see text] is the length of the selling horizon. We further establish a matching lower bound to show the optimality of our policy. There are two major advantages of the proposed policy. First, the regret of all our policies has no dependence on [Formula: see text]. Second, our policies are almost assumption-free: there is no assumption on mean utility nor any “separability” condition on the expected revenues for different assortments. We also extend our trisection search algorithm to capacitated MNL models and obtain the optimal regret [Formula: see text] (up to logarithmic factors) without any assumption on the mean utility parameters of items.
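
For readers unfamiliar with trisection, the sketch below shows the basic deterministic version: locating the maximizer of a strictly unimodal function by repeatedly discarding a third of the search interval. It is only an illustration of the search primitive. The policy in the abstract works with noisy revenue observations and adaptive confidence bounds, none of which is modeled here, and the function name `trisection_max` and the quadratic test function are assumptions made for the example.

```python
def trisection_max(f, lo, hi, tol=1e-6):
    """Approximate the maximizer of a strictly unimodal f on [lo, hi].

    Each iteration evaluates f at the two interior trisection points and
    discards the outer third that cannot contain the maximizer.
    """
    while hi - lo > tol:
        third = (hi - lo) / 3
        m1 = lo + third
        m2 = hi - third
        if f(m1) < f(m2):
            lo = m1      # the maximizer cannot lie in [lo, m1)
        else:
            hi = m2      # the maximizer cannot lie in (m2, hi]
    return (lo + hi) / 2

# Hypothetical unimodal function with its peak at x = 2.
x_star = trisection_max(lambda x: -(x - 2.0) ** 2, 0.0, 5.0)
print(round(x_star, 4))
```

The interval shrinks by a factor of 2/3 per iteration, so the number of function evaluations grows only logarithmically in the desired precision.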

