Efficient Planning under Uncertainty with Macro-actions

2011 ◽  
Vol 40 ◽  
pp. 523-570 ◽  
Author(s):  
R. He ◽  
E. Brunskill ◽  
N. Roy

Deciding how to act in partially observable environments remains an active area of research. Identifying good sequences of decisions is particularly challenging when good control performance requires planning multiple steps into the future in domains with many states. Towards addressing this challenge, we present an online, forward-search algorithm called the Posterior Belief Distribution (PBD). PBD leverages a novel method for calculating the posterior distribution over beliefs that result after a sequence of actions is taken, given the set of observation sequences that could be received during this process. This method allows us to efficiently evaluate the expected reward of sequences of primitive actions, which we refer to as macro-actions. We present a formal analysis of our approach and examine its performance in two very large simulation experiments: a scientific exploration domain and a target monitoring domain. We also demonstrate our algorithm controlling a real robotic helicopter in a target monitoring experiment, which suggests that our approach has practical potential for planning in real-world, large partially observable domains where a multi-step lookahead is required to achieve good performance.
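As a rough illustration of the computation a forward search must perform, the sketch below evaluates the expected discounted reward of a fixed primitive-action sequence on a hypothetical two-state POMDP by enumerating every observation sequence. The T, Z, and R matrices are illustrative assumptions, not the paper's domains, and this exhaustive enumeration is exactly the exponential cost that PBD's posterior-over-beliefs method is designed to avoid.

```python
import numpy as np

# Hypothetical 2-state, 2-action, 2-observation POMDP (illustrative values).
T = np.array([[[0.9, 0.1], [0.2, 0.8]],   # T[a][s][s']: transition model
              [[0.5, 0.5], [0.4, 0.6]]])
Z = np.array([[[0.8, 0.2], [0.3, 0.7]],   # Z[a][s'][o]: observation model
              [[0.6, 0.4], [0.1, 0.9]]])
R = np.array([[1.0, 0.0], [0.0, 2.0]])    # R[a][s]: immediate reward

def belief_update(b, a, o):
    """Bayes filter: predict with T, correct with the observation likelihood Z."""
    pred = b @ T[a]                 # P(s' | b, a)
    post = pred * Z[a][:, o]        # unnormalised posterior
    return post / post.sum()

def macro_action_value(b, actions, gamma=0.95):
    """Expected discounted reward of a primitive-action sequence,
    enumerating all observation sequences (exact, but exponential)."""
    if not actions:
        return 0.0
    a, rest = actions[0], actions[1:]
    value = float(b @ R[a])         # expected immediate reward under b
    pred = b @ T[a]
    for o in range(Z.shape[2]):
        p_o = float(pred @ Z[a][:, o])
        if p_o > 0:
            value += gamma * p_o * macro_action_value(belief_update(b, a, o), rest, gamma)
    return value

b0 = np.array([0.5, 0.5])
print(macro_action_value(b0, [0, 1, 0]))
```

The branching over observations is what makes naive lookahead infeasible at depth; representing the distribution over posterior beliefs analytically, as PBD does, sidesteps this enumeration.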

Author(s):  
Roman Andriushchenko ◽  
Milan Češka ◽  
Sebastian Junges ◽  
Joost-Pieter Katoen

This paper presents a novel method for the automated synthesis of probabilistic programs. The starting point is a program sketch representing a finite family of finite-state Markov chains with related but distinct topologies, and a reachability specification. The method builds on a novel inductive oracle that greedily generates counter-examples (CEs) for violating programs and uses them to prune the family. These CEs leverage the semantics of the family in the form of bounds on its best- and worst-case behaviour provided by a deductive oracle using an MDP abstraction. The method further monitors the performance of the synthesis and adaptively switches between inductive and deductive reasoning. Our experiments demonstrate that the novel CE construction provides a significantly faster and more effective pruning strategy leading to an accelerated synthesis process on a wide range of benchmarks. For challenging problems, such as the synthesis of decentralized partially-observable controllers, we reduce the run-time from a day to minutes.


2016 ◽  
Vol 64 (12) ◽  
Author(s):  
Sven Bodenburg ◽  
Jan Lunze

This paper proposes a novel method to organise the reconfiguration process of decentralised controllers after actuator failures have occurred in an interconnected system. If an actuator fails in a subsystem, only the corresponding control station should be reconfigured, although the fault has effects on other subsystems through the physical couplings. The focus of this paper is on the organisation of the reconfiguration process without a central coordinator. A design agent exists for each subsystem and stores the subsystem model. A local algorithm is presented to gather models from neighbouring design agents with the aim of setting up a model which describes the behaviour of the faulty subsystem including its neighbours. Furthermore, local reconfiguration conditions are proposed to design a virtual actuator so as to guarantee stability of the overall system. As a consequence, the design agents “play” together to gather the model of the faulty subsystem before the reconfigured control station is “plugged into” the control hardware. Plug-and-play reconfiguration is illustrated by an interconnected tank system.


Author(s):  
Cong Chen ◽  
Changhe Yuan

Much effort has been directed at developing algorithms for learning optimal Bayesian network structures from data. When given limited or noisy data, however, the optimal Bayesian network often fails to capture the true underlying network structure. One can potentially address the problem by finding the multiple most likely Bayesian networks (K-Best) in the hope that one of them recovers the true model. However, it is often the case that some of the best models come from the same peak(s) and are very similar to each other, so they tend to fail together. Moreover, many of these models are not even optimal with respect to any causal ordering, and are thus unlikely to be useful. This paper proposes a novel method for finding a set of diverse top Bayesian networks, called modes, such that each network is guaranteed to be optimal in a local neighborhood. Such mode networks are expected to provide much better coverage of the true model. Based on a global-local theorem showing that a mode Bayesian network must be optimal in all local scopes, we introduce an A* search algorithm to efficiently find the top M Bayesian networks, which are highly probable and naturally diverse. Empirical evaluations show that our top mode models have much better diversity as well as accuracy in discovering the true underlying models than those found by K-Best.
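The local-optimality notion behind a mode can be made concrete: a network is a mode if no single edge addition, deletion, or reversal yields a strictly better-scoring DAG. The sketch below checks this for a toy score function, which is an assumption for illustration only, not a real Bayesian network score such as BDeu or BIC.

```python
import itertools

def is_acyclic(adj, n):
    """Kahn's algorithm over an edge set `adj` on nodes 0..n-1."""
    indeg = [0] * n
    for u, v in adj:
        indeg[v] += 1
    queue = [v for v in range(n) if indeg[v] == 0]
    seen = 0
    while queue:
        u = queue.pop()
        seen += 1
        for a, b in adj:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    queue.append(b)
    return seen == n

def edge_neighbors(adj, n):
    """Candidate structures one edit away: add, delete, or reverse one
    edge. Acyclicity is checked by the caller."""
    for u, v in itertools.permutations(range(n), 2):
        e = (u, v)
        if e in adj:
            yield adj - {e}                      # delete
            yield (adj - {e}) | {(v, u)}         # reverse
        else:
            yield adj | {e}                      # add

def is_mode(adj, n, score):
    """True iff no single-edge edit produces a strictly better DAG."""
    s = score(adj)
    return all(score(nb) <= s
               for nb in edge_neighbors(adj, n) if is_acyclic(nb, n))

# Toy score (an assumption, not a real structure score): reward the
# edge (0, 1) and penalise edge count.
score = lambda adj: (1 if (0, 1) in adj else 0) - 0.1 * len(adj)
print(is_mode(frozenset({(0, 1)}), 2, score))   # no single edit improves it
```

In the paper's setting, the global-local theorem lets this neighborhood check be decomposed into local scopes rather than enumerated naively as here.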


2014 ◽  
Vol 644-650 ◽  
pp. 2169-2172
Author(s):  
Zhi Kong ◽  
Guo Dong Zhang ◽  
Li Fu Wang

This paper develops an improved novel global harmony search (INGHS) algorithm for solving optimization problems. INGHS employs a novel method for generating new solution vectors that enhances the accuracy and convergence rate of the novel global harmony search (NGHS) algorithm. Simulations on five benchmark test functions show that INGHS has a better ability to find the global optimum than the harmony search (HS) algorithm. Compared with NGHS and HS, INGHS is also better in terms of robustness and efficiency.
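The abstract does not spell out the INGHS update rule, so the sketch below shows only the baseline harmony search loop that NGHS and INGHS build on: a memory of candidate solutions, recombination governed by the harmony memory considering rate (hmcr), and occasional pitch adjustment (par). All parameter values here are illustrative assumptions.

```python
import random

def harmony_search(f, dim, bounds, hms=10, hmcr=0.9, par=0.3, iters=2000, seed=0):
    """Minimal harmony search (minimisation): build new harmonies by
    recombining memory entries with occasional pitch adjustment, and
    replace the worst memory entry whenever the new harmony improves on it."""
    rng = random.Random(seed)
    lo, hi = bounds
    memory = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(hms)]
    for _ in range(iters):
        new = []
        for d in range(dim):
            if rng.random() < hmcr:              # consider the memory
                x = rng.choice(memory)[d]
                if rng.random() < par:           # pitch adjustment
                    x += rng.uniform(-1, 1) * 0.01 * (hi - lo)
            else:                                # random exploration
                x = rng.uniform(lo, hi)
            new.append(min(hi, max(lo, x)))
        worst = max(memory, key=f)
        if f(new) < f(worst):
            memory[memory.index(worst)] = new
    return min(memory, key=f)

sphere = lambda x: sum(v * v for v in x)
best = harmony_search(sphere, dim=5, bounds=(-10, 10))
print(sphere(best))
```

NGHS and INGHS replace the memory-consideration step with position-updating rules biased toward the best harmony; the surrounding loop structure is the same.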


Energies ◽  
2018 ◽  
Vol 11 (6) ◽  
pp. 1328 ◽  
Author(s):  
Thang Nguyen ◽  
Dieu Vo ◽  
Nguyen Vu Quynh ◽  
Le Van Dai

Sensors ◽  
2021 ◽  
Vol 22 (1) ◽  
pp. 71
Author(s):  
Zhiyu Xia ◽  
Zhengyi Xu ◽  
Dan Li ◽  
Jianming Wei

Chemical industrial parks, which act as critical infrastructures in many cities, need to be responsive to chemical gas leakage accidents. Once a chemical gas leakage accident occurs, risks of poisoning, fire, and explosion follow. To meet the primary emergency response demands in chemical gas leakage accidents, source tracking technology for chemical gas leakage has been proposed and has evolved. This paper proposes a novel method, the Outlier Mutation Optimization (OMO) algorithm, that aims to quickly and accurately track the source of a chemical gas leak. The OMO algorithm introduces a random walk exploration mode and, building on Swarm Intelligence (SI), increases the probability of individual mutation. Compared with other optimization algorithms, the OMO algorithm has the advantages of a wider exploration range and more convergence modes. In the algorithm test session, a series of chemical gas leakage accident application examples with random parameters is first constructed from the Gaussian plume model; qualitative experiments and analysis of the OMO algorithm are then conducted on these examples. The test results show that the OMO algorithm with default parameters has superior overall performance, including extremely high average calculation accuracy: the optimal value, which represents the error between the final objective function value obtained by the optimization algorithm and the ideal value, reaches 2.464e-15 with 16 sensors, 2.356e-13 with 9 sensors, and 5.694e-23 with 4 sensors. The calculation time is also satisfactory: 12.743 s per 50 runs with 16 sensors, 10.304 s per 50 runs with 9 sensors, and 8.644 s per 50 runs with 4 sensors. An analysis of the OMO algorithm's characteristic parameters demonstrates the flexibility and robustness of the method. In addition, compared with other algorithms, the OMO algorithm obtains excellent leakage source tracing results in the application examples with 16, 9, and 4 sensors, and its accuracy exceeds that of the direct search algorithm, the evolutionary algorithm, and other swarm intelligence algorithms.
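A minimal sketch of the setup the test session describes, assuming a simplified ground-level Gaussian plume with fixed dispersion coefficients (in practice they grow downwind, and the paper's exact model and objective are not given here): sensor readings are generated from a hypothetical source, and the objective an optimizer such as OMO would minimize is the squared error between modelled and measured concentrations.

```python
import math

def plume_concentration(q, xs, ys, x, y, u=2.0, sy=5.0, sz=3.0, h=1.0):
    """Simplified ground-level Gaussian plume concentration at (x, y) for a
    source of strength q at (xs, ys); wind blows along +x at speed u.
    sy, sz are held fixed here for illustration; h is the release height."""
    dx, dy = x - xs, y - ys
    if dx <= 0:
        return 0.0                      # nothing upwind of the source
    return (q / (2 * math.pi * u * sy * sz)
            * math.exp(-dy * dy / (2 * sy * sy))
            * math.exp(-h * h / (2 * sz * sz)))

def tracing_objective(params, sensors):
    """Sum of squared errors between modelled and measured concentrations;
    a source-tracing optimizer would minimise this over (q, xs, ys)."""
    q, xs, ys = params
    return sum((plume_concentration(q, xs, ys, x, y) - c) ** 2
               for x, y, c in sensors)

# Hypothetical true source and synthetic, noise-free sensor readings.
true = (50.0, 0.0, 0.0)
sensors = [(x, y, plume_concentration(true[0], true[1], true[2], x, y))
           for x, y in [(10, 0), (20, 5), (30, -5), (40, 10)]]
print(tracing_objective(true, sensors))   # exactly zero at the true source
```

The "optimal value" figures quoted in the abstract measure how close an optimizer gets this objective to its ideal value of zero.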


Author(s):  
Karel Horák ◽  
Branislav Bošanský ◽  
Krishnendu Chatterjee

Partially observable Markov decision processes (POMDPs) are the standard models for planning under uncertainty with both finite and infinite horizons. Besides the well-known discounted-sum objective, the indefinite-horizon objective (aka Goal-POMDPs) is another classical objective for POMDPs. In this case, given a set of target states and a positive cost for each transition, the optimization objective is to minimize the expected total cost until a target state is reached. In the literature, RTDP-Bel or heuristic search value iteration (HSVI) have been used for solving Goal-POMDPs. Neither of these algorithms has theoretical convergence guarantees, and HSVI may even fail to terminate its trials. We give the following contributions: (1) we discuss the challenges introduced in Goal-POMDPs and illustrate how they prevent the original HSVI from converging; (2) we present a novel algorithm inspired by HSVI, termed Goal-HSVI, and show that our algorithm has convergence guarantees; and (3) we show that Goal-HSVI outperforms RTDP-Bel on a set of well-known examples.


2018 ◽  
Vol 38 (2-3) ◽  
pp. 162-181 ◽  
Author(s):  
Yuanfu Luo ◽  
Haoyu Bai ◽  
David Hsu ◽  
Wee Sun Lee

The partially observable Markov decision process (POMDP) provides a principled general framework for robot planning under uncertainty. Leveraging the idea of Monte Carlo sampling, recent POMDP planning algorithms have scaled up to various challenging robotic tasks, including real-time online planning for autonomous vehicles. To further improve online planning performance, this paper presents IS-DESPOT, which introduces importance sampling to DESPOT, a state-of-the-art sampling-based POMDP algorithm for planning under uncertainty. Importance sampling improves DESPOT’s performance when there are critical but rare events, which are difficult to sample. We prove that IS-DESPOT retains the theoretical guarantee of DESPOT. We demonstrate empirically that importance sampling significantly improves the performance of online POMDP planning for suitable tasks. We also present a general method for learning the importance sampling distribution.
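The benefit of importance sampling for rare critical events can be shown in miniature, independent of DESPOT itself: sample a hypothetical rare event at an inflated proposal probability and reweight each sample by the likelihood ratio, so the estimate stays unbiased while the rare outcome is hit far more often. The collision probability, costs, and proposal below are all invented for illustration.

```python
import random

def mc_estimate(reward, p_event, n, rng):
    """Plain Monte Carlo: sample the rare event at its true probability,
    so most runs never observe it."""
    return sum(reward(rng.random() < p_event) for _ in range(n)) / n

def is_estimate(reward, p_event, q_event, n, rng):
    """Importance sampling: sample the event at inflated probability
    q_event, then reweight by the likelihood ratio p/q (or the
    complementary ratio on a miss) to keep the estimate unbiased."""
    total = 0.0
    for _ in range(n):
        hit = rng.random() < q_event
        w = (p_event / q_event) if hit else (1 - p_event) / (1 - q_event)
        total += w * reward(hit)
    return total / n

# Hypothetical task: a rare collision (p = 0.001) costs -1000, else +1,
# so the true expected reward is 0.001 * -1000 + 0.999 * 1 = -0.001.
reward = lambda collided: -1000.0 if collided else 1.0
rng = random.Random(0)
print(mc_estimate(reward, 0.001, 20000, rng))
print(is_estimate(reward, 0.001, 0.5, 20000, rng))
```

The plain estimate's variance is dominated by how many collisions happen to be sampled; the weighted estimate concentrates tightly around the true value, which is the effect IS-DESPOT exploits inside its search tree.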


2011 ◽  
Vol 40 ◽  
pp. 95-142 ◽  
Author(s):  
J. Veness ◽  
K.S. Ng ◽  
M. Hutter ◽  
W. Uther ◽  
D. Silver

This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. Our approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a new Monte-Carlo Tree Search algorithm along with an agent-specific extension to the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a variety of stochastic and partially observable domains. We conclude by proposing a number of directions for future research.

