Monte Carlo Sampling Methods for Approximating Interactive POMDPs

2009 ◽  
Vol 34 ◽  
pp. 297-337 ◽  
Author(s):  
P. Doshi ◽  
P. J. Gmytrasiewicz

Partially observable Markov decision processes (POMDPs) provide a principled framework for sequential planning in uncertain single-agent settings. An extension of POMDPs to multiagent settings, called interactive POMDPs (I-POMDPs), replaces POMDP belief spaces with interactive hierarchical belief systems which represent an agent’s belief about the physical world, about beliefs of other agents, and about their beliefs about others’ beliefs. This modification makes the difficulties of obtaining solutions due to complexity of the belief and policy spaces even more acute. We describe a general method for obtaining approximate solutions of I-POMDPs based on particle filtering (PF). We introduce the interactive PF, which descends the levels of the interactive belief hierarchies and samples and propagates beliefs at each level. The interactive PF is able to mitigate the belief space complexity, but it does not address the policy space complexity. To mitigate the policy space complexity – sometimes also called the curse of history – we utilize a complementary method based on sampling likely observations while building the look-ahead reachability tree. While this approach does not completely address the curse of history, it beats back the curse’s impact substantially. We provide experimental results and chart future work.
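The single-agent particle-filtering step that the interactive PF nests at each level of the belief hierarchy can be sketched as a minimal bootstrap filter (propagate, weight, resample). This is an illustration only, not the paper's interactive PF; the scalar state, Gaussian transition, and observation model are hypothetical:

```python
import math
import random

def bootstrap_pf(particles, observation, transition_std=0.5, obs_std=1.0):
    """One bootstrap particle-filter step: propagate, weight, resample."""
    # Propagate each particle through the (assumed) random-walk transition.
    proposed = [x + random.gauss(0.0, transition_std) for x in particles]
    # Weight each proposal by the Gaussian observation likelihood.
    weights = [math.exp(-0.5 * ((observation - x) / obs_std) ** 2) for x in proposed]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Resample with replacement, proportional to the weights.
    return random.choices(proposed, weights=weights, k=len(particles))

random.seed(0)
particles = [random.uniform(-5.0, 5.0) for _ in range(500)]
for obs in [1.0, 1.2, 0.9]:
    particles = bootstrap_pf(particles, obs)
estimate = sum(particles) / len(particles)  # posterior mean near the observations
```

The interactive PF repeats this recursion one level down: each particle at level *l* carries its own particle set over the other agent's level *l-1* beliefs.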

Author(s):  
Daxue Liu ◽  
Jun Wu ◽  
Xin Xu

Multi-agent reinforcement learning (MARL) provides a useful and flexible framework for multi-agent coordination in uncertain dynamic environments. However, the generalization ability and scalability of algorithms to large problem sizes, already problematic in single-agent RL, is an even more formidable obstacle in MARL applications. In this paper, a new MARL method based on ordinal action selection and approximate policy iteration, called OAPI (Ordinal Approximate Policy Iteration), is presented to address the scalability issue of MARL algorithms in common-interest Markov Games. In OAPI, an ordinal action selection and learning strategy is integrated with distributed approximate policy iteration not only to simplify the policy space and eliminate the conflicts in multi-agent coordination, but also to realize the approximation of near-optimal policies for Markov Games with large state spaces. Based on the simplified policy space using ordinal action selection, the OAPI algorithm implements distributed approximate policy iteration utilizing online least-squares policy iteration (LSPI). This results in multi-agent coordination with good convergence properties and reduced computational complexity. The simulation results of a coordinated multi-robot navigation task illustrate the feasibility and effectiveness of the proposed approach.
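The policy-evaluation core of LSPI is least-squares temporal difference learning (LSTD), which fits value weights by solving a linear system built from sampled transitions. A minimal sketch on a hypothetical two-state chain with one-hot features (illustrative only, not the OAPI algorithm itself):

```python
def lstd_policy_evaluation(samples, gamma=0.9):
    """LSTD on a 2-state chain with one-hot features: accumulate A and b
    from (state, reward, next_state) samples, then solve A w = b."""
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for s, r, s_next in samples:
        phi = [1.0 if i == s else 0.0 for i in range(2)]
        phi_next = [1.0 if i == s_next else 0.0 for i in range(2)]
        for i in range(2):
            for j in range(2):
                A[i][j] += phi[i] * (phi[j] - gamma * phi_next[j])
            b[i] += phi[i] * r
    # Solve the 2x2 system by Cramer's rule.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    w0 = (b[0] * A[1][1] - b[1] * A[0][1]) / det
    w1 = (b[1] * A[0][0] - b[0] * A[1][0]) / det
    return [w0, w1]

# Deterministic chain: state 0 yields reward 1 and moves to 1; state 1 yields 0, back to 0.
values = lstd_policy_evaluation([(0, 1.0, 1), (1, 0.0, 0)])
# values recovers V(0) = 1 / (1 - 0.81) and V(1) = 0.9 * V(0).
```

With one-hot features LSTD reduces to exact tabular policy evaluation; LSPI alternates this evaluation with greedy policy improvement.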


Author(s):  
J. D. Annan ◽  
J. C. Hargreaves

In this paper, we review progress towards efficiently estimating parameters in climate models. Since the general problem is inherently intractable, a range of approximations and heuristic methods have been proposed. Simple Monte Carlo sampling methods, although easy to implement and very flexible, are rather inefficient, making implementation possible only in the very simplest models. More sophisticated methods based on random walks and gradient-descent methods can provide more efficient solutions, but it is often unclear how to extract probabilistic information from such methods and the computational costs are still generally too high for their application to state-of-the-art general circulation models (GCMs). The ensemble Kalman filter is an efficient Monte Carlo approximation which is optimal for linear problems, but we show here how its accuracy can degrade in nonlinear applications. Methods based on particle filtering may provide a solution to this problem but have yet to be studied in any detail in the realm of climate models. Statistical emulators show great promise for future research and their computational speed would eliminate much of the need for efficient sampling techniques. However, emulation of a full GCM has yet to be achieved and the construction of such represents a substantial computational task in itself.
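The ensemble Kalman filter's analysis step can be sketched for a scalar parameter with a perturbed-observation update: the Kalman gain is estimated from ensemble covariances, which is exact for linear observation operators (the source of the nonlinear degradation noted above). The parameter, observation operator, and noise levels here are hypothetical:

```python
import random

def enkf_update(ensemble, y_obs, h, obs_var=0.1):
    """One EnKF analysis step for a scalar parameter (linear-Gaussian sketch)."""
    n = len(ensemble)
    preds = [h(t) for t in ensemble]
    t_mean = sum(ensemble) / n
    p_mean = sum(preds) / n
    # Sample covariance between parameter and prediction, and prediction variance.
    cov_tp = sum((t - t_mean) * (p - p_mean) for t, p in zip(ensemble, preds)) / (n - 1)
    var_p = sum((p - p_mean) ** 2 for p in preds) / (n - 1)
    gain = cov_tp / (var_p + obs_var)
    # Perturbed-observation update: each member assimilates a jittered observation.
    return [t + gain * (y_obs + random.gauss(0.0, obs_var ** 0.5) - p)
            for t, p in zip(ensemble, preds)]

random.seed(1)
prior = [random.gauss(0.0, 2.0) for _ in range(200)]          # diffuse prior ensemble
posterior = enkf_update(prior, y_obs=6.0, h=lambda t: 2.0 * t)  # true parameter is 3
post_mean = sum(posterior) / len(posterior)
```

When `h` is nonlinear, the gain estimated from these linear covariances is no longer optimal, which is the accuracy degradation discussed in the abstract.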


2007 ◽  
Author(s):  
John T. O. Kirk

How did the cosmos, and our own special part of it, come to be? How did life emerge and how did we arise within it? What can we say about the essential nature of the physical world? What can be said about the physical basis of consciousness? What can science tell or not tell us about the nature and origin of physical and biological reality? Science and Certainty clears away the many misunderstandings surrounding these questions. The book addresses why certain areas of science cause concern to many people today – in particular, those which seem to have implications for the meaning of human existence, and for our significance on this planet and in the universe as a whole. It also examines the tension that can exist between scientific and religious belief systems. Science and Certainty offers an account of what science does, in fact, ask us to believe about the most fundamental aspects of reality and, therefore, the implications of accepting the scientific world view. The author also includes a historical and philosophical background to a number of environmental issues and argues that it is only through science that we can hope to solve these problems. This book will appeal to popular science readers, those with an interest in the environment and the implications of science for the meaning of human existence, as well as students of environmental studies, philosophy, ethics and theology.


Author(s):  
Gregory Bartram ◽  
Sankaran Mahadevan

This paper proposes a methodology for probabilistic prognosis of a system using a dynamic Bayesian network (DBN). Dynamic Bayesian networks are suitable for probabilistic prognosis because of their ability to integrate information in a variety of formats from various sources and give a probabilistic representation of the system state. Further, DBNs provide a platform naturally suited for seamless integration of diagnosis, uncertainty quantification, and prediction. In the proposed methodology, a DBN is used for online diagnosis via particle filtering, providing a current estimate of the joint distribution over the system variables. The information available in the state estimate also helps to quantify the uncertainty in diagnosis. Next, based on this probabilistic state estimate, future states of the system are predicted using the DBN and sequential or recursive Monte Carlo sampling. Prediction in this manner provides the necessary information to estimate the distribution of remaining useful life (RUL). The prognosis procedure, which is system specific, is validated using a suite of offline hierarchical metrics. The prognosis methodology is demonstrated on a hydraulic actuator subject to a progressive seal wear that results in internal leakage between the chambers of the actuator.
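The forward Monte Carlo step that turns a diagnosed state estimate into an RUL distribution can be sketched as follows: each particle from the current state estimate is propagated through an (assumed) stochastic degradation model until it crosses a failure threshold. The wear model, threshold, and diagnosis output below are hypothetical, not the paper's actuator model:

```python
import random

def sample_rul(state_particles, growth_mean=0.1, growth_std=0.02,
               failure_threshold=1.0, max_steps=100):
    """Forward Monte Carlo: propagate each wear particle until the failure
    threshold is crossed; the step counts form the RUL distribution."""
    ruls = []
    for wear in state_particles:
        steps = 0
        while wear < failure_threshold and steps < max_steps:
            # Assumed degradation model: monotone stochastic wear growth.
            wear += max(0.0, random.gauss(growth_mean, growth_std))
            steps += 1
        ruls.append(steps)
    return ruls

random.seed(2)
# Hypothetical diagnosis output: current wear estimated around 0.5 of threshold.
particles = [random.gauss(0.5, 0.05) for _ in range(300)]
ruls = sample_rul(particles)
mean_rul = sum(ruls) / len(ruls)  # roughly (1.0 - 0.5) / 0.1 = 5 steps
```

Because the whole particle set is propagated, the output is a full RUL distribution, so confidence bounds come for free rather than only a point estimate.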


2019 ◽  
Vol 48 (4) ◽  
pp. 335-350 ◽  
Author(s):  
Akuro Big-Alabo

This paper presents approximate periodic solutions to the anharmonic (i.e. non-sinusoidal) response of a simple pendulum undergoing moderate- to large-amplitude oscillations. The approximate solutions were derived by using a modified continuous piecewise linearization method that enabled very accurate solutions to the pendulum oscillations for the entire range of possible amplitudes, i.e. [Formula: see text]. The present solution method is very simple and can be used to obtain amplitude-frequency solutions as well as the displacement and velocity histories of the simple pendulum without the need for a complementary method. The purpose of this paper is to present simple and accurate approximate analytical solutions to the large-amplitude oscillations of the simple pendulum that can be applied by undergraduates.
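The exact large-amplitude period, against which such approximate solutions are typically benchmarked, is T = (4/ω₀)·K(k) with k = sin(θ₀/2), where K is the complete elliptic integral of the first kind. A short numerical sketch (not the paper's piecewise linearization method) computes K via the arithmetic-geometric mean:

```python
import math

def pendulum_period(theta0, omega0=1.0):
    """Exact simple-pendulum period T = (4/omega0) * K(k), k = sin(theta0/2),
    with K evaluated by the arithmetic-geometric mean; theta0 in radians."""
    k = math.sin(theta0 / 2.0)
    a, b = 1.0, math.sqrt(1.0 - k * k)
    while abs(a - b) > 1e-12:
        a, b = (a + b) / 2.0, math.sqrt(a * b)  # AGM iteration
    K = math.pi / (2.0 * a)
    return 4.0 * K / omega0

small = pendulum_period(0.01)                 # recovers the small-angle limit 2*pi/omega0
large = pendulum_period(math.radians(150.0))  # noticeably longer than the linear prediction
```

At 150° amplitude the exact period is roughly 76% longer than the small-angle value, which is why a sinusoidal (harmonic) approximation fails for large swings.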


2008 ◽  
Vol 32 ◽  
pp. 169-202 ◽  
Author(s):  
C. V. Goldman ◽  
S. Zilberstein

Multi-agent planning in stochastic environments can be framed formally as a decentralized Markov decision problem. Many real-life distributed problems that arise in manufacturing, multi-robot coordination and information gathering scenarios can be formalized using this framework. However, finding the optimal solution in the general case is hard, limiting the applicability of recently developed algorithms. This paper provides a practical approach for solving decentralized control problems when communication among the decision makers is possible, but costly. We develop the notion of communication-based mechanism that allows us to decompose a decentralized MDP into multiple single-agent problems. In this framework, referred to as decentralized semi-Markov decision process with direct communication (Dec-SMDP-Com), agents operate separately between communications. We show that finding an optimal mechanism is equivalent to solving optimally a Dec-SMDP-Com. We also provide a heuristic search algorithm that converges on the optimal decomposition. Restricting the decomposition to some specific types of local behaviors reduces significantly the complexity of planning. In particular, we present a polynomial-time algorithm for the case in which individual agents perform goal-oriented behaviors between communications. The paper concludes with an additional tractable algorithm that enables the introduction of human knowledge, thereby reducing the overall problem to finding the best time to communicate. Empirical results show that these approaches provide good approximate solutions.
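The final reduction above, choosing the best time to communicate, can be illustrated with a toy trade-off model (purely illustrative; this is not the paper's Dec-SMDP-Com formulation). Suppose each message costs a fixed amount and expected miscoordination cost grows linearly between synchronizations; the best period balances the two:

```python
def best_sync_period(comm_cost, drift_cost, max_period=50):
    """Toy model: amortized per-step cost of communicating every k steps is
    comm_cost / k (message cost) plus drift_cost * (k - 1) / 2 (average
    miscoordination accumulated since the last sync)."""
    costs = {k: comm_cost / k + drift_cost * (k - 1) / 2.0
             for k in range(1, max_period + 1)}
    return min(costs, key=costs.get)

# Hypothetical costs: a message costs 8 units; drift costs 1 unit per elapsed step.
k_star = best_sync_period(comm_cost=8.0, drift_cost=1.0)
```

Communicating every step wastes message cost, while never communicating lets miscoordination grow unboundedly; the minimum lies in between, here at a period of 4 steps.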


2021 ◽  
Vol 24 (2) ◽  
pp. 1814-1820
Author(s):  
Brenda Ng ◽  
Carol Meyers ◽  
Kofi Boakye ◽  
John Nitao

We examine the suitability of using decision processes to model real-world systems of intelligent adversaries. Decision processes have long been used to study cooperative multiagent interactions, but their practical applicability to adversarial problems has received minimal study. We address the pros and cons of applying sequential decision-making in this area, using the crime of money laundering as a specific example. Motivated by case studies, we abstract out a model of the money laundering process, using the framework of interactive partially observable Markov decision processes (I-POMDPs). We address why this framework is well suited for modeling adversarial interactions. Particle filtering and value iteration are used to solve the model, with the application of different pruning and look-ahead strategies to assess the tradeoffs between solution quality and algorithmic run time. Our results show that there is a large gap in the level of realism that can currently be achieved by such decision models, largely due to computational demands that limit the size of problems that can be solved. While these results represent solutions to a simplified model of money laundering, they illustrate nonetheless the kinds of agent interactions that cannot be captured by standard approaches such as anomaly detection. This implies that I-POMDP methods may be valuable in the future, when algorithmic capabilities have further evolved.

