HMCTS-OP: Hierarchical MCTS Based Online Planning in the Asymmetric Adversarial Environment

Symmetry ◽  
2020 ◽  
Vol 12 (5) ◽  
pp. 719
Author(s):  
Lina Lu ◽  
Wanpeng Zhang ◽  
Xueqiang Gu ◽  
Xiang Ji ◽  
Jing Chen

Monte Carlo Tree Search (MCTS) has demonstrated excellent performance in solving many planning problems. However, in many practical applications, especially in adversarial environments, the state space and the branching factors are huge and the planning horizon is long. In flat, non-hierarchical MCTS, it is computationally expensive to cover a sufficient number of rewarded states that are far away from the root. Flat MCTS is therefore inefficient for planning problems with a long planning horizon, a huge state space, and large branching factors. In this work, we propose a novel hierarchical MCTS-based online planning method, named HMCTS-OP, to tackle this issue. HMCTS-OP integrates MAXQ-based task hierarchies and hierarchical MCTS algorithms into the online planning framework. Specifically, the MAXQ-based task hierarchies reduce the search space and guide the search process, so the computational complexity is significantly reduced. Moreover, this reduction enables the MCTS to perform a deeper search and find a better action in a limited time. We evaluate the performance of HMCTS-OP in the domain of online planning in the asymmetric adversarial environment. The experimental results show that HMCTS-OP outperforms other online planning methods in this domain.
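For orientation, the selection step of a flat MCTS, the baseline the abstract contrasts against, is commonly implemented with the UCB1 rule. The sketch below is a minimal illustration of that standard rule only; it is not the HMCTS-OP algorithm and contains no MAXQ hierarchy. The names `ucb1_score` and `select_child`, the dictionary layout, and the default exploration constant are assumptions made for this example.

```python
import math


def ucb1_score(total_value, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of one child: mean value plus an exploration bonus.

    Unvisited children score +inf so that each is tried at least once.
    """
    if visits == 0:
        return math.inf
    mean = total_value / visits
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return mean + bonus


def select_child(children, c=math.sqrt(2)):
    """Pick the action to descend into during the selection phase.

    children: dict mapping action -> (total_value, visits); this layout is
    a hypothetical choice for the sketch, not taken from the paper.
    """
    parent_visits = sum(visits for _, visits in children.values())
    return max(
        children,
        key=lambda a: ucb1_score(children[a][0], children[a][1], parent_visits, c),
    )
```

With a large branching factor, every child must be visited before the bonus term can start discriminating between them, which is one way to see why a flat search struggles to reach rewarded states far from the root.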

Author(s):  
Xiaomin Li ◽  
Subbarao Kambhampati ◽  
Jami Shah

The limited success and acceptance of automated process planning methods in industry can be traced to the fact that most existing approaches aim at complete automation. We believe that the quest for complete automation is flawed, both because in practice optimality metrics for process plans are context-sensitive, and because there is significant organizational resistance to approaches that completely eliminate humans from the process planning framework. In this paper, we present an interactive and iterative planning framework, called ASUPPA, which focuses instead on providing intelligent assistance to a human process planner. After generating a “good” default process plan, ASUPPA engages in a “present – elicit criticism – revise” loop with an expert process planner. To operate successfully, ASUPPA needs access to the full search space of process plans and the ability to incrementally modify plans in response to expert criticism. The former is provided by basing ASUPPA on the ASU Features Testbed, a comprehensive and systematic framework for recognizing and reasoning with features in machinable parts. To support the latter, the system is equipped with an iterative and interactive search mechanism. We discuss the operational details of the resulting system.


Author(s):  
Run-de Zhang ◽  
Wei-wei Cai ◽  
Le-ping Yang ◽  
Cheng Si

Spacecraft relative motion trajectory planning is one of the enabling techniques for autonomous proximity operations, especially in increasingly complicated mission environments. Most traditional trajectory planning methods focus on improving performance criteria under deterministic conditions, whereas in practice various uncertain elements can significantly degrade trajectory performance. Considering the uncertainties underlying the collision avoidance constraints, this paper proposes a model predictive control based online trajectory planning framework in which higher-precision obstacle information is continually updated by the onboard sensor. To improve the computational efficiency of the online planning framework, the rotating hyperplane (RH) technique is used to transform the nonlinear ellipsoidal keep-out zone constraints into convex formulations. The concept of a rotation window is introduced to eliminate the unexpected mismatch between the spacecraft motion and the hyperplane rotation in the conventional RH method, which in turn improves the RH method’s capability for multiple-obstacle avoidance problems. Moreover, a three-dimensional (3-D) extension strategy is proposed to simplify the computation procedure when applying the RH method to a 3-D collision avoidance problem. Numerical simulations are carried out to validate the performance of the proposed online trajectory planning framework in addressing the uncertain collision avoidance constraints.


Computation ◽  
2021 ◽  
Vol 9 (2) ◽  
pp. 16
Author(s):  
George Tsakalidis ◽  
Kostas Georgoulakos ◽  
Dimitris Paganias ◽  
Kostas Vergidis

Business process optimization (BPO) has become an increasingly attractive subject in the wider area of business process intelligence and is considered as the problem of composing feasible business process designs with optimal attribute values, such as execution time and cost. Despite the fact that many approaches have produced promising results regarding the enhancement of attribute performance, little has been done to reduce the computational complexity due to the size of the problem. The proposed approach introduces an elaborate preprocessing phase as a component to an established optimization framework (bpoF) that applies evolutionary multi-objective optimization algorithms (EMOAs) to generate a series of diverse optimized business process designs based on specific process requirements. The preprocessing phase follows a systematic rule-based algorithmic procedure for reducing the library size of candidate tasks. The experimental results on synthetic data demonstrate a considerable reduction of the library size and a positive influence on the performance of EMOAs, which is expressed with the generation of an increasing number of nondominated solutions. An important feature of the proposed phase is that the preprocessing effects are explicitly measured before the EMOAs application; thus, the effects on the library reduction size are directly correlated with the improved performance of the EMOAs in terms of average time of execution and nondominated solution generation. The work presented in this paper intends to pave the way for addressing the abiding optimization challenges related to the computational complexity of the search space of the optimization problem by working on the problem specification at an earlier stage.


2014 ◽  
Vol 513-517 ◽  
pp. 1092-1095
Author(s):  
Bo Wu ◽  
Yan Peng Feng ◽  
Hong Yan Zheng

Bayesian reinforcement learning has turned out to be an effective solution to the optimal tradeoff between exploration and exploitation. However, in practical applications, the exponential growth in the number of learning parameters is the main impediment to online planning and learning. To overcome this problem, we bring factored representations, model-based learning, and Bayesian reinforcement learning together in a new approach. First, we exploit a factored representation of the states to reduce the number of learning parameters, and adopt a Bayesian inference method to learn the unknown structure and parameters simultaneously. Then, we use an online point-based value iteration algorithm to plan and learn. The experimental results show that the proposed approach is an effective way of improving learning efficiency in large-scale state spaces.


2021 ◽  
Vol 11 (3) ◽  
pp. 1013
Author(s):  
Zvezdan Lončarević ◽  
Rok Pahič ◽  
Aleš Ude ◽  
Andrej Gams

Autonomous robot learning in unstructured environments often faces the problem that the dimensionality of the search space is too large for practical applications. Dimensionality reduction techniques have been developed to address this problem and describe motor skills in low-dimensional latent spaces. Most of these techniques require the availability of a sufficiently large database of example task executions to compute the latent space. However, the generation of many example task executions on a real robot is tedious, and prone to errors and equipment failures. The main result of this paper is a new approach for efficient database gathering by performing a small number of task executions with a real robot and applying statistical generalization, e.g., Gaussian process regression, to generate more data. We have shown in our experiments that the data generated this way can be used for dimensionality reduction with autoencoder neural networks. The resulting latent spaces can be exploited to implement robot learning more efficiently. The proposed approach has been evaluated on the problem of robotic throwing at a target. Simulation and real-world results with a humanoid robot TALOS are provided. They confirm the effectiveness of generalization-based database acquisition and the efficiency of learning in a low-dimensional latent space.


Author(s):  
Nancy Fulda ◽  
Daniel Ricks ◽  
Ben Murdoch ◽  
David Wingate

Autonomous agents must often detect affordances: the set of behaviors enabled by a situation. Affordance extraction is particularly helpful in domains with large action spaces, allowing the agent to prune its search space by avoiding futile behaviors. This paper presents a method for affordance extraction via word embeddings trained on a tagged Wikipedia corpus. The resulting word vectors are treated as a common knowledge database which can be queried using linear algebra. We apply this method to a reinforcement learning agent in a text-only environment and show that affordance-based action selection improves performance in most cases. Our method increases the computational complexity of each learning step but significantly reduces the total number of steps needed. In addition, the agent's action selections begin to resemble those a human would choose.


1998 ◽  
Vol 9 ◽  
pp. 1-36
Author(s):  
M. L. Littman ◽  
J. Goldsmith ◽  
M. Mundhenk

We examine the computational complexity of testing and finding small plans in probabilistic planning domains with both flat and propositional representations. The complexity of plan evaluation and existence varies with the plan type sought; we examine totally ordered plans, acyclic plans, and looping plans, and partially ordered plans under three natural definitions of plan value. We show that problems of interest are complete for a variety of complexity classes: PL, P, NP, co-NP, PP, NP^PP, co-NP^PP, and PSPACE. In the process of proving that certain planning problems are complete for NP^PP, we introduce a new basic NP^PP-complete problem, E-MAJSAT, which generalizes the standard Boolean satisfiability problem to computations involving probabilistic quantities; our results suggest that the development of good heuristics for E-MAJSAT could be important for the creation of efficient algorithms for a wide variety of problems.
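To make the E-MAJSAT problem mentioned above concrete, the following is a minimal brute-force sketch, assuming a CNF formula in DIMACS-style integer literals, chance variables that are independently true with probability 1/2, and an "at least `threshold`" acceptance test. The function names, the input encoding, and the exact threshold convention are assumptions of this example, not taken from the paper; the enumeration is exponential and serves only to illustrate the problem statement.

```python
from itertools import product


def satisfies(cnf, assignment):
    """True if every clause has at least one literal made true.

    cnf: list of clauses, each a list of nonzero ints (k means variable k,
    -k means its negation). assignment: dict variable -> bool.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def best_choice(cnf, choice_vars, chance_vars):
    """Enumerate choice assignments; return (assignment, satisfied fraction).

    The fraction is the share of chance-variable assignments that satisfy
    the formula, i.e. the satisfaction probability under fair coin flips.
    """
    best_assignment, best_fraction = None, -1.0
    for choice_values in product([False, True], repeat=len(choice_vars)):
        fixed = dict(zip(choice_vars, choice_values))
        hits = 0
        for chance_values in product([False, True], repeat=len(chance_vars)):
            assignment = {**fixed, **dict(zip(chance_vars, chance_values))}
            hits += satisfies(cnf, assignment)
        fraction = hits / 2 ** len(chance_vars)
        if fraction > best_fraction:
            best_assignment, best_fraction = fixed, fraction
    return best_assignment, best_fraction


def e_majsat(cnf, choice_vars, chance_vars, threshold=0.5):
    """Decision version: does some choice assignment reach the threshold?"""
    _, fraction = best_choice(cnf, choice_vars, chance_vars)
    return fraction >= threshold
```

With no chance variables the inner loop runs once and the question reduces to ordinary satisfiability, which is the sense in which E-MAJSAT generalizes SAT.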


2021 ◽  
Vol 27 (11) ◽  
pp. 563-574
Author(s):  
V. V. Kureychik ◽  
S. I. Rodzin ◽  

Computational models of bio heuristics based on physical and cognitive processes are presented. Bio heuristics (including evolutionary and swarm bio heuristics) are compared on characteristics such as the rate of convergence, computational complexity, the required amount of memory, the configuration of the algorithm parameters, and the difficulty of software implementation. The balance between the convergence rate of bio heuristics and the diversification of the search space for solutions to optimization problems is estimated. Experimental results are presented for the problem of placing Peco graphs in a lattice with the minimum total length of the graph edges.


Author(s):  
Stephen M. Majercik

Stochastic satisfiability (SSAT) is an extension of satisfiability (SAT) that merges two important areas of artificial intelligence: logic and probabilistic reasoning. Initially suggested by Papadimitriou, who called it a “game against nature”, SSAT is interesting both from a theoretical perspective–it is complete for PSPACE, an important complexity class–and from a practical perspective–a broad class of probabilistic planning problems can be encoded and solved as SSAT instances. This chapter describes SSAT and its variants, their computational complexity, applications of SSAT, analytical results, algorithms and empirical results, related work, and directions for future work.


1979 ◽  
Vol 10 (4) ◽  
pp. 131-135
Author(s):  
J. T. Meij

In this paper, a linear programming (LP) model for aggregate production planning is given. This is a general model that can be used in various production situations. It optimizes the monthly planning of human resources, production quantities, and inventories over the medium term (e.g., a 12-month planning horizon) for a multi-department, multi-product production facility. A computer programme was developed for the model, making use of a standard LP package. In practical applications, savings of up to 33% of variable cost were obtained.

