Learning to Plan from Raw Data in Grid-based Games

10.29007/s8jk ◽  
2018 ◽  
Author(s):  
Andrea Dittadi ◽  
Thomas Bolander ◽  
Ole Winther

An agent that autonomously learns to act in its environment must acquire a model of the domain dynamics. This can be a challenging task, especially in real-world domains, where observations are high-dimensional and noisy. Although in automated planning the dynamics are typically given, there are action schema learning approaches that learn symbolic rules (e.g. STRIPS or PDDL) to be used by traditional planners. However, these algorithms rely on logical descriptions of environment observations. In contrast, recent methods in deep reinforcement learning for games learn from pixel observations. However, they typically do not acquire an environment model, but a policy for one-step action selection. Even when a model is learned, it cannot generalize to unseen instances of the training domain. Here we propose a neural network-based method that learns from visual observations an approximate, compact, implicit representation of the domain dynamics, which can be used for planning with standard search algorithms, and generalizes to novel domain instances. The learned model is composed of submodules, each implicitly representing an action schema in the traditional sense. We evaluate our approach on visual versions of the standard domain Sokoban, and show that, by training on one single instance, it learns a transition model that can be successfully used to solve new levels of the game.
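To make the idea of "planning with standard search algorithms" over a transition model concrete, here is a minimal sketch. It is not the authors' method: the `transition` function below is a hand-written, hypothetical stand-in for the learned neural model, the 3×3 grid with one wall is invented for illustration, and breadth-first search is just one example of a standard search algorithm.

```python
from collections import deque

# Hypothetical toy domain (an assumption, not taken from the paper):
# a 3x3 grid with a single wall cell and four movement actions.
WALLS = {(1, 1)}
WIDTH, HEIGHT = 3, 3
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def transition(state, action):
    """Stand-in for a learned transition model.

    Returns the successor state, or the unchanged state if the move
    would hit a wall or leave the grid.
    """
    x, y = state
    dx, dy = ACTIONS[action]
    nxt = (x + dx, y + dy)
    in_bounds = 0 <= nxt[0] < WIDTH and 0 <= nxt[1] < HEIGHT
    if nxt in WALLS or not in_bounds:
        return state
    return nxt


def plan(start, goal, model):
    """Breadth-first search over the states that `model` predicts.

    Returns a shortest list of actions from start to goal, or None
    if the goal cannot be reached.
    """
    frontier = deque([start])
    parent = {start: None}  # state -> (previous state, action taken)
    while frontier:
        state = frontier.popleft()
        if state == goal:
            actions = []
            while parent[state] is not None:
                state, action = parent[state]
                actions.append(action)
            return actions[::-1]
        for action in ACTIONS:
            nxt = model(state, action)
            if nxt not in parent:
                parent[nxt] = (state, action)
                frontier.append(nxt)
    return None
```

The planner only calls `model(state, action)`, so any function with that shape, including a learned one, could be substituted without changing the search code.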

2016 ◽  
Author(s):  
Georg Tanzmeister

This dissertation is focused on the environment model for automated vehicles. A reliable model of the local environment available in real-time is a prerequisite to enable almost any useful activity performed by a robot, such as planning motions to fulfill tasks. It is particularly important in safety-critical applications, such as for autonomous vehicles in regular traffic. In this thesis, novel concepts for local mapping, tracking, the detection of principal moving directions, cost evaluations in motion planning, and road course estimation have been developed. An object- and sensor-independent grid representation forms the basis of all presented methods enabling a generic and robust estimation of the environment. All approaches have been evaluated with sensor data from real road scenarios, and their performance has been experimentally demonstrated with a test vehicle. ...


2020 ◽  
Vol 34 (04) ◽  
pp. 3801-3808
Author(s):  
Pierluca D'Oro ◽  
Alberto Maria Metelli ◽  
Andrea Tirinzoni ◽  
Matteo Papini ◽  
Marcello Restelli

Traditional model-based reinforcement learning approaches learn a model of the environment dynamics without explicitly considering how it will be used by the agent. In the presence of misspecified model classes, this can lead to poor estimates, as some relevant available information is ignored. In this paper, we introduce a novel model-based policy search approach that exploits the knowledge of the current agent policy to learn an approximate transition model, focusing on the portions of the environment that are most relevant for policy improvement. We leverage a weighting scheme, derived from the minimization of the error on the model-based policy gradient estimator, in order to define a suitable objective function that is optimized for learning the approximate transition model. Then, we integrate this procedure into a batch policy improvement algorithm, named Gradient-Aware Model-based Policy Search (GAMPS), which iteratively learns a transition model and uses it, together with the collected trajectories, to compute the new policy parameters. Finally, we empirically validate GAMPS on benchmark domains analyzing and discussing its properties.


2018 ◽  
Vol 11 (1) ◽  
pp. 1 ◽  
Author(s):  
Florian Rançon ◽  
Lionel Bombrun ◽  
Barna Keresztes ◽  
Christian Germain

Grapevine wood fungal diseases such as esca are among the biggest threats in vineyards nowadays. The lack of very efficient preventive (best results using commercial products report 20% efficiency) and curative means induces huge economic losses. The study presented in this paper is centered around the in-field detection of foliar esca symptoms during summer, exhibiting a typical “striped” pattern. Indeed, in-field disease detection has shown great potential for commercial applications and has been successfully used for other agricultural needs such as yield estimation. Differentiation with foliar symptoms caused by other diseases or abiotic stresses was also considered. Two vineyards from the Bordeaux region (France, Aquitaine) were chosen as the basis for the experiment. Pictures of diseased and healthy vine plants were acquired during summer 2017 and labeled at the leaf scale, resulting in a patch database of around 6000 images (224 × 224 pixels) divided into red cultivar and white cultivar samples. Then, we tackled the classification part of the problem, comparing state-of-the-art SIFT encoding and pre-trained deep learning feature extractors for the classification of database patches. In the best case, 91% overall accuracy was obtained using deep features extracted from a MobileNet network trained on the ImageNet database, demonstrating the efficiency of simple transfer learning approaches without the need to design an ad hoc feature extractor. The third part aimed at disease detection (using bounding boxes) within full plant images. For this purpose, we integrated the deep learning base network within a “one-step” detection network (RetinaNet), allowing us to perform detection queries in real time (approximately six frames per second on GPU). Recall/Precision (RP) and Average Precision (AP) metrics then allowed us to evaluate the performance of the network on a 91-image (plants) validation database. Overall, 90% precision for a 40% recall was obtained, while the best esca AP was about 70%. Good correlation between annotated and detected symptomatic surface per plant was also obtained, meaning slightly symptomatic plants can be efficiently separated from severely attacked plants.
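As a reminder of how a precision/recall operating point such as the one quoted above is computed, the following sketch applies the standard definitions to detection counts. The function name and the counts in the usage note are hypothetical and chosen only for illustration; they are not taken from the paper's validation set.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Standard detection metrics from raw counts.

    precision = TP / (TP + FP): fraction of detections that are correct.
    recall    = TP / (TP + FN): fraction of true objects that were found.
    Returns 0.0 for a metric whose denominator is zero.
    """
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall
```

For example, with made-up counts of 36 correct detections, 4 false alarms, and 54 missed symptoms, `precision_recall(36, 4, 54)` gives a precision of 0.9 and a recall of 0.4: a detector that is rarely wrong when it fires but misses more than half of the symptoms.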


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 744
Author(s):  
Jorge Godoy ◽  
Víctor Jiménez ◽  
Antonio Artuñedo ◽  
Jorge Villagra

Today, perception solutions for Automated Vehicles rely on sensors on board the vehicle, which are limited by the line of sight and occlusions caused by other elements on the road. As an alternative, Vehicle-to-Everything (V2X) communications allow vehicles to cooperate and enhance their perception capabilities. Besides announcing a vehicle's own presence and intentions, services such as Collective Perception (CPS) aim to share information about perceived objects as a high-level description. This work proposes a perception framework for fusing information from on-board sensors and data received via CPS messages (CPM). To that end, the environment is modeled using an occupancy grid in which occupied, free, and uncertain space is considered. For each sensor, including V2X, independent grids are calculated from sensor measurements and uncertainties and then fused in terms of both occupancy and confidence. Moreover, a Particle Filter propagates cell occupancy from one step to the next, which enables object tracking. The proposed framework was validated in a set of experiments using real vehicles and infrastructure sensors for sensing static and dynamic objects. Results showed good performance even under significant uncertainties and delays, hence validating the viability of the proposed framework for Collective Perception.
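The abstract does not state the fusion rule, so the following is only a sketch of one plausible way to fuse per-sensor grids "in terms of both occupancy and confidence". It assumes each cell holds an (occupancy probability, confidence) pair, fuses occupancy as a confidence-weighted average, and keeps the highest confidence seen. All names and the rule itself are assumptions, not the paper's implementation.

```python
def fuse_cell(estimates):
    """Fuse one cell's estimates from several sensors.

    estimates: list of (occupancy_probability, confidence) pairs,
    both in [0, 1]. Assumed rule (not from the paper): occupancy is
    the confidence-weighted mean, confidence is the maximum.
    """
    total_conf = sum(conf for _, conf in estimates)
    if total_conf == 0:
        # No sensor has any information: report the cell as uncertain.
        return 0.5, 0.0
    occupancy = sum(p * conf for p, conf in estimates) / total_conf
    confidence = max(conf for _, conf in estimates)
    return occupancy, confidence


def fuse_grids(grids):
    """Fuse equally sized 2D grids of (occupancy, confidence) cells."""
    rows, cols = len(grids[0]), len(grids[0][0])
    return [
        [fuse_cell([grid[r][c] for grid in grids]) for c in range(cols)]
        for r in range(rows)
    ]
```

Under this rule, a confident sensor dominates a cell that another sensor barely observes, and cells that no sensor covers stay at 0.5, which matches the occupied/free/uncertain distinction described above.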


2021 ◽  
Vol 16 (1) ◽  
pp. 1-20
Author(s):  
Lichen Wang ◽  
Zhengming Ding ◽  
Yun Fu

Multi-label learning recovers multiple labels from a single instance. It is a more challenging task than single-label learning. Most multi-label learning approaches need large-scale, well-labeled samples to achieve highly accurate performance. However, it is expensive to build such a dataset. In this work, we propose a generic multi-label learning framework based on Adaptive Graph and Marginalized Augmentation (AGMA) in a semi-supervised scenario. Generally speaking, AGMA makes use of a small amount of labeled data together with a large amount of unlabeled data to boost the learning performance. First, an adaptive similarity graph is learned to effectively capture the intrinsic structure within the data. Second, a marginalized augmentation strategy is explored to enhance the model's generalization and robustness. Third, a feature-label autoencoder is further deployed to improve inference efficiency. All the modules are jointly trained to benefit each other. State-of-the-art benchmarks in both traditional and zero-shot multi-label learning scenarios are evaluated. Experiments and ablation studies illustrate the accuracy and efficiency of our AGMA method.


Author(s):  
R.P. Goehner ◽  
W.T. Hatfield ◽  
Prakash Rao

Computer programs are now available in various laboratories for the indexing and simulation of transmission electron diffraction patterns. Although these programs address various aspects of the indexing and simulation process, the ultimate goal is to perform real-time diffraction pattern analysis directly from the imaging screen of the transmission electron microscope. The program to be described in this paper represents one step prior to real-time analysis. It involves the combination of two programs, described in an earlier paper (1), into a single program for use on an interactive basis with a minicomputer. In our case, the minicomputer is an INTERDATA 70 equipped with a Tektronix 4010-1 graphical display terminal and hard copy unit. A simplified flow diagram of the combined program, written in Fortran IV, is shown in Figure 1. It consists of two programs, INDEX and TEDP, which index and simulate electron diffraction patterns, respectively. The user has the option of choosing either the indexing or simulating aspects of the combined program.


2006 ◽  
Vol 73 ◽  
pp. 85-96 ◽  
Author(s):  
Richard J. Reece ◽  
Laila Beynon ◽  
Stacey Holden ◽  
Amanda D. Hughes ◽  
Karine Rébora ◽  
...  

The recognition of changes in environmental conditions, and the ability to adapt to these changes, is essential for the viability of cells. There are numerous well characterized systems by which the presence or absence of an individual metabolite may be recognized by a cell. However, the recognition of a metabolite is just one step in a process that often results in changes in the expression of whole sets of genes required to respond to that metabolite. In higher eukaryotes, the signalling pathway between metabolite recognition and transcriptional control can be complex. Recent evidence from the relatively simple eukaryote yeast suggests that complex signalling pathways may be circumvented through the direct interaction between individual metabolites and regulators of RNA polymerase II-mediated transcription. Biochemical and structural analyses are beginning to unravel these elegant genetic control elements.


2010 ◽  
Vol 43 (18) ◽  
pp. 16
Author(s):  
MATTHEW R.G. TAYLOR

2007 ◽  
Vol 0 (0) ◽  
pp. 0-0
Author(s):  
C.W. Kim ◽  
Y.H. Kim ◽  
H.G. Cha ◽  
D.K. Lee ◽  
Y.S. Kang
