Quadrupedal robots trot into the wild

2020 ◽  
Vol 5 (47) ◽  
pp. eabe5218
Author(s):  
Sehoon Ha

Deep reinforcement learning enables quadruped robots to traverse challenging natural environments using only proprioception.

2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Alexandre Y. Dombrovski ◽  
Beatriz Luna ◽  
Michael N. Hallquist

When making decisions, should one exploit known good options or explore potentially better alternatives? Exploration of spatially unstructured options depends on the neocortex, striatum, and amygdala. In natural environments, however, better options often cluster together, forming structured value distributions. The hippocampus binds reward information into allocentric cognitive maps to support navigation and foraging in such spaces. Here we report that human posterior hippocampus (PH) invigorates exploration while anterior hippocampus (AH) supports the transition to exploitation on a reinforcement learning task with a spatially structured reward function. These dynamics depend on differential reinforcement representations in the PH and AH. Whereas local reward prediction error signals are early and phasic in the PH tail, global value maximum signals are delayed and sustained in the AH body. AH compresses reinforcement information across episodes, updating the location and prominence of the value maximum and displaying goal cell-like ramping activity when navigating toward it.


2015 ◽  
Vol 7 (3) ◽  
Author(s):  
Elena Garcia ◽  
Juan C. Arevalo ◽  
Manuel Cestari ◽  
Daniel Sanz-Merodio

The legged locomotion system of biological quadrupeds has proven to be the most efficient in natural, complex terrain. In particular, horses' legs have evolved to provide speed, endurance, and strength superior to any other animal of equal size. Quadruped robots, emulating their biological counterparts, could become the best choice for field missions in complex or natural environments; however, they should be provided with optimum performance in mobility, payload, and endurance. The design of the leg mechanism is of paramount importance to achieve the targeted performance, and nature is the best source of inspiration for designing a leg mechanism able to provide the robot with such agile capabilities. In this work, key principles underlying horse legs' power capabilities have been extracted and translated to a biomimetic leg concept. Afterwards, a real prototype has been designed following the proposed biomimetic concept. A key element in the biomimetic concept is the multifunctionality of the natural musculotendinous system, which has been mimicked by combining series elastic actuation and passive elements. This work provides an assessment of the benefits that bio-inspired solutions can provide versus purely engineering approaches. The experimental evaluation of the bio-inspired prototype shows an improvement in performance compared to a leg design based on purely engineering principles.


2020 ◽  
Vol 5 (47) ◽  
pp. eabc5986 ◽  
Author(s):  
Joonho Lee ◽  
Jemin Hwangbo ◽  
Lorenz Wellhausen ◽  
Vladlen Koltun ◽  
Marco Hutter

Legged locomotion can extend the operational domain of robots to some of the most challenging environments on Earth. However, conventional controllers for legged locomotion are based on elaborate state machines that explicitly trigger the execution of motion primitives and reflexes. These designs have increased in complexity but fallen short of the generality and robustness of animal locomotion. Here, we present a robust controller for blind quadrupedal locomotion in challenging natural environments. Our approach incorporates proprioceptive feedback in locomotion control and demonstrates zero-shot generalization from simulation to natural environments. The controller is trained by reinforcement learning in simulation. The controller is driven by a neural network policy that acts on a stream of proprioceptive signals. The controller retains its robustness under conditions that were never encountered during training: deformable terrains such as mud and snow, dynamic footholds such as rubble, and overground impediments such as thick vegetation and gushing water. The presented work indicates that robust locomotion in natural environments can be achieved by training in simple domains.


2007 ◽  
Vol 19 (11) ◽  
pp. 3108-3131 ◽  
Author(s):  
André Grüning

Simple recurrent networks (SRNs) in symbolic time-series prediction (e.g., language processing models) are frequently trained with gradient descent-based learning algorithms, notably with variants of backpropagation (BP). A major drawback for the cognitive plausibility of BP is that it is a supervised scheme in which a teacher has to provide a fully specified target answer. Yet agents in natural environments often receive summary feedback about the degree of success or failure only, a view adopted in reinforcement learning schemes. In this work, we show that for SRNs in prediction tasks for which there is a probability interpretation of the network's output vector, Elman BP can be reimplemented as a reinforcement learning scheme for which the expected weight updates agree with the ones from traditional Elman BP. Network simulations on formal languages corroborate this result and show that the learning behaviors of Elman backpropagation and its reinforcement variant are very similar also in online learning tasks.


2020 ◽  
Author(s):  
Alexandre Y. Dombrovski ◽  
Beatriz Luna ◽  
Michael N. Hallquist

When making decisions, should one exploit known good options or explore potentially better alternatives? Exploration of spatially unstructured options depends on the neocortex, striatum, and amygdala. In natural environments, however, better options often cluster together, forming structured value distributions. The hippocampus binds reward information into allocentric cognitive maps to support navigation and foraging in such spaces. Using a reinforcement learning task with a spatially structured reward function, we show that human posterior hippocampus (PH) invigorates exploration while anterior hippocampus (AH) supports the transition to exploitation. These dynamics depend on differential reinforcement representations in the PH and AH. Whereas local reward prediction error signals are early and phasic in the PH tail, global value maximum signals are delayed and sustained in the AH body. AH compresses reinforcement information across episodes, updating the location and prominence of the value maximum and displaying goal cell-like ramping activity when navigating toward it.


Author(s):  
L. P. Hardie ◽  
D. L. Balkwill ◽  
S. E. Stevens

Agmenellum quadruplicatum is a unicellular, non-nitrogen-fixing, marine cyanobacterium (blue-green alga). The ultrastructure of this organism, when grown in the laboratory with all necessary nutrients, has been characterized thoroughly. In contrast, little is known of its ultrastructure in the specific nutrient-limiting conditions typical of its natural habitat. Iron is one of the nutrients likely to limit this organism in such natural environments. It is also of great importance metabolically, being required for both photosynthesis and assimilation of nitrate. The purpose of this study was to assess the effects (if any) of iron limitation on the ultrastructure of A. quadruplicatum. It was part of a broader endeavor to elucidate the ultrastructure of cyanobacteria in natural systems. Actively growing cells were placed in a growth medium containing 1% of its usual iron. The cultures were then sampled periodically for 10 days and prepared for thin sectioning TEM to assess the effects of iron limitation.


2013 ◽  
Vol 18 (1) ◽  
pp. 3-11 ◽  
Author(s):  
Emmanuel Kuntsche ◽  
Florian Labhart

Ecological Momentary Assessment (EMA) is a way of collecting data in people’s natural environments in real time and has become very popular in social and health sciences. The emergence of personal digital assistants has led to more complex and sophisticated EMA protocols but has also highlighted some important drawbacks. Modern cell phones combine the functionalities of advanced communication systems with those of a handheld computer and offer various additional features to capture and record sound, pictures, locations, and movements. Moreover, most people own a cell phone, are familiar with the different functions, and always carry it with them. This paper describes ways in which cell phones have been used for data collection purposes in the field of social sciences. This includes automated data capture techniques, for example, geolocation for the study of mobility patterns and the use of external sensors for remote health-monitoring research. The paper also describes cell phones as efficient and user-friendly tools for prompt manual data collection, that is, by asking participants to produce or to provide data. This can either be done by means of dedicated applications or by simply using the web browser. We conclude that cell phones offer a variety of advantages and have a great deal of potential for innovative research designs, suggesting they will be among the standard data collection devices for EMA in the coming years.


Decision ◽  
2016 ◽  
Vol 3 (2) ◽  
pp. 115-131 ◽  
Author(s):  
Helen Steingroever ◽  
Ruud Wetzels ◽  
Eric-Jan Wagenmakers
