Unmanned Aerial Vehicle Pitch Control under Delay Using Deep Reinforcement Learning with Continuous Action in Wind Tunnel Test

Aerospace ◽  
2021 ◽  
Vol 8 (9) ◽  
pp. 258
Author(s):  
Daichi Wada ◽  
Sergio A. Araujo-Estrada ◽  
Shane Windsor

Nonlinear flight controllers for fixed-wing unmanned aerial vehicles (UAVs) can potentially be developed using deep reinforcement learning. However, there is often a reality gap between the simulation models used to train these controllers and the real world. This study experimentally investigated the application of deep reinforcement learning to the pitch control of a UAV in wind tunnel tests, with a particular focus on the effect of time delays on flight controller performance. Multiple neural networks were trained in simulation with different assumed time delays and then wind tunnel tested. The neural networks trained with shorter delays tended to be susceptible to delay in the real tests and produced fluctuating behaviour. The neural networks trained with longer delays behaved more conservatively and did not produce oscillations, but suffered steady-state errors under some conditions due to unmodelled frictional effects. These results highlight the importance of performing physical experiments to validate controller performance and show that the training approach used with reinforcement learning needs to be robust to reality gaps between simulation and the real world.
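Training with an assumed time delay, as described above, can be approximated in a gym-style loop by buffering observations for a fixed number of control steps. The wrapper below is an illustrative sketch, not the authors' implementation; the environment interface and all names are assumptions.

```python
from collections import deque

class DelayedObservationWrapper:
    """Delays the observations an RL agent receives by a fixed number of
    control steps, so the policy is trained against the same latency it
    will encounter on the real plant."""

    def __init__(self, env, delay_steps):
        self.env = env
        self.delay_steps = delay_steps
        self._buffer = deque()

    def reset(self):
        obs = self.env.reset()
        # Pre-fill so the first few observations repeat the initial state
        self._buffer = deque([obs] * (self.delay_steps + 1))
        return self._buffer.popleft()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._buffer.append(obs)
        # The agent sees the observation from delay_steps control steps ago
        return self._buffer.popleft(), reward, done, info
```

A policy trained inside this wrapper with a pessimistic (longer) delay would, as the abstract reports, tend towards more conservative behaviour on the real plant.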

Author(s):  
Masumi Ishikawa

Studies on rule extraction using neural networks have exclusively adopted supervised learning, in which correct outputs are always given as training samples. The real world, however, does not always provide correct answers. We advocate learning with an immediate critic, a simple form of reinforcement learning that uses an immediate binary reinforcement signal indicating whether or not an output is correct. This, of course, makes learning more difficult and time-consuming than supervised learning. Learning with an immediate critic alone, however, is not powerful enough for extracting rules from data, because a distributed representation emerges just as in backpropagation learning. We therefore propose combining learning with an immediate critic and structural learning with forgetting (SLF) into structural learning with an immediate critic and forgetting (SLCF). The procedure for rule extraction from data by SLCF is similar to that by SLF. Applications of the proposed method to rule extraction from the lenses dataset demonstrate its effectiveness.
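As a rough single-unit sketch of the idea above, the step below pairs a reward-modulated stochastic logistic unit (the immediate binary critic) with a constant L1 weight decay (the forgetting term that drives unnecessary weights to zero, yielding sparse, rule-like structure). This is a minimal illustration, not the published SLCF algorithm; all names and constants are assumptions.

```python
import numpy as np

def slcf_step(w, x, critic, lr=0.5, forget=0.01, rng=None):
    """One learning step with an immediate binary critic plus forgetting."""
    rng = rng or np.random.default_rng(0)
    p = 1.0 / (1.0 + np.exp(-float(w @ x)))   # probability of emitting output 1
    y = 1 if rng.random() < p else 0          # stochastic binary output
    r = critic(y)                             # immediate critic: +1 correct, -1 wrong
    w = w + lr * r * (y - p) * x              # reward-modulated update
    w = w - forget * np.sign(w)               # structural forgetting: constant L1 decay
    return w, y, r
```

The decay term is what distinguishes this from plain learning with an immediate critic: weights that do not earn reinforcement faster than they decay are pruned to zero.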


Aerospace ◽  
2021 ◽  
Vol 8 (1) ◽  
pp. 18
Author(s):  
Daichi Wada ◽  
Sergio A. Araujo-Estrada ◽  
Shane Windsor

Deep reinforcement learning is a promising method for training a nonlinear attitude controller for fixed-wing unmanned aerial vehicles. Until now, proof-of-concept studies have demonstrated successful attitude control in simulation, but detailed experimental investigations have not yet been conducted. This study applied deep reinforcement learning to one-degree-of-freedom pitch control in wind tunnel tests, with the aim of gaining a practical understanding of attitude control applications. Three controllers with different discrete action choices, that is, elevator angles, were designed. The controllers with larger action rates exhibited better performance in following angle-of-attack commands: the root mean square error for tracking angle-of-attack commands decreased from 3.42° to 1.99° as the maximum action rate increased from 10°/s to 50°/s. The comparison between experimental and simulation results showed that the controller with the smallest action rate was affected by friction, while the controllers with larger action rates exhibited fluctuating elevator maneuvers owing to delay. The investigation of the effects of friction and delay on pitch control highlights the importance of conducting experiments to understand actual control performance, specifically when controllers are trained with a low-fidelity model.
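The tracking metric and rate-limited discrete elevator actions described above can be sketched as follows; the function names, the {-1, 0, +1} action coding, and the control time step are illustrative assumptions, not details from the paper.

```python
import numpy as np

def tracking_rmse(commanded_deg, measured_deg):
    """Root mean square error between commanded and measured angle of attack."""
    c = np.asarray(commanded_deg, dtype=float)
    m = np.asarray(measured_deg, dtype=float)
    return float(np.sqrt(np.mean((c - m) ** 2)))

def next_elevator_angle(current_deg, action, max_rate_deg_s, dt):
    """A discrete action in {-1, 0, +1} slews the elevator at the maximum
    action rate, held for one control step of length dt."""
    return current_deg + action * max_rate_deg_s * dt
```

Under this coding, raising `max_rate_deg_s` from 10°/s to 50°/s lets the elevator traverse its range five times faster per step, which is consistent with the tighter tracking the abstract reports.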


2021 ◽  
pp. 027836492098785
Author(s):  
Julian Ibarz ◽  
Jie Tan ◽  
Chelsea Finn ◽  
Mrinal Kalakrishnan ◽  
Peter Pastor ◽  
...  

Deep reinforcement learning (RL) has emerged as a promising approach for autonomously acquiring complex behaviors from low-level sensor observations. Although a large portion of deep RL research has focused on applications in video games and simulated control, which do not connect with the constraints of learning in real environments, deep RL has also demonstrated promise in enabling physical robots to learn complex skills in the real world. At the same time, real-world robotics provides an appealing domain for evaluating such algorithms, as it connects directly to how humans learn: as an embodied agent in the real world. Learning to perceive and move in the real world presents numerous challenges, some of which are easier to address than others, and some of which are often not considered in RL research that focuses only on simulated domains. In this review article, we present a number of case studies involving robotic deep RL. Building on these case studies, we discuss commonly perceived challenges in deep RL and how they have been addressed in these works. We also provide an overview of other outstanding challenges, many of which are unique to the real-world robotics setting and are not often the focus of mainstream RL research. Our goal is to provide a resource both for roboticists and machine learning researchers who are interested in furthering the progress of deep RL in the real world.


2012 ◽  
Vol 151 ◽  
pp. 498-502
Author(s):  
Jin Xue Zhang ◽  
Hai Zhu Pan

This paper is concerned with Q-learning, a very popular reinforcement learning algorithm, applied to obstacle avoidance through neural networks. The guiding principle is that the focus must always be on ecologically meaningful tasks and behaviours when designing a robot. Many robot systems have used behaviour-based architectures since the 1980s. In this paper, the Khepera robot is trained with the proposed Q-learning algorithm, using neural networks for the task of obstacle avoidance. In experiments with real and simulated robots, the neural network approach makes it possible for Q-learning to handle changes in the environment.
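Whether the value function is stored in a table or approximated by a neural network, the core of Q-learning is the same bootstrapped backup. A minimal tabular sketch (names and hyperparameters are assumptions, not taken from the paper):

```python
def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """One Q-learning backup: move Q(s, a) towards the bootstrapped
    target r + gamma * max_a' Q(s', a')."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
    return q
```

In the neural-network variant used for the Khepera robot, the table lookup `q[s]` is replaced by a network mapping sensor readings to action values, and the same target drives a gradient step instead of a table write.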


Author(s):  
Christoph Jessing ◽  
Daniel Stoll ◽  
Timo Kuthada ◽  
Jochen Wiedemann

Vehicle aerodynamics and wind tunnel technology are progressing towards more realistic simulations of the real-world on-road environment. This paper presents an overview of the new systems implemented during the recent wind tunnel upgrade at the Forschungsinstitut für Kraftfahrwesen und Fahrzeugmotoren Stuttgart, as well as comparable computational fluid dynamics simulations. The road simulation system features a fully interchangeable five-belt system and three-belt system in the same full-scale automotive wind tunnel. This offers the efficiency of a five-belt system combined with the more sophisticated ground simulation technique of a wide-belt system, which is necessary to assess the aerodynamic properties of sports cars and racing cars. In order to simulate on-road wind conditions, a side-wind generator can be installed to generate a turbulent flow field in the wind tunnel test section. It was shown that the commonly determined drag coefficient at 0° yaw angle in the smooth flow environment of today's wind tunnels is not representative of the drag found in real on-road wind conditions. Additionally, the investigations in unsteady side-wind conditions indicate that the commonly used approach to determining the side-wind sensitivity of a vehicle underestimates the forces occurring in turbulent flow conditions. A validated simulation model is presented. The simulation results are in good agreement with the experimental results and can be used as a complementary tool when assessing the unsteady aerodynamic behaviour of a vehicle; this behaviour can be coupled to a vehicle dynamics model for virtual road testing in the Stuttgart full-motion driving simulator. The unsteady-behaviour effects can be evaluated comprehensively, and the results allow a subjective assessment of the unsteady response of the vehicle. Furthermore, the aeroacoustic wind noise in on-road wind conditions is investigated during the development of the vehicle.
The side-wind generator reproduces the natural stochastic cross-wind and allows the effect of these wind conditions to be investigated in the aeroacoustic wind tunnel. The results show similar ratings to those in on-road tests when compared with subjective listening tests. In summary, the techniques introduced open up new horizons in the field of vehicle aerodynamics and aeroacoustics, which are a step closer towards real-world conditions in automotive engineering.
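The drag coefficient discussed above is obtained from a measured drag force by normalising with dynamic pressure and frontal area. A minimal sketch of that standard relation, with illustrative argument names and SI units assumed:

```python
def drag_coefficient(drag_force, air_density, airspeed, frontal_area):
    """c_D = D / (q * A), with dynamic pressure q = 0.5 * rho * v^2.

    drag_force   -- measured drag, N
    air_density  -- rho, kg/m^3
    airspeed     -- tunnel or on-road airspeed, m/s
    frontal_area -- projected frontal area A, m^2
    """
    q = 0.5 * air_density * airspeed ** 2
    return drag_force / (q * frontal_area)
```

The abstract's point is that evaluating this coefficient only at 0° yaw in smooth flow understates the drag a vehicle experiences in the turbulent, yawed wind conditions found on the road.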


Author(s):  
Anibal Pedraza ◽  
Oscar Deniz ◽  
Gloria Bueno

The phenomenon of adversarial examples has become one of the most intriguing topics associated with deep learning. The so-called adversarial attacks have the ability to fool deep neural networks with imperceptible perturbations. While the effect is striking, it has been suggested that such carefully selected injected noise does not necessarily appear in real-world scenarios. In contrast, some authors have found ways to generate adversarial noise in physical scenarios (traffic signs, shirts, etc.), showing that attackers can indeed fool the networks. In this paper we go beyond that and show that adversarial examples also appear in the real world without any attacker or maliciously selected noise involved. We show this using images from tasks related to microscopy as well as general object recognition with the well-known ImageNet dataset. A comparison between these natural and artificially generated adversarial examples is performed using distance metrics and image quality metrics. We also show that the natural adversarial examples are in fact at a greater distance from the originals than artificially generated adversarial examples.
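Distance metrics of the kind mentioned above are commonly the L2 and L-infinity norms of the pixel-wise perturbation between an image and its adversarial counterpart. The function below is a generic sketch of such a comparison; the name and return convention are assumptions, not the paper's actual metric suite.

```python
import numpy as np

def perturbation_distances(original, adversarial):
    """Return (L2, L-infinity) distance between an image and its
    adversarial version, computed over the raw pixel difference."""
    d = np.asarray(adversarial, dtype=float) - np.asarray(original, dtype=float)
    return float(np.linalg.norm(d.ravel())), float(np.abs(d).max())
```

Under such metrics, the paper's finding is that naturally occurring adversarial examples lie farther from the original images than the small-norm perturbations produced by gradient-based attacks.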


Author(s):  
Jiakai Wang

Although deep neural networks (DNNs) have achieved impressive results with a very wide range of impact, their vulnerability has attracted considerable research interest in artificial intelligence (AI) safety and robustness in recent years. A series of works reveals that current DNNs can be misled by elaborately designed adversarial examples. Unfortunately, this peculiarity also affects real-world AI applications and places them at potential risk. We are particularly interested in physical attacks due to their implementability in the real world. The study of physical attacks can effectively promote the secure application of AI techniques, which is of great significance to the safe development of AI.


2021 ◽  
pp. 1-16
Author(s):  
Bing Yu ◽  
Hua Qi ◽  
Guo Qing ◽  
Felix Juefei-Xu ◽  
Xiaofei Xie ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
M. Funk Drechsler ◽  
T. A. Fiorentin ◽  
H. Göllinger

The use of actor-critic algorithms can improve the controllers currently implemented in automotive applications. This method combines reinforcement learning (RL) and neural networks to make it possible to control nonlinear systems with real-time capability. Actor-critic algorithms have already been applied successfully in different controllers, including autonomous driving, antilock braking systems (ABS), and electronic stability control (ESC). However, current research implements virtual environments for the training process instead of using real plants to obtain the datasets. This limitation stems from the trial-and-error methods used during training, which pose considerable risk if the controller acts directly on the real plant. The present research therefore proposes and evaluates an open-loop training process, which permits data acquisition without controller interaction and an open-loop training of the neural networks. The performance of the trained controllers is evaluated by a design of experiments (DOE) to understand how it is affected by the generated dataset. The results show a successful application of the open-loop training architecture. The controller can maintain the slip ratio at adequate levels during maneuvers on different floors, including grounds that were not used during the training process. The actor neural network is also able to identify the different floors and change the acceleration profile according to the characteristics of each ground.
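The slip ratio the controller regulates is conventionally defined as the normalised difference between vehicle speed and wheel circumferential speed. A minimal sketch of that definition (argument names and the braking-slip sign convention are assumptions):

```python
def slip_ratio(vehicle_speed, wheel_omega, wheel_radius, eps=1e-6):
    """Longitudinal slip ratio during braking:
    lambda = (v - omega * r) / v, guarded against division by zero.

    vehicle_speed -- v, m/s
    wheel_omega   -- wheel angular speed, rad/s
    wheel_radius  -- effective rolling radius r, m
    """
    wheel_speed = wheel_omega * wheel_radius
    return (vehicle_speed - wheel_speed) / max(vehicle_speed, eps)
```

A slip ratio near zero means free rolling and a ratio of one means a locked wheel; an ABS controller of the kind described aims to hold this quantity in the intermediate band where braking friction peaks.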

