Sim-to-Real Quadrotor Landing via Sequential Deep Q-Networks and Domain Randomization

Riccardo Polvara; Massimiliano Patacchiola; Marc Hanheide; Gerhard Neumann

doi:10.3390/robotics9010008

Robotics ◽

10.3390/robotics9010008 ◽

2020 ◽

Vol 9 (1) ◽

pp. 8 ◽

Cited By ~ 2

Author(s):

Riccardo Polvara ◽

Massimiliano Patacchiola ◽

Marc Hanheide ◽

Gerhard Neumann

Keyword(s):

State Of The Art ◽

Control Policy ◽

Divide And Conquer ◽

Level Control ◽

Autonomous Landing ◽

Aerial Vehicle ◽

Noisy Conditions ◽

High Level ◽

Technical Solutions ◽

First Time

The autonomous landing of an Unmanned Aerial Vehicle (UAV) on a marker is one of the most challenging problems in robotics. Many solutions have been proposed, with the best results achieved via customized geometric features and external sensors. This paper discusses for the first time the use of deep reinforcement learning as an end-to-end learning paradigm to find a policy for autonomous UAV landing. Our method is based on a divide-and-conquer paradigm that splits a task into sequential sub-tasks, each one assigned to a Deep Q-Network (DQN), hence the name Sequential Deep Q-Network (SDQN). Each DQN in an SDQN is activated by an internal trigger and represents a component of a high-level control policy that can navigate the UAV towards the marker. Different technical solutions have been implemented, for example the combination of vanilla and double DQNs and the introduction of a partitioned buffer replay to address the problem of sample efficiency. One of the main contributions of this work is to show how an SDQN trained in a simulator via domain randomization can effectively generalize to real-world scenarios of increasing complexity. The performance of SDQNs is comparable with a state-of-the-art algorithm and human pilots, while being quantitatively better in noisy conditions.
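
The abstract mentions combining vanilla and double DQNs without giving details. As a rough illustration of how the two update targets differ, the sketch below uses the standard textbook formulations, not the authors' code. The function names, the discount factor, and the list-of-floats representation of Q-values are all assumptions made for this example.

```python
def vanilla_dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Vanilla DQN target: the target network both selects and evaluates
    the best next action (hypothetical helper, textbook formulation)."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it (hypothetical helper)."""
    if done:
        return reward
    # Index of the action the online network considers best.
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]
```

With a target network that overestimates one action, the vanilla target follows the overestimate, while the double DQN target uses the value of whichever action the online network prefers. This decoupling is the usual motivation for double DQN.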

Autonomous Vehicular Landings on the Deck of an Unmanned Surface Vehicle using Deep Reinforcement Learning

Robotica ◽

10.1017/s0263574719000316 ◽

2019 ◽

Vol 37 (11) ◽

pp. 1867-1882 ◽

Cited By ~ 2

Author(s):

Riccardo Polvara ◽

Sanjay Sharma ◽

Jian Wan ◽

Andrew Manning ◽

Robert Sutton

Keyword(s):

Reinforcement Learning ◽

Template Matching ◽

Control Technique ◽

Autonomous Landing ◽

Unmanned Surface Vehicle ◽

Minimum Requirement ◽

Aerial Vehicle ◽

Two Phases ◽

High Level ◽

Marker Detection

Autonomous landing on the deck of a boat or an unmanned surface vehicle (USV) is the minimum requirement for increasing the autonomy of water monitoring missions. This paper introduces an end-to-end control technique based on deep reinforcement learning for landing an unmanned aerial vehicle on a visual marker located on the deck of a USV. The proposed solution consists of a hierarchy of Deep Q-Networks (DQNs) used as high-level navigation policies that address the two phases of the flight: marker detection and the descending manoeuvre. A few technical improvements have been proposed to stabilize the learning process, such as the combination of vanilla and double DQNs and a partitioned buffer replay. Simulated studies proved the robustness of the proposed algorithm against different perturbations acting on the marine vessel. The performance obtained is comparable with a state-of-the-art method based on template matching.
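
The abstract names a partitioned buffer replay but does not say how the buffer is partitioned. The sketch below is one plausible reading, not the authors' implementation: it assumes transitions are partitioned by the sign of their reward and that each batch draws a fixed fraction from each partition. The class name, partition keys, capacities, and fractions are all hypothetical.

```python
import random
from collections import deque


class PartitionedReplayBuffer:
    """Replay buffer split by reward sign (assumed partitioning scheme)."""

    def __init__(self, capacity_per_partition=10000, seed=None):
        # One bounded FIFO queue per partition; old transitions are evicted.
        self.partitions = {
            key: deque(maxlen=capacity_per_partition)
            for key in ("positive", "negative", "neutral")
        }
        self.rng = random.Random(seed)

    def add(self, transition):
        """Store a (state, action, reward, next_state, done) tuple."""
        reward = transition[2]
        if reward > 0:
            key = "positive"
        elif reward < 0:
            key = "negative"
        else:
            key = "neutral"
        self.partitions[key].append(transition)

    def sample(self, batch_size, fractions=None):
        """Draw a batch with a fixed share from each partition.

        The default fractions are illustrative assumptions. A partition
        holding fewer items than its share contributes all it has.
        """
        fractions = fractions or {"positive": 0.25, "negative": 0.25, "neutral": 0.5}
        batch = []
        for key, fraction in fractions.items():
            pool = list(self.partitions[key])
            count = min(len(pool), int(batch_size * fraction))
            batch.extend(self.rng.sample(pool, count))
        self.rng.shuffle(batch)
        return batch
```

Sampling per partition keeps rare rewarded or penalized transitions in every batch even when neutral transitions dominate the stored experience. That is one way such a buffer could help stabilize learning with sparse rewards.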

On the Difficulty of FSM-based Hardware Obfuscation

IACR Transactions on Cryptographic Hardware and Embedded Systems ◽

10.46586/tches.v2018.i3.293-330 ◽

2018 ◽

pp. 293-330 ◽

Cited By ~ 4

Author(s):

Marc Fyrbiak ◽

Sebastian Wallat ◽

Jonathan Déchelotte ◽

Nils Albartus ◽

Sinan Böcker ◽

...

Keyword(s):

Reverse Engineering ◽

Integrated Circuit ◽

State Of The Art ◽

Security Analysis ◽

Third Party ◽

Level Control ◽

Finite State ◽

Field Programmable ◽

Ip Cores ◽

High Level

In today’s Integrated Circuit (IC) production chains, a designer’s valuable Intellectual Property (IP) is transparent to diverse stakeholders and thus inevitably prone to piracy. To protect against this threat, numerous defenses based on the obfuscation of a circuit’s control path, i.e. Finite State Machine (FSM), have been proposed and are commonly believed to be secure. However, the security of these sequential obfuscation schemes is doubtful since realistic capabilities of reverse engineering and subsequent manipulation are commonly neglected in the security analysis. The contribution of our work is threefold: First, we demonstrate how high-level control path information can be automatically extracted from third-party, gate-level netlists. To this end, we extend state-of-the-art reverse engineering algorithms to deal with Field Programmable Gate Array (FPGA) gate-level netlists equipped with FSM obfuscation. Second, on the basis of realistic reverse engineering capabilities, we carefully review the security of state-of-the-art FSM obfuscation schemes. We reveal several generic strategies that bypass allegedly secure FSM obfuscation schemes, and we practically demonstrate our attacks on several hardware designs, including cryptographic IP cores. Third, we present the design and implementation of Hardware Nanomites, a novel obfuscation scheme based on partial dynamic reconfiguration that generically mitigates existing algorithmic reverse engineering.

Capsule Networks for Object Detection in UAV Imagery

Remote Sensing ◽

10.3390/rs11141694 ◽

2019 ◽

Vol 11 (14) ◽

pp. 1694 ◽

Cited By ~ 2

Author(s):

Mohamed Lamine Mekhalfi ◽

Mesay Belete Bejiga ◽

Davide Soresina ◽

Farid Melgani ◽

Begüm Demir

Keyword(s):

Remote Sensing ◽

Object Detection ◽

Relative Position ◽

State Of The Art ◽

Semantic Content ◽

Computational Time ◽

Complex Object ◽

Aerial Vehicle ◽

High Level ◽

Crowded Scenes

Recent advances in Convolutional Neural Networks (CNNs) have attracted great attention in remote sensing due to their high capability to model the high-level semantic content of Remote Sensing (RS) images. However, CNNs do not explicitly retain the relative position of objects in an image, and thus the effectiveness of the obtained features is limited in complex object detection problems. To address this problem, in this paper we introduce Capsule Networks (CapsNets) for object detection in Unmanned Aerial Vehicle-acquired images. Unlike CNNs, CapsNets extract and exploit information about objects’ relative position across several layers, which enables parsing crowded scenes with overlapping objects. Experimental results obtained on two datasets for car and solar panel detection problems show that CapsNets provide object detection accuracies similar to those of state-of-the-art deep models with significantly reduced computational time. This is because CapsNets emphasize dynamic routing rather than depth.

Learning agile and dynamic motor skills for legged robots

Science Robotics ◽

10.1126/scirobotics.aau5872 ◽

2019 ◽

Vol 4 (26) ◽

pp. eaau5872 ◽

Cited By ~ 93

Author(s):

Jemin Hwangbo ◽

Joonho Lee ◽

Alexey Dosovitskiy ◽

Dario Bellicoso ◽

Vassilios Tsounis ◽

...

Keyword(s):

Reinforcement Learning ◽

Motor Skills ◽

State Of The Art ◽

Control Policy ◽

Cost Effective ◽

Legged Robots ◽

Data Generation ◽

Natural Evolution ◽

Body Velocity ◽

High Level

Legged robots pose one of the greatest challenges in robotics. Dynamic and agile maneuvers of animals cannot be imitated by existing methods that are crafted by humans. A compelling alternative is reinforcement learning, which requires minimal craftsmanship and promotes the natural evolution of a control policy. However, so far, reinforcement learning research for legged robots has been mainly limited to simulation, and only a few comparatively simple examples have been deployed on real systems. The primary reason is that training with real robots, particularly with dynamically balancing systems, is complicated and expensive. In the present work, we introduce a method for training a neural network policy in simulation and transferring it to a state-of-the-art legged system, thereby leveraging fast, automated, and cost-effective data generation schemes. The approach is applied to the ANYmal robot, a sophisticated medium-dog–sized quadrupedal system. Using policies trained in simulation, the quadrupedal machine achieves locomotion skills that go beyond what had been achieved with prior methods: ANYmal is capable of precisely and energy-efficiently following high-level body velocity commands, running faster than before, and recovering from falling even in complex configurations.

A Compressor Fouling Review Based on an Historical Survey of ASME Turbo Expo Papers

Journal of Turbomachinery ◽

10.1115/1.4035070 ◽

2017 ◽

Vol 139 (4) ◽

Cited By ~ 12

Author(s):

Alessio Suman ◽

Mirko Morini ◽

Nicola Aldi ◽

Nicola Casari ◽

Michele Pinelli ◽

...

Keyword(s):

Particle Deposition ◽

State Of The Art ◽

Operational Experience ◽

Technological Evolution ◽

Historical Survey ◽

Air Contaminants ◽

Compressor Inlet ◽

Asme Turbo Expo ◽

High Level ◽

First Time

Fouling has afflicted gas turbine operation since gas turbines were first applied. Filtration systems and washing operations work against air contaminants in order to limit the particles entering the compressor inlet and to remove existing deposits. In this work, a global overview of the operational experience of manufacturers, of filtration systems, and of particle deposition on the compressor is reported. The data reported in this review have been collected from 60 years (1956–2015) of ASME Turbo Expo proceedings. This conference is recognized as the must-attend event for turbomachinery professionals. Through the years, many issues have been resolved by the contributions of this conference. Regarding the compressor fouling phenomenon, the contributions presented at the ASME Turbo Expo mark the high level of development in this field of research, thanks to the simultaneous presence of attendees from manufacturers, government, and academia. The goal of the authors is to describe the technological evolution and the challenges faced by manufacturers and researchers through the years, highlighting the state of the art in the knowledge of fouling and defining the background on which further studies will be based.

Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/373 ◽

2019 ◽

Cited By ~ 10

Author(s):

Arthur Juliani ◽

Ahmed Khalifa ◽

Vincent-Pierre Berges ◽

Jonathan Harper ◽

Ervin Teng ◽

...

Keyword(s):

Video Games ◽

State Of The Art ◽

High Fidelity ◽

Level Control ◽

Board Games ◽

Current State ◽

Vision Control ◽

Rapid Pace ◽

High Level ◽

Planning Problems

The rapid pace of recent research in AI has been driven in part by the presence of fast and challenging simulation environments. These environments often take the form of games, with tasks ranging from simple board games to competitive video games. We propose a new benchmark, Obstacle Tower: a high-fidelity, 3D, third-person, procedurally generated environment. An agent in Obstacle Tower must learn to solve both low-level control and high-level planning problems in tandem while learning from pixels and a sparse reward signal. Unlike other benchmarks such as the Arcade Learning Environment, evaluation of agent performance in Obstacle Tower is based on an agent's ability to perform well on unseen instances of the environment. In this paper we outline the environment and provide a set of baseline results produced by current state-of-the-art Deep RL methods as well as human players. These algorithms fail to produce agents capable of performing near human level.

A Confrontation between Two Doctrines: The Birth of Struggle for Hegemony in Hebrew Children's Literature during the 1930s and 1940s

International Research in Children's Literature ◽

10.3366/ircl.2008.0003 ◽

2008 ◽

Vol 1 (2) ◽

pp. 139-155 ◽

Cited By ~ 2

Author(s):

Yael Darr

Keyword(s):

Children's Literature ◽

Educational System ◽

Single Channel ◽

The Political ◽

Literary Creation ◽

History Of ◽

High Level ◽

First Time ◽

Literary Quality

This article describes a crucial and fundamental stage in the transformation of Hebrew children's literature, during the late 1930s and 1940s, from a single channel of expression to a multi-layered polyphony of models and voices. It claims that for the first time in the history of Hebrew children's literature there took place a doctrinal confrontation between two groups of taste-makers. The article outlines the pedagogical and ideological designs of traditionalist Zionist educators, and suggests how these were challenged by a group of prominent writers of adult poetry, members of the Modernist movement. These writers, it is argued, advocated autonomous literary creation, and insisted on a high level of literary quality. Their intervention not only dramatically changed the repertoire of Hebrew children's literature, but also the rules of literary discourse. The article suggests that, through the Modernists’ polemical efforts, Hebrew children's literature was able to free itself from its position as an apparatus controlled by the political-educational system and to become a dynamic and multi-layered field.

A METHODOLOGY TO SUPPORT COMPANIES IN THE FIRST STEPS TOWARDS DE-MANUFACTURING

Proceedings of the Design Society ◽

10.1017/pds.2021.14 ◽

2021 ◽

Vol 1 ◽

pp. 131-140

Author(s):

Federica Cappelletti ◽

Marta Rossi ◽

Michele Germani ◽

Mohammad Shadman Hanif

Keyword(s):

Environmental Impact ◽

End Of Life ◽

Design Stage ◽

Life Strategies ◽

Disassembly Sequence ◽

The Status ◽

The Cost ◽

Technical Solutions ◽

First Time ◽

Complex Activities

De-manufacturing and re-manufacturing are fundamental technical solutions to efficiently recover value from post-use products. Disassembly is one of the most complex activities in de-manufacturing because i) the more manual it is, the higher its cost; ii) disassembly times are variable due to uncertainty about the condition of products reaching their End of Life (EoL); and iii) it is necessary to know which components to disassemble to balance the cost of disassembly. The paper proposes a methodology with two applications: it can be applied at the design stage to detect room for product design improvements, and it also represents a baseline for organizations approaching de-manufacturing for the first time. The methodology consists of four main steps: first, target components are identified according to their environmental impact; second, their disassembly sequence is qualitatively evaluated; third, it is quantitatively determined via disassembly times, also predicting the status of the components at their End of Life. The aim of the methodology is reached in the fourth step, when alternative, more eco-friendly End of Life strategies are proposed, verified, and chosen.

Efficient End-to-End Sentence-Level Lipreading with Temporal Convolutional Networks

Applied Sciences ◽

10.3390/app11156975 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6975

Author(s):

Tao Zhang ◽

Lun He ◽

Xudong Li ◽

Guoqing Feng

Keyword(s):

Performance Improvement ◽

State Of The Art ◽

Error Rates ◽

Convolutional Network ◽

Convolutional Networks ◽

Sentence Level ◽

End To End ◽

High Level ◽

Improved Accuracy ◽

Talking Face

Lipreading aims to recognize sentences being spoken by a talking face. In recent years, lipreading methods have achieved a high level of accuracy on large datasets and made breakthrough progress. However, lipreading is still far from being solved: existing methods tend to have high error rates on in-the-wild data and suffer from vanishing training gradients and slow convergence. To overcome these problems, we propose an efficient end-to-end sentence-level lipreading model, using an encoder based on a 3D convolutional network, ResNet50, and a Temporal Convolutional Network (TCN), with a CTC objective function as the decoder. More importantly, the proposed architecture incorporates the TCN as a feature learner to decode features. It can partly eliminate the vanishing gradients and insufficient performance of RNNs (LSTM, GRU), which yields a notable performance improvement as well as faster convergence. Experiments show that training and convergence are 50% faster than with the state-of-the-art method, and that accuracy improves by 2.4% on the GRID dataset.
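
The abstract names a CTC objective but does not describe how output sequences are read off at inference time. A common choice, assumed here purely for illustration, is greedy (best-path) CTC decoding: take the highest-scoring symbol per frame, collapse consecutive repeats, then drop the blank symbol. The function name, the toy alphabet, and the position of the blank index are hypothetical and not taken from the paper.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Greedy CTC decoding over per-frame score lists.

    frame_scores: one list of scores per video frame, indexed like alphabet.
    alphabet: symbols by index; alphabet[blank] is the CTC blank (assumed).
    """
    # Best symbol index for each frame.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    decoded = []
    previous = None
    for index in best_path:
        # Keep a symbol only when it differs from the previous frame
        # (collapse repeats) and is not the blank.
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)
```

A blank between two identical symbols is what allows a repeated letter to survive the collapse step, so the per-frame best path a, a, blank, a decodes to "aa" rather than "a".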

State-of-the-Art Char Production with a Focus on Bark Feedstocks: Processes, Design, and Applications

Processes ◽

10.3390/pr9010087 ◽

2021 ◽

Vol 9 (1) ◽

pp. 87

Author(s):

Ali Umut Şen ◽

Helena Pereira

Keyword(s):

State Of The Art ◽

Lignocellulosic Materials ◽

Heating Rates ◽

Production Methods ◽

Slow Pyrolysis ◽

Production Studies ◽

Lignocellulosic Feedstocks ◽

Thermochemical Method ◽

Superior Surface ◽

First Time

In recent years, there has been a surge of interest in char production from lignocellulosic biomass owing to char’s interesting technological properties. Global char production in 2019 reached 53.6 million tons. Barks are among the most important and understudied lignocellulosic feedstocks and have a large potential for exploitation, given that global bark production is estimated to be as high as 400 million cubic meters per year. Chars can be produced from barks; however, in order to obtain the desired char yields and to simulate the pyrolysis process, it is important to understand the differences between barks, woods, and other lignocellulosic materials, in addition to selecting a proper thermochemical method for bark-based char production. In this state-of-the-art review, after analyzing the main char production methods, barks were characterized for their chemical composition and compared with other important lignocellulosic materials. Following these steps, previous bark-based char production studies were analyzed, and different barks and process types were evaluated for the first time to guide future char production process designs based on bark feedstock. The dry and wet pyrolysis and gasification results of barks revealed that the application of different particle sizes, heating rates, and solid residence times resulted in highly variable char yields in the temperature range of 220 °C to 600 °C. Bark-based char production should be primarily performed via a slow pyrolysis route, considering the superior surface properties of slow pyrolysis chars.
