Counter a Drone in a Complex Neighborhood Area by Deep Reinforcement Learning

Ender Çetin; Cristina Barrado; Enric Pastor

doi:10.3390/s20082320

Counter a Drone in a Complex Neighborhood Area by Deep Reinforcement Learning

Sensors ◽

10.3390/s20082320 ◽

2020 ◽

Vol 20 (8) ◽

pp. 2320 ◽

Cited By ~ 1

Author(s):

Ender Çetin ◽

Cristina Barrado ◽

Enric Pastor

Keyword(s):

Reinforcement Learning ◽

Transfer Learning ◽

Training Transfer ◽

The State ◽

Neighborhood Environment ◽

Depth Image ◽

Depth Images ◽

Stationary Obstacles ◽

Time Required ◽

Moving Obstacle

Counter-drone technology by using artificial intelligence (AI) is an emerging technology and it is rapidly developing. Considering the recent advances in AI, counter-drone systems with AI can be very accurate and efficient to fight against drones. The time required to engage with the target can be less than other methods based on human intervention, such as bringing down a malicious drone by a machine-gun. Also, AI can identify and classify the target with a high precision in order to prevent a false interdiction with the targeted object. We believe that counter-drone technology with AI will bring important advantages to the threats coming from some drones and will help the skies to become safer and more secure. In this study, a deep reinforcement learning (DRL) architecture is proposed to counter a drone with another drone, the learning drone, which will autonomously avoid all kind of obstacles inside a suburban neighborhood environment. The environment in a simulator that has stationary obstacles such as trees, cables, parked cars, and houses. In addition, another non-malicious third drone, acting as moving obstacle inside the environment was also included. In this way, the learning drone is trained to detect stationary and moving obstacles, and to counter and catch the target drone without crashing with any other obstacle inside the neighborhood. The learning drone has a front camera and it can capture continuously depth images. Every depth image is part of the state used in DRL architecture. There are also scalar state parameters such as velocities, distances to the target, distances to some defined geofences and track, and elevation angles. The state image and scalars are processed by a neural network that joints the two state parts into a unique flow. Moreover, transfer learning is tested by using the weights of the first full-trained model. With transfer learning, one of the best jump-starts achieved higher mean rewards (close to 35 more) at the beginning of training. Transfer learning also shows that the number of crashes during training can be reduced, with a total number of crashed episodes reduced by 65%, when all ground obstacles are included.

Download Full-text

Transfer Reinforcement Learning Using Output-Gated Working Memory

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i02.5488 ◽

2020 ◽

Vol 34 (02) ◽

pp. 1324-1331

Author(s):

Arthur Williams ◽

Joshua Phillips

Keyword(s):

Working Memory ◽

Reinforcement Learning ◽

Transfer Learning ◽

Network Models ◽

Learning Performance ◽

Neural Network Models ◽

Learning Speed ◽

Initial Learning ◽

Time Required ◽

Partially Observable

Transfer learning allows for knowledge to generalize across tasks, resulting in increased learning speed and/or performance. These tasks must have commonalities that allow for knowledge to be transferred. The main goal of transfer learning in the reinforcement learning domain is to train and learn on one or more source tasks in order to learn a target task that exhibits better performance than if transfer was not used (Taylor and Stone 2009). Furthermore, the use of output-gated neural network models of working memory has been shown to increase generalization for supervised learning tasks (Kriete and Noelle 2011; Kriete et al. 2013). We propose that working memory-based generalization plays a significant role in a model's ability to transfer knowledge successfully across tasks. Thus, we extended the Holographic Working Memory Toolkit (HWMtk) (Dubois and Phillips 2017; Phillips and Noelle 2005) to utilize the generalization benefits of output gating within a working memory system. Finally, the model's utility was tested on a temporally extended, partially observable 5x5 2D grid-world maze task that required the agent to learn 3 tasks over the duration of the training period. The results indicate that the addition of output gating increases the initial learning performance of an agent in target tasks and decreases the learning time required to reach a fixed performance threshold.

Download Full-text

Location- and Person-Independent Activity Recognition with WiFi, Deep Neural Networks, and Reinforcement Learning

ACM Transactions on Internet of Things ◽

10.1145/3424739 ◽

2021 ◽

Vol 2 (1) ◽

pp. 1-25

Author(s):

Yongsen Ma ◽

Sheheryar Arshad ◽

Swetha Muniraju ◽

Eric Torkildson ◽

Enrico Rantala ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Reinforcement Learning ◽

Activity Recognition ◽

Deep Neural Networks ◽

State Machine ◽

Recognition Algorithm ◽

The State ◽

Neural Architecture ◽

Learning Agent

In recent years, Channel State Information (CSI) measured by WiFi is widely used for human activity recognition. In this article, we propose a deep learning design for location- and person-independent activity recognition with WiFi. The proposed design consists of three Deep Neural Networks (DNNs): a 2D Convolutional Neural Network (CNN) as the recognition algorithm, a 1D CNN as the state machine, and a reinforcement learning agent for neural architecture search. The recognition algorithm learns location- and person-independent features from different perspectives of CSI data. The state machine learns temporal dependency information from history classification results. The reinforcement learning agent optimizes the neural architecture of the recognition algorithm using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). The proposed design is evaluated in a lab environment with different WiFi device locations, antenna orientations, sitting/standing/walking locations/orientations, and multiple persons. The proposed design has 97% average accuracy when testing devices and persons are not seen during training. The proposed design is also evaluated by two public datasets with accuracy of 80% and 83%. The proposed design needs very little human efforts for ground truth labeling, feature engineering, signal processing, and tuning of learning parameters and hyperparameters.

Download Full-text

RobotP: A Benchmark Dataset for 6D Object Pose Estimation

Sensors ◽

10.3390/s21041299 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1299

Author(s):

Honglin Yuan ◽

Tim Hoogenkamp ◽

Remco C. Veltkamp

Keyword(s):

Pose Estimation ◽

Ground Truth ◽

3D Models ◽

Depth Image ◽

Great Success ◽

Estimation Algorithms ◽

Depth Images ◽

Object Pose Estimation ◽

Image Pairs ◽

Bounding Boxes

Deep learning has achieved great success on robotic vision tasks. However, when compared with other vision-based tasks, it is difficult to collect a representative and sufficiently large training set for six-dimensional (6D) object pose estimation, due to the inherent difficulty of data collection. In this paper, we propose the RobotP dataset consisting of commonly used objects for benchmarking in 6D object pose estimation. To create the dataset, we apply a 3D reconstruction pipeline to produce high-quality depth images, ground truth poses, and 3D models for well-selected objects. Subsequently, based on the generated data, we produce object segmentation masks and two-dimensional (2D) bounding boxes automatically. To further enrich the data, we synthesize a large number of photo-realistic color-and-depth image pairs with ground truth 6D poses. Our dataset is freely distributed to research groups by the Shape Retrieval Challenge benchmark on 6D pose estimation. Based on our benchmark, different learning-based approaches are trained and tested by the unified dataset. The evaluation results indicate that there is considerable room for improvement in 6D object pose estimation, particularly for objects with dark colors, and photo-realistic images are helpful in increasing the performance of pose estimation algorithms.

Download Full-text

Experience Sharing Based Memetic Transfer Learning for Multiagent Reinforcement Learning

Memetic Computing ◽

10.1007/s12293-021-00339-4 ◽

2021 ◽

Author(s):

Tonghao Wang ◽

Xingguang Peng ◽

Yaochu Jin ◽

Demin Xu

Keyword(s):

Reinforcement Learning ◽

Transfer Learning ◽

Multiagent Reinforcement Learning

Download Full-text

HRDepthNet: Depth Image-Based Marker-Less Tracking of Body Joints

Sensors ◽

10.3390/s21041356 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1356

Author(s):

Linda Christin Büker ◽

Finnja Zuber ◽

Andreas Hein ◽

Sebastian Fudickar

Keyword(s):

Color Images ◽

Depth Image ◽

Accuracy Evaluation ◽

Timed Up And Go ◽

Position Errors ◽

Depth Images ◽

Upper And Lower Extremities ◽

Rgb Images ◽

Human Joints ◽

Body Joints

With approaches for the detection of joint positions in color images such as HRNet and OpenPose being available, consideration of corresponding approaches for depth images is limited even though depth images have several advantages over color images like robustness to light variation or color- and texture invariance. Correspondingly, we introduce High- Resolution Depth Net (HRDepthNet)—a machine learning driven approach to detect human joints (body, head, and upper and lower extremities) in purely depth images. HRDepthNet retrains the original HRNet for depth images. Therefore, a dataset is created holding depth (and RGB) images recorded with subjects conducting the timed up and go test—an established geriatric assessment. The images were manually annotated RGB images. The training and evaluation were conducted with this dataset. For accuracy evaluation, detection of body joints was evaluated via COCO’s evaluation metrics and indicated that the resulting depth image-based model achieved better results than the HRNet trained and applied on corresponding RGB images. An additional evaluation of the position errors showed a median deviation of 1.619 cm (x-axis), 2.342 cm (y-axis) and 2.4 cm (z-axis).

Download Full-text

ANEGMA: an automated negotiation model for e-markets

Autonomous Agents and Multi-Agent Systems ◽

10.1007/s10458-021-09513-x ◽

2021 ◽

Vol 35 (2) ◽

Author(s):

Pallavi Bagga ◽

Nicola Paoletti ◽

Bedour Alrayes ◽

Kostas Stathis

Keyword(s):

Reinforcement Learning ◽

Deep Neural Network ◽

Automated Negotiation ◽

Learning To Learn ◽

Negotiation Model ◽

Negotiation Strategies ◽

Exploration Time ◽

Model Free ◽

Bilateral Negotiations ◽

Time Required

AbstractWe present a novel negotiation model that allows an agent to learn how to negotiate during concurrent bilateral negotiations in unknown and dynamic e-markets. The agent uses an actor-critic architecture with model-free reinforcement learning to learn a strategy expressed as a deep neural network. We pre-train the strategy by supervision from synthetic market data, thereby decreasing the exploration time required for learning during negotiation. As a result, we can build automated agents for concurrent negotiations that can adapt to different e-market settings without the need to be pre-programmed. Our experimental evaluation shows that our deep reinforcement learning based agents outperform two existing well-known negotiation strategies in one-to-many concurrent bilateral negotiations for a range of e-market settings.

Download Full-text

RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor

Mathematics ◽

10.3390/math9212815 ◽

2021 ◽

Vol 9 (21) ◽

pp. 2815

Author(s):

Shih-Hung Yang ◽

Yao-Mao Cheng ◽

Jyun-We Huang ◽

Yon-Ping Chen

Keyword(s):

Receptive Field ◽

American Sign Language ◽

Multiple Scales ◽

Receptive Fields ◽

Depth Image ◽

Small Scale ◽

Feature Maps ◽

Depth Sensor ◽

Depth Images ◽

Effective Transfer

Automatic fingerspelling recognition tackles the communication barrier between deaf and hearing individuals. However, the accuracy of fingerspelling recognition is reduced by high intra-class variability and low inter-class variability. In the existing methods, regular convolutional kernels, which have limited receptive fields (RFs) and often cannot detect subtle discriminative details, are applied to learn features. In this study, we propose a receptive field-aware network with finger attention (RFaNet) that highlights the finger regions and builds inter-finger relations. To highlight the discriminative details of these fingers, RFaNet reweights the low-level features of the hand depth image with those of the non-forearm image and improves finger localization, even when the wrist is occluded. RFaNet captures neighboring and inter-region dependencies between fingers in high-level features. An atrous convolution procedure enlarges the RFs at multiple scales and a non-local operation computes the interactions between multi-scale feature maps, thereby facilitating the building of inter-finger relations. Thus, the representation of a sign is invariant to viewpoint changes, which are primarily responsible for intra-class variability. On an American Sign Language fingerspelling dataset, RFaNet achieved 1.77% higher classification accuracy than state-of-the-art methods. RFaNet achieved effective transfer learning when the number of labeled depth images was insufficient. The fingerspelling representation of a depth image can be effectively transferred from large- to small-scale datasets via highlighting the finger regions and building inter-finger relations, thereby reducing the requirement for expensive fingerspelling annotations.

Download Full-text

Methods and Algorithms for Knowledge Reuse in Multiagent Reinforcement Learning

10.5753/ctd.2020.11360 ◽

2020 ◽

Author(s):

Felipe Leno Da Silva ◽

Anna Helena Reali Costa

Keyword(s):

Reinforcement Learning ◽

Transfer Learning ◽

Learning Process ◽

Trial And Error ◽

Knowledge Reuse ◽

Previous Knowledge ◽

Learning Methods ◽

Types Of Knowledge ◽

Learning Agent ◽

Multiagent Reinforcement Learning

Reinforcement Learning (RL) is a powerful tool that has been used to solve increasingly complex tasks. RL operates through repeated interactions of the learning agent with the environment, via trial and error. However, this learning process is extremely slow, requiring many interactions. In this thesis, we leverage previous knowledge so as to accelerate learning in multiagent RL problems. We propose knowledge reuse both from previous tasks and from other agents. Several flexible methods are introduced so that each of these two types of knowledge reuse is possible. This thesis adds important steps towards more flexible and broadly applicable multiagent transfer learning methods.

Download Full-text

Applying a Deep Q Network for OpenAIs Car Racing Game

10.14293/s2199-1006.1.sor-.ppd7fvs.v1 ◽

2020 ◽

Author(s):

Ali Fakhry

Keyword(s):

Machine Learning ◽

Reinforcement Learning ◽

Transfer Learning ◽

State Of The Art ◽

Learning Techniques ◽

Car Racing ◽

Custom Made ◽

Learning Technique ◽

Reward Threshold

The applications of Deep Q-Networks are seen throughout the field of reinforcement learning, a large subsect of machine learning. Using a classic environment from OpenAI, CarRacing-v0, a 2D car racing environment, alongside a custom based modification of the environment, a DQN, Deep Q-Network, was created to solve both the classic and custom environments. The environments are tested using custom made CNN architectures and applying transfer learning from Resnet18. While DQNs were state of the art years ago, using it for CarRacing-v0 appears somewhat unappealing and not as effective as other reinforcement learning techniques. Overall, while the model did train and the agent learned various parts of the environment, attempting to reach the reward threshold for the environment with this reinforcement learning technique seems problematic and difficult as other techniques would be more useful.

Download Full-text

DGGAN: Depth-image Guided Generative Adversarial Networks for Disentangling RGB and Depth Images in 3D Hand Pose Estimation

2020 IEEE Winter Conference on Applications of Computer Vision (WACV) ◽

10.1109/wacv45572.2020.9093380 ◽

2020 ◽

Author(s):

Liangjian Chen ◽

Shih-Yao Lin ◽

Yusheng Xie ◽

Yen-Yu Lin ◽

Wei Fan ◽

...

Keyword(s):

Pose Estimation ◽

Depth Image ◽

Generative Adversarial Networks ◽

Hand Pose Estimation ◽

Image Guided ◽

Depth Images ◽

Adversarial Networks ◽

Hand Pose

Download Full-text