Learning for a Robot: Deep Reinforcement Learning, Imitation Learning, Transfer Learning

Deep Reinforcement Learning for Soft Robotic Applications: Brief Overview with Impending Challenges

10.20944/preprints201811.0510.v1, 2018

Author(s): Sarthak Bhagat, Hritwick Banerjee, Hongliang Ren

Keyword(s): Reinforcement Learning, Degrees Of Freedom, State Of The Art, Imitation Learning, Robotic Systems, Future Research, Learning Techniques, Increasing Trend, Embodied Intelligence, Robotic Applications

The growing interest in the innate softness of robotic structures, combined with extensive developments in embodied intelligence, has given rise to a relatively new yet highly rewarding sphere of technology. Fusing current deep reinforcement learning algorithms with the physical advantages of a soft, bio-inspired structure points toward designing fully self-sufficient agents that can learn from observations collected from their environment to achieve an assigned task. For soft robotic structures possessing countless degrees of freedom, it is often difficult (and sometimes impossible) to formulate the mathematical constraints necessary for training a deep reinforcement learning (DRL) agent for the task at hand. Hence, we resort to imitation learning techniques, since tasks such as manipulation are easy to perform manually and can then be mimicked by the agent. Deploying current imitation learning algorithms on soft robotic systems has been observed to provide satisfactory results, but challenges remain. This review article thus presents an overview of various such algorithms, along with instances of their application to real-world scenarios with state-of-the-art results, followed by brief descriptions of emerging branches of DRL research that may become centers of future research in this field.

Applying a Deep Q Network for OpenAI's Car Racing Game

10.14293/s2199-1006.1.sor-.ppd7fvs.v1, 2020

Author(s): Ali Fakhry

Keyword(s): Machine Learning, Reinforcement Learning, Transfer Learning, State Of The Art, Learning Techniques, Car Racing, Custom Made, Learning Technique, Reward Threshold

Deep Q-Networks (DQNs) are applied throughout reinforcement learning, a large subfield of machine learning. Using CarRacing-v0, a classic 2D car racing environment from OpenAI, alongside a custom modification of that environment, a DQN was created to solve both the classic and custom environments. The environments are tested using custom-made CNN architectures and transfer learning from ResNet18. While DQNs were state of the art years ago, using them for CarRacing-v0 appears somewhat unappealing and less effective than other reinforcement learning techniques. Overall, while the model did train and the agent learned various parts of the environment, reaching the reward threshold for the environment with this technique seems problematic and difficult, and other techniques would likely be more useful.
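For readers unfamiliar with how a DQN is trained, the following is a minimal sketch of the standard one-step Q-learning target that DQNs regress toward. It is not the author's implementation: the function names, the discount factor `gamma`, and the toy numbers are all hypothetical, and the network, replay buffer, and CarRacing-v0 environment are left out entirely.

```python
from typing import Sequence


def td_target(reward: float,
              next_q_values: Sequence[float],
              done: bool,
              gamma: float = 0.99) -> float:
    """One-step Q-learning target: r + gamma * max_a' Q_target(s', a').

    next_q_values holds the target network's Q-value estimates for every
    action in the next state. gamma is an assumed discount factor.
    """
    if done:
        # No future return after a terminal transition.
        return reward
    return reward + gamma * max(next_q_values)


def squared_td_error(q_value: float, target: float) -> float:
    """Squared difference between the current estimate and the target."""
    return (target - q_value) ** 2


# Toy transition with two actions available in the next state (made-up values).
target = td_target(reward=1.0, next_q_values=[0.5, 2.0], done=False, gamma=0.9)
loss = squared_td_error(q_value=2.0, target=target)
```

In a full agent the squared error would be averaged over a minibatch of replayed transitions and minimized by gradient descent on the Q-network's weights.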

Toward a Computational Model of Transfer

AI Magazine, 10.1609/aimag.v32i2.2337, 2011, Vol 32 (2), pp. 126, Cited By ~ 1

Author(s): Daniel Oblinger

Keyword(s): Computational Model, Transfer Learning, Future Research, Learning Program, Research Challenges, Program Research

This article focuses on a broad framing of the DARPA Transfer Learning Program research and an assessment of its progress, limitations, and challenges, from an admittedly personal but DARPA-influenced perspective. I will focus on a broad framing of TL that will allow us to talk about this body of work as a whole, and use this to look towards work yet to be done in this area. I will consider both indicated application areas for transfer learning and indicated future research challenges. With each of these I will also venture an assessment of the "ripeness" of each subarea for follow-on work; of course, this assessment will be a very personal estimation based on the effort and progress made during the TL program.

Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning

Proceedings of the AAAI Conference on Artificial Intelligence, 10.1609/aaai.v33i01.33016722, 2019, Vol 33, pp. 6722-6729, Cited By ~ 4

Author(s): Ziming Li, Julia Kiseleva, Maarten De Rijke

Keyword(s): Reinforcement Learning, State Of The Art, The State, Experimental Results, Imitation Learning, Local Optimum, Inverse Reinforcement Learning, High Quality, Overall Performance

The performance of adversarial dialogue generation models relies on the quality of the reward signal produced by the discriminator. The reward signal from a poor discriminator can be very sparse and unstable, which may lead the generator to fall into a local optimum or to produce nonsense replies. To alleviate the first problem, we first extend a recently proposed adversarial dialogue generation method to an adversarial imitation learning solution. Then, in the framework of adversarial inverse reinforcement learning, we propose a new reward model for dialogue generation that can provide a more accurate and precise reward signal for generator training. We evaluate the performance of the resulting model with automatic metrics and human evaluations in two annotation settings. Our experimental results demonstrate that our model can generate more high-quality responses and achieve higher overall performance than the state-of-the-art.

MasakhaNER: Named Entity Recognition for African Languages

Transactions of the Association for Computational Linguistics, 10.1162/tacl_a_00416, 2021, Vol 9, pp. 1116-1131

Author(s): David Ifeoluwa Adelani, Jade Abbott, Graham Neubig, Daniel D’souza, Julia Kreutzer, ...

Keyword(s): Transfer Learning, State Of The Art, Empirical Evaluation, Named Entity Recognition, Entity Recognition, Future Research, African Continent, High Quality, African Languages, Named Entity

We take a step towards addressing the under-representation of the African continent in NLP research by bringing together different stakeholders to create the first large, publicly available, high-quality dataset for named entity recognition (NER) in ten African languages. We detail the characteristics of these languages to help researchers and practitioners better understand the challenges they pose for NER tasks. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. Finally, we release the data, code, and models to inspire future research on African NLP.

Unsupervised Learning of KB Queries in Task-Oriented Dialogs

Transactions of the Association for Computational Linguistics, 10.1162/tacl_a_00372, 2021, Vol 9, pp. 374-390

Author(s): Dinesh Raghu, Nikhil Gupta, Mausam

Keyword(s): Reinforcement Learning, Knowledge Base, State Of The Art, The Novel, Generate System, User Intent, Research Challenges, Policy Optimization, Task Oriented, And Training

Task-oriented dialog (TOD) systems often need to formulate knowledge base (KB) queries corresponding to the user intent and use the query results to generate system responses. Existing approaches require dialog datasets to explicitly annotate these KB queries; these annotations can be time-consuming and expensive. In response, we define the novel problems of predicting the KB query and training the dialog agent without explicit KB query annotation. For query prediction, we propose a reinforcement learning (RL) baseline, which rewards the generation of those queries whose KB results cover the entities mentioned in subsequent dialog. Further analysis reveals that correlation among query attributes in the KB can significantly confuse memory augmented policy optimization (MAPO), an existing state-of-the-art RL agent. To address this, we improve the MAPO baseline with simple but important modifications suited to our task. To train the full TOD system for our setting, we propose a pipelined approach: it independently predicts when to make a KB query (query position predictor), then predicts a KB query at the predicted position (query predictor), and uses the results of the predicted query in subsequent dialog (next response predictor). Overall, our work proposes the first solutions to our novel problem, and our analysis highlights the research challenges in training TOD systems without query annotation.
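The reward idea in this abstract (favoring queries whose KB results cover the entities mentioned later in the dialog) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the KB layout, the helper names `run_query` and `coverage_reward`, and the exact reward formula (a simple coverage fraction) are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set

Row = Dict[str, str]


def run_query(kb: List[Row], query: Dict[str, str]) -> List[Row]:
    """Return the KB rows whose attributes match every constraint in the query."""
    return [row for row in kb
            if all(row.get(attr) == value for attr, value in query.items())]


def coverage_reward(results: List[Row], later_entities: Set[str]) -> float:
    """Fraction of later-mentioned entities that appear in the query results.

    This coverage fraction is an assumed stand-in for the paper's reward.
    """
    if not later_entities:
        return 0.0
    seen = {value for row in results for value in row.values()}
    return len(later_entities & seen) / len(later_entities)


# Hypothetical two-row restaurant KB.
kb = [
    {"name": "Pizza Hut", "food": "italian", "area": "north"},
    {"name": "Golden Wok", "food": "chinese", "area": "north"},
]
# Entities assumed to be mentioned in the dialog after the query position.
later_entities = {"Pizza Hut", "north"}

good = coverage_reward(run_query(kb, {"food": "italian"}), later_entities)
poor = coverage_reward(run_query(kb, {"food": "chinese"}), later_entities)
```

In this toy version an empty query returns every row and therefore covers every entity, so a usable reward would also have to discourage over-general queries; how the paper handles that is not stated in the abstract.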

Two-Phase Flow-Induced Vibrations: State of the Art and Future Research Challenges

The Proceedings of the Dynamics & Design Conference, 10.1299/jsmedmc.2012.a1, 2012, Vol 2012 (0), pp. A1-A11

Author(s): Njuki MUREITHI

Keyword(s): Two Phase Flow, State Of The Art, Future Research, Phase Flow, Two Phase, Research Challenges, Flow Induced Vibrations

Personalization and Context-awareness in Social Local Search: State-of-the-art and Future Research Challenges

Pervasive and Mobile Computing, 10.1016/j.pmcj.2016.04.004, 2017, Vol 38, pp. 446-473, Cited By ~ 14

Author(s): Fabio Gasparetti

Keyword(s): Local Search, Context Awareness, State Of The Art, Future Research, Research Challenges

Hybrid solar cells of conjugated polymers metal-oxide nanocrystals blends; state of the art and future research challenges in Indonesia

10.1063/1.4820274, 2013, Cited By ~ 1

Author(s): Ayi Bahtiar

Keyword(s): Solar Cells, Metal Oxide, Conjugated Polymers, State Of The Art, Future Research, Hybrid Solar Cells, Research Challenges
