Transfer Reinforcement Learning for Autonomous Driving

Aravind Balakrishnan; Jaeyoung Lee; Ashish Gaurav; Krzysztof Czarnecki; Sean Sedwards

doi:10.1145/3449356

Transfer Reinforcement Learning for Autonomous Driving

ACM Transactions on Modeling and Computer Simulation ◽

10.1145/3449356 ◽

2021 ◽

Vol 31 (3) ◽

pp. 1-26

Author(s):

Aravind Balakrishnan ◽

Jaeyoung Lee ◽

Ashish Gaurav ◽

Krzysztof Czarnecki ◽

Sean Sedwards

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Transfer Problem ◽

Autonomous Driving ◽

High Fidelity ◽

Rule Based ◽

High Level ◽

Real Vehicle

Reinforcement learning (RL) is an attractive way to implement high-level decision-making policies for autonomous driving, but learning directly from a real vehicle or a high-fidelity simulator is variously infeasible. We therefore consider the problem of transfer reinforcement learning and study how a policy learned in a simple environment using WiseMove can be transferred to our high-fidelity simulator, W ise M ove . WiseMove is a framework to study safety and other aspects of RL for autonomous driving. W ise M ove accurately reproduces the dynamics and software stack of our real vehicle. We find that the accurately modelled perception errors in W ise M ove contribute the most to the transfer problem. These errors, when even naively modelled in WiseMove , provide an RL policy that performs better in W ise M ove than a hand-crafted rule-based policy. Applying domain randomization to the environment in WiseMove yields an even better policy. The final RL policy reduces the failures due to perception errors from 10% to 2.75%. We also observe that the RL policy has significantly less reliance on velocity compared to the rule-based policy, having learned that its measurement is unreliable.

Download Full-text

Human-Machine Cooperative Trajectory Planning for Semi-Autonomous Driving Based on the Understanding of Behavioral Semantics

Electronics ◽

10.3390/electronics10080946 ◽

2021 ◽

Vol 10 (8) ◽

pp. 946

Author(s):

Bohan Jiang ◽

Xiaohui Li ◽

Yujun Zeng ◽

Daxue Liu

Keyword(s):

Decision Making ◽

Trajectory Planning ◽

Autonomous Driving ◽

Planning Problem ◽

Decision Level ◽

Planning Approach ◽

Decision Making Problem ◽

Generation Level ◽

High Level ◽

Real Vehicle

This paper presents a novel cooperative trajectory planning approach for semi-autonomous driving. The machine interacts with the driver at the decision level and the trajectory generation level. To minimize conflicts between the machine and the human, the trajectory planning problem is decomposed into a high-level behavior decision-making problem and a low-level trajectory planning problem. The approach infers the driver’s behavioral semantics according to the driving context and the driver’s input. The trajectories are generated based on the behavioral semantics and driver’s input. The feasibility of the proposed approach is validated by real vehicle experiments. The results prove that the proposed human–machine cooperative trajectory planning approach can successfully help the driver to avoid collisions while respecting the driver’s behavior.

Download Full-text

Combining reinforcement learning with rule-based controllers for transparent and general decision-making in autonomous driving

Robotics and Autonomous Systems ◽

10.1016/j.robot.2020.103568 ◽

2020 ◽

Vol 131 ◽

pp. 103568

Author(s):

Amarildo Likmeta ◽

Alberto Maria Metelli ◽

Andrea Tirinzoni ◽

Riccardo Giol ◽

Marcello Restelli ◽

...

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Autonomous Driving ◽

Rule Based

Download Full-text

Tactical Decision-Making in Autonomous Driving by Reinforcement Learning with Uncertainty Estimation

2020 IEEE Intelligent Vehicles Symposium (IV) ◽

10.1109/iv47402.2020.9304614 ◽

2020 ◽

Author(s):

Carl-Johan Hoel ◽

Krister Wolff ◽

Leo Laine

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Autonomous Driving ◽

Uncertainty Estimation ◽

Tactical Decision

Download Full-text

ACRIPPER: A New Associative Classification Based on RIPPER Algorithm

Journal of Information & Knowledge Management ◽

10.1142/s0219649221500131 ◽

2021 ◽

Vol 20 (01) ◽

pp. 2150013

Author(s):

Mohammed Abu-Arqoub ◽

Wael Hadi ◽

Abdelraouf Ishtaiwi

Keyword(s):

Decision Making ◽

Class Imbalance ◽

The Other ◽

Associative Classification ◽

Rule Based ◽

New Approach ◽

New Novel ◽

Accuracy Measure ◽

High Level ◽

Substantial Interest

Associative Classification (AC) classifiers are of substantial interest due to their ability to be utilised for mining vast sets of rules. However, researchers over the decades have shown that a large number of these mined rules are trivial, irrelevant, redundant, and sometimes harmful, as they can cause decision-making bias. Accordingly, in our paper, we address these challenges and propose a new novel AC approach based on the RIPPER algorithm, which we refer to as ACRIPPER. Our new approach combines the strength of the RIPPER algorithm with the classical AC method, in order to achieve: (1) a reduction in the number of rules being mined, especially those rules that are largely insignificant; (2) a high level of integration among the confidence and support of the rules on one hand and the class imbalance level in the prediction phase on the other hand. Our experimental results, using 20 different well-known datasets, reveal that the proposed ACRIPPER significantly outperforms the well-known rule-based algorithms RIPPER and J48. Moreover, ACRIPPER significantly outperforms the current AC-based algorithms CBA, CMAR, ECBA, FACA, and ACPRISM. Finally, ACRIPPER is found to achieve the best average and ranking on the accuracy measure.

Download Full-text

Deep Reinforcement Learning Based High-level Driving Behavior Decision-making Model in Heterogeneous Traffic

2019 Chinese Control Conference (CCC) ◽

10.23919/chicc.2019.8866005 ◽

2019 ◽

Author(s):

Zhengwei Bai ◽

Wei Shangguan ◽

Baigen Cai ◽

Linguo Chai

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Driving Behavior ◽

Heterogeneous Traffic ◽

High Level ◽

Decision Making Model

Download Full-text

Strategic Tasks for Explainable Reinforcement Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.330110007 ◽

2019 ◽

Vol 33 ◽

pp. 10007-10008 ◽

Cited By ~ 1

Author(s):

Rey Pocius ◽

Lawrence Neal ◽

Alan Fern

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Learning Environment ◽

Strategic Decision ◽

Strategic Decision Making ◽

Sequential Decision Making ◽

Sequential Decision ◽

Level Control ◽

Mini Games ◽

High Level

Commonly used sequential decision making tasks such as the games in the Arcade Learning Environment (ALE) provide rich observation spaces suitable for deep reinforcement learning. However, they consist mostly of low-level control tasks which are of limited use for the development of explainable artificial intelligence(XAI) due to the fine temporal resolution of the tasks. Many of these domains also lack built-in high level abstractions and symbols. Existing tasks that provide for both strategic decision-making and rich observation spaces are either difficult to simulate or are intractable. We provide a set of new strategic decision-making tasks specialized for the development and evaluation of explainable AI methods, built as constrained mini-games within the StarCraft II Learning Environment.

Download Full-text

Logic-Based Sequential Decision-Making

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33019995 ◽

2019 ◽

Vol 33 ◽

pp. 9995-9996

Author(s):

Daoming Lyu ◽

Fangkai Yang ◽

Bo Liu ◽

Daesub Yoon

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

High Dimensional ◽

Great Success ◽

Sequential Decision ◽

Sensory Inputs ◽

Hierarchical Decision ◽

High Level ◽

Data Efficiency ◽

Symbolic Planning

Deep reinforcement learning (DRL) has gained great success by learning directly from high-dimensional sensory inputs, yet is notorious for the lack of interpretability. Interpretability of the subtasks is critical in hierarchical decision-making as it increases the transparency of black-box-style DRL approach and helps the RL practitioners to understand the high-level behavior of the system better. In this paper, we introduce symbolic planning into DRL and propose a framework of Symbolic Deep Reinforcement Learning (SDRL) that can handle both high-dimensional sensory inputs and symbolic planning. The task-level interpretability is enabled by relating symbolic actions to options. This framework features a planner – controller – meta-controller architecture, which takes charge of subtask scheduling, data-driven subtask learning, and subtask evaluation, respectively. The three components cross-fertilize each other and eventually converge to an optimal symbolic plan along with the learned subtasks, bringing together the advantages of long-term planning capability with symbolic knowledge and end-to-end reinforcement learning directly from a high-dimensional sensory input. Experimental results validate the interpretability of subtasks, along with improved data efficiency compared with state-of-the-art approaches.

Download Full-text

A Reinforcement Learning Approach with Spline-Fit Object Tracking for AIBO Robot’s High Level Decision Making

Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing 2011 - Studies in Computational Intelligence ◽

10.1007/978-3-642-22288-7_14 ◽

2011 ◽

pp. 169-183

Author(s):

Subhasis Mukherjee ◽

Shamsul Huda ◽

John Yearwood

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Object Tracking ◽

Learning Approach ◽

High Level

Download Full-text

Driver-like decision-making method for vehicle longitudinal autonomous driving based on deep reinforcement learning

Proceedings of the Institution of Mechanical Engineers Part D Journal of Automobile Engineering ◽

10.1177/09544070211063081 ◽

2021 ◽

pp. 095440702110630

Author(s):

Zhenhai Gao ◽

Xiangtong Yan ◽

Fei Gao ◽

Lei He

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Learning Algorithm ◽

Autonomous Driving ◽

Decision Strategies ◽

Reward Function ◽

Human Driver ◽

Reward Functions ◽

A Current ◽

Better Than

Decision-making is one of the key parts of the research on vehicle longitudinal autonomous driving. Considering the behavior of human drivers when designing autonomous driving decision-making strategies is a current research hotspot. In longitudinal autonomous driving decision-making strategies, traditional rule-based decision-making strategies are difficult to apply to complex scenarios. Current decision-making methods that use reinforcement learning and deep reinforcement learning construct reward functions designed with safety, comfort, and economy. Compared with human drivers, the obtained decision strategies still have big gaps. Focusing on the above problems, this paper uses the driver’s behavior data to design the reward function of the deep reinforcement learning algorithm through BP neural network fitting, and uses the deep reinforcement learning DQN algorithm and the DDPG algorithm to establish two driver-like longitudinal autonomous driving decision-making models. The simulation experiment compares the decision-making effect of the two models with the driver curve. The results shows that the two algorithms can realize driver-like decision-making, and the consistency of the DDPG algorithm and human driver behavior is higher than that of the DQN algorithm, the effect of the DDPG algorithm is better than the DQN algorithm.

Download Full-text

Actor–critic-based decision-making method for the artificial intelligence commander in tactical wargames

The Journal of Defense Modeling and Simulation Applications Methodology Technology ◽

10.1177/1548512920954542 ◽

2020 ◽

pp. 154851292095454

Author(s):

Junfeng Zhang ◽

Qing Xue

Keyword(s):

Neural Network ◽

Artificial Intelligence ◽

Decision Making ◽

Reinforcement Learning ◽

Convolutional Neural Network ◽

Difficult Problem ◽

Learning Method ◽

Rule Based ◽

Autonomous Decision ◽

Decision Making Problem

In a tactical wargame, the decisions of the artificial intelligence (AI) commander are critical to the final combat result. Due to the existence of fog-of-war, AI commanders are faced with unknown and invisible information on the battlefield and lack of understanding of the situation, and it is difficult to make appropriate tactical strategies. The traditional knowledge rule-based decision-making method lacks flexibility and autonomy. How to make flexible and autonomous decision-making when facing complex battlefield situations is a difficult problem. This paper aims to solve the decision-making problem of the AI commander by using the deep reinforcement learning (DRL) method. We develop a tactical wargame as the research environment, which contains built-in script AI and supports the machine–machine combat mode. On this basis, an end-to-end actor–critic framework for commander decision making based on the convolutional neural network is designed to represent the battlefield situation and the reinforcement learning method is used to try different tactical strategies. Finally, we carry out a combat experiment between a DRL-based agent and a rule-based agent in a jungle terrain scenario. The result shows that the AI commander who adopts the actor–critic method successfully learns how to get a higher score in the tactical wargame, and the DRL-based agent has a higher winning ratio than the rule-based agent.

Download Full-text