PORF-DDPG: Learning Personalized Autonomous Driving Behavior with Progressively Optimized Reward Function

Sensors ◽  
2020 ◽  
Vol 20 (19) ◽  
pp. 5626
Author(s):  
Jie Chen ◽  
Tao Wu ◽  
Meiping Shi ◽  
Wei Jiang

Autonomous driving with artificial intelligence technology has been viewed as promising for autonomous vehicles hitting the road in the near future. In recent years, considerable progress has been made with Deep Reinforcement Learning (DRL) for realizing end-to-end autonomous driving. Still, driving safely and comfortably in real dynamic scenarios with DRL is nontrivial because the reward functions are typically pre-defined with expertise. This paper proposes a human-in-the-loop DRL algorithm for learning personalized autonomous driving behavior in a progressive manner. Specifically, a progressively optimized reward function (PORF) learning model is built and integrated into the Deep Deterministic Policy Gradient (DDPG) framework, called PORF-DDPG in this paper. PORF consists of two parts: the first is a pre-defined typical reward function on the system state; the second, the main contribution of this paper, is modeled as a Deep Neural Network (DNN) that represents the driving-adjustment intention of the human observer. The DNN-based reward model is progressively learned from front-view images via active human supervision and intervention. The proposed approach is potentially useful for driving in dynamic constrained scenarios where dangerous collision events occur frequently with classic DRL. The experimental results show that the proposed autonomous driving behavior learning method exhibits online learning capability and environmental adaptability.
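To make the reward composition concrete, the following is a minimal PyTorch sketch of a PORF-style reward: a pre-defined state-based term plus a DNN term computed from the front-view image. The network shape and interfaces are assumptions for illustration, not the paper’s actual architecture.

```python
import torch
import torch.nn as nn

class HumanAdjustmentReward(nn.Module):
    """Hypothetical DNN reward term: maps a front-view image to a scalar
    adjustment approximating the human observer's driving-adjustment intention."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, image):                    # image: (B, 3, H, W)
        x = self.features(image).flatten(1)      # (B, 32)
        return self.head(x).squeeze(-1)          # (B,) scalar adjustments

def porf_reward(state_reward: torch.Tensor, adj_model: nn.Module,
                image: torch.Tensor) -> torch.Tensor:
    """PORF-style combined reward: pre-defined state-based term plus the
    progressively learned human-adjustment term."""
    with torch.no_grad():                        # the reward net is not updated here
        adjustment = adj_model(image)
    return state_reward + adjustment
```

In the paper’s scheme, the adjustment network would be progressively updated from the human observer’s supervision and intervention signals while DDPG optimizes the policy against the combined reward.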

2021 ◽  
Author(s):  
Abhishek Gupta

In this thesis, we propose an environment perception framework for autonomous driving using deep reinforcement learning (DRL) that exhibits learning in autonomous vehicles under complex interactions with the environment, without being explicitly trained on driving datasets. Unlike existing techniques, the proposed technique takes the learning loss into account under both deterministic and stochastic policy gradients. We apply DRL to object detection and safe navigation while enhancing a self-driving vehicle’s ability to discern meaningful information from surrounding data. For efficient environmental perception and object detection, various Q-learning based methods have been proposed in the literature. Unlike other works, this thesis proposes a collaborative deterministic and stochastic policy gradient based on DRL. Our technique combines a variational autoencoder (VAE), deep deterministic policy gradient (DDPG), and soft actor-critic (SAC) to adequately train a self-driving vehicle. In this work, we focus on uninterrupted and reasonably safe autonomous driving without colliding with an obstacle or steering off the track. We propose a collaborative framework that utilizes the best features of VAE, DDPG, and SAC and models autonomous driving as a partly stochastic and partly deterministic policy gradient problem in continuous action and state spaces. To ensure that the vehicle traverses the road over a considerable period of time, we employ a reward-penalty scheme in which a higher negative penalty is associated with an unfavourable action and a comparatively lower positive reward is awarded for favourable actions. We also examine the variations in policy loss, value loss, reward function, and cumulative reward for ‘VAE+DDPG’ and ‘VAE+SAC’ over the learning process.
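As an illustration of the asymmetric reward-penalty scheme described above, here is a minimal sketch; the magnitudes are invented for illustration and are not taken from the thesis.

```python
def reward(collision: bool, off_track: bool, progress: float) -> float:
    """Asymmetric reward-penalty scheme with invented magnitudes: a large
    negative penalty for unfavourable actions (collision, leaving the track)
    and a comparatively small positive reward for forward progress."""
    if collision or off_track:
        return -10.0                     # high penalty for unfavourable actions
    return 1.0 * max(progress, 0.0)      # lower reward for favourable progress
```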


Author(s):  
Hao Ji ◽  
Yan Jin

Self-organizing systems (SOS) are developed to perform complex tasks in unforeseen situations with adaptability. Predefining rules for self-organizing agents can be challenging, especially in tasks with high complexity and changing environments. Our previous work introduced a multiagent reinforcement learning (RL) model as a design approach to solving the rule generation problem of SOS, and a deep multiagent RL algorithm was devised to train agents to acquire the task and self-organizing knowledge. However, the simulation was based on one specific task environment; the sensitivity of SOS to reward functions and the systematic evaluation of SOS designed with multiagent RL remained open issues. In this paper, we introduced a rotation reward function to regulate agent behaviors during training and tested different weights of this reward on SOS performance in two case studies: box-pushing and T-shape assembly. Additionally, we proposed three metrics to evaluate the SOS: learning stability, quality of learned knowledge, and scalability. Results show that, depending on the type of task, designers may choose appropriate weights of the rotation reward to obtain the full potential of the agents’ learning capability. Good learning stability and quality of knowledge can be achieved within an optimal range of team sizes, and scaling up to larger team sizes performs better than scaling down.
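The following is a minimal sketch of how a weighted rotation reward could shape the training signal; the functional form and weight are illustrative assumptions rather than the paper’s exact formulation.

```python
def shaped_reward(task_reward: float, heading_change: float, w_rot: float) -> float:
    """Illustrative shaping with a weighted rotation term: the task reward
    (e.g., progress pushing the box toward the goal) is combined with a
    penalty on rotation, scaled by the tunable weight w_rot studied per task."""
    rotation_term = -abs(heading_change)   # discourage excessive turning
    return task_reward + w_rot * rotation_term
```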


Author(s):  
Yalda Rahmati ◽  
Mohammadreza Khajeh Hosseini ◽  
Alireza Talebpour ◽  
Benjamin Swain ◽  
Christopher Nelson

Despite numerous studies on general human–robot interactions, in the context of transportation, automated vehicle (AV)–human driver interaction is not a well-studied subject. These vehicles have fundamentally different decision-making logic compared with human drivers, and driving interactions between AVs and humans can potentially change traffic flow dynamics. Accordingly, through an experimental study, this paper investigates whether there is a difference between human–human and human–AV interactions on the road. The study focuses on car-following behavior; several car-following experiments were conducted utilizing Texas A&M University’s automated Chevy Bolt. Utilizing the NGSIM US-101 dataset, two scenarios for a platoon of three vehicles were considered. In both scenarios, the leader of the platoon follows a series of speed profiles extracted from the NGSIM dataset, while the second vehicle is either another human-driven vehicle (scenario A) or an AV (scenario B). Data were collected from the third vehicle in the platoon to characterize the changes in driving behavior when following an AV. A data-driven and a model-based approach were used to identify possible changes in driving behavior from scenario A to scenario B. The findings suggest there is a statistically significant difference between human drivers’ behavior in these two scenarios and that human drivers felt more comfortable following the AV. Simulation results also revealed the importance of capturing these changes in human behavior in microscopic simulation models of mixed driving environments.


Sensors ◽  
2020 ◽  
Vol 20 (17) ◽  
pp. 4703
Author(s):  
Yookhyun Yoon ◽  
Taeyeon Kim ◽  
Ho Lee ◽  
Jahnghyon Park

For driving safely and comfortably, long-term trajectory prediction of surrounding vehicles is essential for autonomous vehicles. To handle the uncertain nature of trajectory prediction, deep-learning-based approaches have been proposed previously. An on-road vehicle must obey road geometry, i.e., it should run within the constraint of the road shape. Herein, we present a novel road-aware trajectory prediction method that leverages high-definition maps together with a deep learning network. We developed a data-efficient learning framework comprising a trajectory prediction network in the curvilinear coordinate system of the road and a lane assignment for the surrounding vehicles. We then proposed a novel output-constrained sequence-to-sequence trajectory prediction network to incorporate the structural constraints of the road. Our method uses these structural constraints as prior knowledge for the prediction network: they serve not only as an input to the trajectory prediction network but also as part of the constrained loss function of the maneuver recognition network. Accordingly, the proposed method can predict feasible and realistic driver intentions and trajectories. Our method was evaluated using a real traffic dataset, and the results show that it is data-efficient and can predict reasonable trajectories at merging sections.
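As background for the curvilinear formulation, the sketch below (a hypothetical helper, not the authors’ code) projects a Cartesian position onto an HD-map lane centerline to obtain road-aligned coordinates (s, d).

```python
import numpy as np

def to_curvilinear(position, centerline):
    """Hypothetical helper: convert a Cartesian position (x, y) into
    road-aligned curvilinear coordinates (s, d), where s is arc length along
    the lane centerline and d is the signed lateral offset.
    `centerline` is an (N, 2) polyline taken from an HD map."""
    pts = np.asarray(centerline, dtype=float)
    seg = np.diff(pts, axis=0)                           # segment vectors
    seg_len = np.linalg.norm(seg, axis=1)
    cum_s = np.concatenate([[0.0], np.cumsum(seg_len)])  # arc length at vertices
    p = np.asarray(position, dtype=float)

    best = (np.inf, 0.0, 0.0)                            # (distance, s, d)
    for i in range(len(seg)):
        t = np.clip(np.dot(p - pts[i], seg[i]) / (seg_len[i] ** 2 + 1e-12), 0.0, 1.0)
        proj = pts[i] + t * seg[i]                       # closest point on segment
        diff = p - proj
        dist = np.linalg.norm(diff)
        if dist < best[0]:
            side = np.sign(seg[i, 0] * diff[1] - seg[i, 1] * diff[0])  # left (+) / right (-)
            best = (dist, cum_s[i] + t * seg_len[i], side * dist)
    return best[1], best[2]                              # (s, d)
```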


2015 ◽  
Vol 27 (6) ◽  
pp. 660-670 ◽  
Author(s):  
Udara Eshan Manawadu ◽  
Masaaki Ishikawa ◽  
Mitsuhiro Kamezaki ◽  
Shigeki Sugano ◽  
...  

[Figure: Driving simulator]
Intelligent passenger vehicles with autonomous capabilities will be commonplace on our roads in the near future. These vehicles will reshape the existing relationship between the driver and vehicle. Therefore, to create a new type of rewarding relationship, it is important to analyze when drivers prefer autonomous vehicles to manually-driven (conventional) vehicles. This paper documents a driving simulator-based study conducted to identify the preferences and individual driving experiences of novice and experienced drivers of autonomous and conventional vehicles under different traffic and road conditions. We first developed a simplified driving simulator that could connect to different driver-vehicle interfaces (DVI). We then created virtual environments consisting of scenarios and events that drivers encounter in real-world driving, and we implemented fully autonomous driving. We then conducted experiments to clarify how the autonomous driving experience differed for the two groups. The results showed that experienced drivers opt for conventional driving overall, mainly due to the flexibility and driving pleasure it offers, while novices tend to prefer autonomous driving due to its inherent ease and safety. A further analysis indicated that drivers preferred to use both autonomous and conventional driving methods interchangeably, depending on the road and traffic conditions.


Sensors ◽  
2019 ◽  
Vol 19 (15) ◽  
pp. 3318 ◽  
Author(s):  
Carlos Martínez ◽  
Felipe Jiménez

Autonomous driving is undergoing rapid development, and its implementation is expected to bring many benefits. Autonomous cars must deal with tasks at different levels. Although some of these are currently solved, and perception systems provide a fairly accurate and complete description of the environment, high-level decisions remain hard to obtain in challenging scenarios. Moreover, they must comply with safety, reliability and predictability requirements, road user acceptance, and comfort specifications. This paper presents a path planning algorithm based on potential fields. The potential models are adjusted so that their behavior suits the environment and the dynamics of the vehicle and can handle almost any unexpected scenario. The response of the system considers the road characteristics (e.g., maximum speed and lane line curvature) and the presence of obstacles and other users. The algorithm has been tested on an automated vehicle equipped with a GPS receiver, an inertial measurement unit and a computer vision system in real environments with satisfactory results.
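A minimal potential-field sketch follows, using the classic attractive/repulsive formulation with illustrative gains; the paper’s adjusted potential models additionally account for vehicle dynamics and road characteristics.

```python
import numpy as np

def potential_force(pos, goal, obstacles, k_att=1.0, k_rep=100.0, d0=10.0):
    """Classic attractive/repulsive potential field with illustrative gains:
    the negative gradient pulls toward the goal and pushes away from any
    obstacle closer than the influence distance d0."""
    pos, goal = np.asarray(pos, float), np.asarray(goal, float)
    force = k_att * (goal - pos)                  # attractive term
    for obs in obstacles:
        diff = pos - np.asarray(obs, float)
        d = np.linalg.norm(diff)
        if 0.0 < d < d0:
            # repulsion grows rapidly as the obstacle is approached
            force += k_rep * (1.0 / d - 1.0 / d0) / d**2 * (diff / d)
    return force

# One planning step: advance a short distance along the descent direction.
direction = potential_force(pos=[0.0, 0.0], goal=[50.0, 0.0], obstacles=[[20.0, 1.0]])
```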


Author(s):  
Zhenhai Gao ◽  
Xiangtong Yan ◽  
Fei Gao ◽  
Lei He

Decision-making is one of the key parts of research on longitudinal autonomous driving, and considering the behavior of human drivers when designing autonomous driving decision-making strategies is a current research hotspot. In longitudinal autonomous driving decision-making, traditional rule-based strategies are difficult to apply to complex scenarios. Current decision-making methods that use reinforcement learning and deep reinforcement learning construct reward functions around safety, comfort, and economy, but the resulting decision strategies still differ considerably from those of human drivers. Addressing these problems, this paper uses the driver’s behavior data to design the reward function of the deep reinforcement learning algorithm through BP neural network fitting, and uses the deep reinforcement learning DQN and DDPG algorithms to establish two driver-like longitudinal autonomous driving decision-making models. Simulation experiments compare the decision-making effect of the two models with the driver curve. The results show that both algorithms can realize driver-like decision-making and that the DDPG algorithm is more consistent with human driver behavior, and thus performs better, than the DQN algorithm.
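To illustrate fitting a reward function to driver behavior data with a BP (fully connected, backpropagation-trained) network, here is a minimal PyTorch sketch; the input features, target signal, and architecture are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical BP network that regresses a reward signal from recorded driver
# data. The input features (e.g., ego speed, relative speed, gap) and the
# regression target are illustrative assumptions.
reward_net = nn.Sequential(
    nn.Linear(3, 32), nn.Tanh(),
    nn.Linear(32, 32), nn.Tanh(),
    nn.Linear(32, 1),
)
optimizer = torch.optim.Adam(reward_net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def fit_step(states: torch.Tensor, targets: torch.Tensor) -> float:
    """One supervised step; the fitted reward_net(state) would then serve as
    the reward inside the DQN/DDPG training loop."""
    optimizer.zero_grad()
    loss = loss_fn(reward_net(states), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```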


2021 ◽  
Vol 11 (15) ◽  
pp. 6685
Author(s):  
Dongyeon Yu ◽  
Chanho Park ◽  
Hoseung Choi ◽  
Donggyu Kim ◽  
Sung-Ho Hwang

According to SAE J3016, autonomous driving can be divided into six levels, with partially automated driving possible from level three up. A partially or highly automated vehicle can encounter situations involving total system failure; here, we studied a strategy for safe takeover in such situations. A human-in-the-loop simulator, driver-vehicle interface, and driver monitoring system were developed, and takeover experiments were performed using various driving scenarios and realistic autonomous driving situations. The experiments allowed us to draw the following conclusions. The visual–auditory–haptic complex alarm effectively delivered warnings and correlated clearly with users’ subjective preferences. There were scenario types in which the system had to enter minimum risk maneuvers or emergency maneuvers immediately, without requesting a takeover. Lastly, the risk of accidents can be reduced by a driver monitoring system that prevents the driver from becoming completely immersed in non-driving-related tasks. From these results, we proposed a safe takeover strategy that provides meaningful guidance for the development of autonomous vehicles. Considering users’ subjective questionnaire evaluations, the strategy is expected to improve the acceptance and adoption of autonomous vehicles.


Sensors ◽  
2019 ◽  
Vol 19 (21) ◽  
pp. 4711 ◽  
Author(s):  
Kewei Wang ◽  
Fuwu Yan ◽  
Bin Zou ◽  
Luqi Tang ◽  
Quan Yuan ◽  
...  

Deep convolutional neural networks have led the trend in vision-based road detection; however, obtaining the full road area despite occlusion from monocular vision remains challenging due to the dynamic scenes in autonomous driving. Inferring the occluded road area requires a comprehensive understanding of the geometry and the semantics of the visible scene. To this end, we create a small but effective dataset based on the KITTI dataset, named KITTI-OFRS (KITTI occlusion-free road segmentation), and propose a lightweight, efficient, fully convolutional neural network called OFRSNet (occlusion-free road segmentation network) that learns to predict occluded portions of the road in the semantic domain by looking around foreground objects and at the visible road layout. In particular, a global context module is used to build the down-sampling and joint context up-sampling blocks in our network, which improves its performance. Moreover, a spatially-weighted cross-entropy loss is designed to significantly increase the accuracy of this task. Extensive experiments on different datasets verify the effectiveness of the proposed approach, and comparisons with current state-of-the-art methods show that it outperforms the baseline models by achieving a better trade-off between accuracy and runtime, making our approach applicable to autonomous vehicles in real time.
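One plausible form of a spatially-weighted cross-entropy loss is sketched below in PyTorch; the weight map here is an assumption (e.g., emphasizing occluded road pixels), as the paper’s exact weighting scheme is not reproduced.

```python
import torch
import torch.nn.functional as F

def spatially_weighted_ce(logits, target, weight_map):
    """Sketch of a spatially-weighted cross-entropy: a per-pixel weight map
    scales the standard per-pixel loss before averaging.
    logits: (B, C, H, W); target: (B, H, W) long; weight_map: (B, H, W)."""
    per_pixel = F.cross_entropy(logits, target, reduction="none")  # (B, H, W)
    return (weight_map * per_pixel).sum() / weight_map.sum().clamp(min=1e-8)
```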

