Confidence-Aware Object Detection Based on MobileNetv2 for Autonomous Driving

Wei Li; Kai Liu

doi:10.3390/s21072380

Confidence-Aware Object Detection Based on MobileNetv2 for Autonomous Driving

Sensors ◽

10.3390/s21072380 ◽

2021 ◽

Vol 21 (7) ◽

pp. 2380

Author(s):

Wei Li ◽

Kai Liu

Keyword(s):

Object Detection ◽

Loss Function ◽

Autonomous Vehicles ◽

Autonomous Driving ◽

Average Precision ◽

Detection Model ◽

Multi Scale ◽

Proposed Model ◽

The Mean ◽

High Level

Object detection is an indispensable part of autonomous driving and the basis of other high-level applications. For example, autonomous vehicles need to use the object detection results to navigate and avoid obstacles. In this paper, we propose a multi-scale MobileNeck module and an algorithm to improve the performance of an object detection model by outputting a series of Gaussian parameters. These Gaussian parameters can be used to predict both the locations of detected objects and the localization confidences. Based on the above two methods, a new confidence-aware Mobile Detection (MobileDet) model is proposed. The MobileNeck module and loss function are easy to implement and integrate with Generalized-IoU (GIoU) metrics with only slight changes to the code. We test the proposed model on the KITTI and VOC datasets. The mean Average Precision (mAP) improves by 3.8 on the KITTI dataset and 2.9 on the VOC dataset, with lower resource consumption.
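
To make the idea of Gaussian outputs for localization more concrete, the following is a minimal sketch of one common formulation: each box coordinate is predicted as a mean and a standard deviation, and training minimizes the Gaussian negative log-likelihood of the ground-truth coordinate. The abstract does not give the exact loss, so the function names and the confidence score (one minus the mean predicted standard deviation) are assumptions for illustration, not the authors' implementation.

```python
import math


def gaussian_nll(target, mean, sigma, eps=1e-9):
    """Negative log-likelihood of `target` under N(mean, sigma^2)."""
    var = sigma * sigma + eps  # eps guards against log(0) and division by zero
    return 0.5 * math.log(2.0 * math.pi * var) + (target - mean) ** 2 / (2.0 * var)


def box_nll(target_box, means, sigmas):
    """Sum the per-coordinate NLL over the four box coordinates."""
    return sum(gaussian_nll(t, m, s) for t, m, s in zip(target_box, means, sigmas))


def localization_confidence(sigmas):
    """Hypothetical score: 1 minus the mean predicted sigma, clamped to [0, 1].

    Assumes sigmas are already squashed into [0, 1], e.g. by a sigmoid.
    """
    mean_sigma = sum(sigmas) / len(sigmas)
    return max(0.0, min(1.0, 1.0 - mean_sigma))
```

Under this formulation a small predicted standard deviation is rewarded only when the mean is close to the target, which is what lets the standard deviation act as a localization confidence.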

Collaborative Autonomous Driving—A Survey of Solution Approaches and Future Challenges

Sensors ◽

10.3390/s21113783 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3783

Author(s):

Sumbal Malik ◽

Manzoor Ahmed Khan ◽

Hesham El-Sayed

Keyword(s):

Autonomous Vehicles ◽

Communication Technologies ◽

Autonomous Driving ◽

Use Cases ◽

Cooperative Driving ◽

Vehicle To Vehicle ◽

Future Challenges ◽

Vehicle To Infrastructure ◽

High Level ◽

Intersection Management

Sooner than expected, roads will be populated with a plethora of connected and autonomous vehicles serving diverse mobility needs. Rather than being stand-alone, vehicles will be required to cooperate and coordinate with each other to execute mobility tasks properly, an approach referred to as cooperative driving. Cooperative driving leverages Vehicle to Vehicle (V2V) and Vehicle to Infrastructure (V2I) communication technologies to carry out cooperative functionalities: (i) cooperative sensing and (ii) cooperative maneuvering. To better equip readers with background knowledge on the topic, we first provide a detailed taxonomy section describing the underlying concepts and various aspects of cooperation in cooperative driving. In this survey, we review the current solution approaches in cooperation for autonomous vehicles, based on various cooperative driving applications, i.e., smart car parking, lane change and merge, intersection management, and platooning. The role and functionality of such cooperation become more crucial in platooning use-cases, which is why we also provide more detail on platooning use-cases and focus on one of the challenges, electing a leader in high-level platooning. We then highlight a crucial range of research gaps and open challenges that need to be addressed before cooperative autonomous vehicles hit the roads. We believe that this survey will assist researchers in better understanding vehicular cooperation, its various scenarios, solution approaches, and challenges.

Automatic Roadway Features Detection with Oriented Object Detection

Applied Sciences ◽

10.3390/app11083531 ◽

2021 ◽

Vol 11 (8) ◽

pp. 3531

Author(s):

Hesham M. Eraqi ◽

Karim Soliman ◽

Dalia Said ◽

Omar R. Elezaby ◽

Mohamed N. Moustafa ◽

...

Keyword(s):

Object Detection ◽

Safety Evaluation ◽

Autonomous Driving ◽

Detection Accuracy ◽

The Road ◽

Detection Model ◽

Detection Approach ◽

Roadway Safety ◽

Safety Features ◽

Oriented Object

Extensive research efforts have been devoted to identifying and improving roadway features that impact safety. Maintaining roadway safety features relies on costly manual operations of regular road surveying and data analysis. This paper introduces an automatic roadway safety features detection approach, which harnesses the potential of artificial intelligence (AI) computer vision to make the process more efficient and less costly. Given a front-facing camera and a global positioning system (GPS) sensor, the proposed system automatically evaluates ten roadway safety features. The system is composed of an oriented (or rotated) object detection model, which solves an orientation encoding discontinuity problem to improve detection accuracy, and a rule-based roadway safety evaluation module. To train and validate the proposed model, a fully annotated dataset for roadway safety features extraction was collected covering 473 km of roads. The baseline results of the proposed method are encouraging when compared to the state-of-the-art models. Different oriented object detection strategies are presented and discussed, and the developed model improved the mean average precision (mAP) by 16.9% when compared with the literature. The roadway safety feature average prediction accuracy is 84.39% and ranges between 63.12% and 91.11%. The introduced model can pervasively enable/disable autonomous driving (AD) based on safety features of the road, and empower connected vehicles (CV) to send and receive estimated safety features, alerting drivers about black spots or relatively less-safe segments or roads.
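
The orientation encoding discontinuity mentioned above arises because a rectangle rotated by π looks identical, so angles just above 0 and just below π describe nearly the same box while being numerically far apart. The abstract does not say how the paper resolves this. The sketch below shows one widely used workaround, assumed here purely for illustration: regress the pair (cos 2θ, sin 2θ) instead of θ itself. All function names are hypothetical.

```python
import math


def encode_angle(theta):
    """Map an orientation with period pi onto the unit circle."""
    return (math.cos(2.0 * theta), math.sin(2.0 * theta))


def decode_angle(c, s):
    """Recover an angle in [0, pi) from its (cos 2θ, sin 2θ) encoding."""
    return (0.5 * math.atan2(s, c)) % math.pi


def encoding_distance(theta_a, theta_b):
    """Euclidean distance between two encoded orientations."""
    ca, sa = encode_angle(theta_a)
    cb, sb = encode_angle(theta_b)
    return math.hypot(ca - cb, sa - sb)
```

With this encoding, orientations of 0.01 rad and π − 0.01 rad are close in the regression target space, so a regression loss no longer penalizes a near-correct prediction across the wrap-around.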

A Set of Single YOLO Modalities to Detect Occluded Entities via Viewpoint Conversion

Applied Sciences ◽

10.3390/app11136016 ◽

2021 ◽

Vol 11 (13) ◽

pp. 6016

Author(s):

Jinsoo Kim ◽

Jeongho Cho

Keyword(s):

Object Detection ◽

Autonomous Vehicles ◽

Autonomous Driving ◽

Detection Algorithm ◽

Detection Accuracy ◽

Cloud Data ◽

Detection Techniques ◽

Bounding Boxes ◽

Partially Occluded ◽

RGB Image

For autonomous vehicles, it is critical to be aware of the driving environment to avoid collisions and drive safely. The recent evolution of convolutional neural networks has contributed significantly to accelerating the development of object detection techniques that enable autonomous vehicles to handle rapid changes in various driving environments. However, collisions in an autonomous driving environment can still occur due to undetected obstacles and various perception problems, particularly occlusion. Thus, we propose a robust object detection algorithm for environments in which objects are truncated or occluded by employing RGB image and light detection and ranging (LiDAR) bird’s eye view (BEV) representations. This structure combines independent detection results obtained in parallel through “you only look once” networks using an RGB image and a height map converted from the BEV representations of LiDAR’s point cloud data (PCD). The region proposal of an object is determined via non-maximum suppression, which suppresses the bounding boxes of adjacent regions. A performance evaluation of the proposed scheme was performed using the KITTI vision benchmark suite dataset. The results demonstrate that detection accuracy is higher when PCD BEV representations are integrated than when only an RGB camera is used. In addition, robustness is improved by significantly enhancing detection accuracy even when the target objects are partially occluded when viewed from the front, which demonstrates that the proposed algorithm outperforms the conventional RGB-based model.
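
As a rough illustration of how non-maximum suppression can merge detections from two parallel detectors, here is a minimal greedy NMS sketch. It assumes that both detectors' boxes have already been mapped into a common image frame as (x1, y1, x2, y2) tuples with comparable scores. The abstract does not describe that mapping or the threshold used, so those details are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy NMS over a list of (box, score) pairs.

    Visits detections from highest to lowest score and keeps one only if it
    does not overlap an already kept box by more than `iou_threshold`.
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Hypothetical detections from the two branches, already in one frame.
camera_dets = [((0, 0, 10, 10), 0.9)]
lidar_dets = [((1, 1, 11, 11), 0.8), ((20, 20, 30, 30), 0.7)]
fused = nms(camera_dets + lidar_dets)
```

In this toy input the second LiDAR box has no camera counterpart, so it survives fusion, which mirrors the case where an object is missed in the RGB image but visible in the point cloud.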

An Improved Bounding Box Regression Loss Function Based on CIOU Loss for Multi-scale Object Detection

10.1109/prml52754.2021.9520717 ◽

2021 ◽

Author(s):

Shuangjiang Du ◽

Baofu Zhang ◽

Pin Zhang ◽

Peng Xiang

Keyword(s):

Object Detection ◽

Loss Function ◽

Multi Scale ◽

Bounding Box

3D-GIoU: 3D Generalized Intersection over Union for Object Detection in Point Cloud

Sensors ◽

10.3390/s19194093 ◽

2019 ◽

Vol 19 (19) ◽

pp. 4093 ◽

Cited By ~ 7

Author(s):

Jun Xu ◽

Yanxin Ma ◽

Songhua He ◽

Jiahua Zhu

Keyword(s):

Object Detection ◽

Point Cloud ◽

Pedestrian Detection ◽

Three Dimensional ◽

Average Precision ◽

3D Object ◽

Automatic Driving ◽

3D Computer Vision ◽

High Level ◽

3D Object Detection

Three-dimensional (3D) object detection is an important research topic in 3D computer vision with significant applications in many fields, such as automatic driving, robotics, and human–computer interaction. However, low precision remains an urgent problem in the field of 3D object detection. To address it, we present a framework for 3D object detection in point clouds. Specifically, a purpose-designed backbone network fuses low-level and high-level features, making full use of the information each provides. Moreover, the two-dimensional (2D) Generalized Intersection over Union is extended to 3D and used as part of the loss function in our framework. Experiments on Car, Cyclist, and Pedestrian detection have been conducted on the KITTI benchmark. Experimental results in terms of average precision (AP) show the effectiveness of the proposed network.
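
The extension of Generalized IoU from 2D to 3D can be sketched for the simplest case of axis-aligned boxes, where areas become volumes and the enclosing rectangle becomes an enclosing cuboid. Real 3D detectors usually predict boxes with a yaw angle, which needs a polygon intersection step that is omitted here. Treat this as a simplified illustration under that assumption rather than the paper's implementation.

```python
def volume(box):
    """Volume of an axis-aligned box (x1, y1, z1, x2, y2, z2); 0 if degenerate."""
    return (max(0.0, box[3] - box[0])
            * max(0.0, box[4] - box[1])
            * max(0.0, box[5] - box[2]))


def giou_3d(a, b):
    """Generalized IoU of two axis-aligned 3D boxes, in the range (-1, 1]."""
    # Intersection: largest minimum corner, smallest maximum corner.
    inter_box = (tuple(max(a[i], b[i]) for i in range(3))
                 + tuple(min(a[i], b[i]) for i in range(3, 6)))
    # Smallest enclosing cuboid: smallest minimum corner, largest maximum corner.
    hull_box = (tuple(min(a[i], b[i]) for i in range(3))
                + tuple(max(a[i], b[i]) for i in range(3, 6)))
    inter = volume(inter_box)
    union = volume(a) + volume(b) - inter
    hull = volume(hull_box)
    if union <= 0.0 or hull <= 0.0:
        return 0.0  # both boxes are degenerate; no meaningful overlap
    return inter / union - (hull - union) / hull


def giou_3d_loss(a, b):
    """Loss form: 0 for identical boxes, approaching 2 for distant boxes."""
    return 1.0 - giou_3d(a, b)
```

Unlike plain IoU, the value keeps changing for disjoint boxes as they move apart, so the loss still provides a training signal when a prediction does not overlap the target at all.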

Object Detection Based on Region Decomposition and Assembly

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33018094 ◽

2019 ◽

Vol 33 ◽

pp. 8094-8101 ◽

Cited By ~ 4

Author(s):

Seung-Hwan Bae

Keyword(s):

Neural Networks ◽

Object Detection ◽

Performance Improvement ◽

Semantic Relations ◽

Detection Accuracy ◽

Semantic Features ◽

Multi Scale ◽

Object Proposals ◽

Object Region ◽

High Level

Region-based object detection infers object regions for one or more categories in an image. Due to the recent advances in deep learning and region proposal methods, object detectors based on convolutional neural networks (CNNs) have flourished and provided promising detection results. However, the detection accuracy is often degraded because of the low discriminability of object CNN features caused by occlusions and inaccurate region proposals. In this paper, we therefore propose a region decomposition and assembly detector (R-DAD) for more accurate object detection. In the proposed R-DAD, we first decompose an object region into multiple small regions. To capture the entire appearance and part details of the object jointly, we extract CNN features within the whole object region and the decomposed regions. We then learn the semantic relations between the object and its parts by combining the multi-region features stage by stage with region assembly blocks, and use the combined high-level semantic features for object classification and localization. In addition, for more accurate region proposals, we propose a multi-scale proposal layer that can generate object proposals of various scales. We integrate the R-DAD into several feature extractors, and demonstrate a distinct performance improvement on PASCAL07/12 and MSCOCO18 compared to recent convolutional detectors.

Development and Testing of an Autonomous Driving Module for Critical Driving Conditions

Volume 17: Transportation Systems ◽

10.1115/imece2008-68487 ◽

2008 ◽

Author(s):

Francesco Biral ◽

Enrico Bertolazzi ◽

Daniele Bortoluzzi ◽

Paolo Bosetti

Keyword(s):

Autonomous Vehicles ◽

Gasoline Engine ◽

Autonomous Driving ◽

Great Effort ◽

Control Laws ◽

Test Platform ◽

High Level ◽

And Control ◽

Nonlinear Receding Horizon Control ◽

High Range

In recent years, a great effort has been devoted to the development of autonomous vehicles able to drive over a wide range of speeds in semi-structured and unstructured environments. This article presents and discusses the software framework for Hardware-In-the-Loop (HIL) and Software-In-the-Loop (SIL) analysis that has been designed for developing and testing control laws and mission functionalities of semi-autonomous and autonomous vehicles. The ultimate goal of this project is to develop a robotic system, named RUMBy, able to autonomously plan and execute accurate optimal manoeuvres both in normal and in critical driving situations and to be used as a test platform for advanced decision and autonomous driving algorithms. RUMBy’s hardware is a 1:6 scale gasoline engine R/C car with onboard telemetry and control systems. RUMBy’s software consists of three main modules: the manager module, which coordinates the other modules and takes high-level decisions; the motion planner module, which is based on a Nonlinear Receding Horizon Control (NRHC) algorithm; and the actuation module, which produces the driving commands for the vehicle. The article describes the details of the RUMBy architecture and discusses its modular configuration, which easily allows HIL and SIL tests.

A Region-Based Efficient Network for Accurate Object Detection

Traitement du signal ◽

10.18280/ts.380228 ◽

2021 ◽

Vol 38 (2) ◽

pp. 481-494

Author(s):

Yurong Guan ◽

Muhammad Aamir ◽

Zhihua Hu ◽

Waheed Ahmed Abro ◽

Ziaur Rahman ◽

...

Keyword(s):

Object Detection ◽

Detection Efficiency ◽

Visual Object ◽

Average Precision ◽

High Quality ◽

Image Objects ◽

Object Proposal ◽

Proposed Model ◽

Image Object Detection ◽

Image Object

Object detection in images is an important task in image processing and computer vision. Many approaches are available for object detection. For example, there are numerous algorithms for object positioning and classification in images. However, the current methods perform poorly and lack experimental verification. Thus, it is a fascinating and challenging issue to position and classify image objects. Drawing on the recent advances in image object detection, this paper develops a region-based efficient network for accurate object detection in images. To improve the overall detection performance, image object detection was treated as a twofold problem, involving object proposal generation and object classification. First, a framework was designed to generate high-quality, class-independent, accurate proposals. Then, these proposals, together with their input images, were imported to our network to learn convolutional features. To boost detection efficiency, the number of proposals was reduced by a network refinement module, leaving only a few eligible candidate proposals. After that, the refined candidate proposals were loaded into the detection module to classify the objects. The proposed model was tested on the test set of the famous PASCAL Visual Object Classes Challenge 2007 (VOC2007). The results clearly demonstrate that our model achieved robust overall detection efficiency over existing approaches using fewer or more proposals, in terms of recall, mean average best overlap (MABO), and mean average precision (mAP).
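
Of the three metrics named above, mean average best overlap (MABO) is the least familiar. It is commonly defined as follows: for each class, average over its ground-truth boxes the best IoU achieved by any proposal, then average those per-class values. The sketch below follows that usual definition on a single toy image. The data layout (a dict from class name to boxes) is an assumption for illustration, and a real evaluation would pool ground truth over the whole test set.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def mabo(ground_truth, proposals):
    """Mean average best overlap.

    ground_truth: dict mapping class name -> list of ground-truth boxes.
    proposals: list of class-independent proposal boxes.
    """
    per_class = []
    for gt_boxes in ground_truth.values():
        if not gt_boxes:
            continue  # classes without ground truth do not contribute
        best = [max((iou(g, p) for p in proposals), default=0.0) for g in gt_boxes]
        per_class.append(sum(best) / len(best))  # average best overlap (ABO)
    return sum(per_class) / len(per_class) if per_class else 0.0


# Toy example: one car matched exactly, one person half covered, one missed.
gt = {"car": [(0, 0, 10, 10)], "person": [(20, 20, 30, 30), (40, 40, 50, 50)]}
props = [(0, 0, 10, 10), (20, 20, 30, 25)]
score = mabo(gt, props)
```

Because the metric ignores proposal scores and class labels, it measures only how well the proposal set covers the objects, which is why it is reported alongside recall for the proposal stage.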

HCNET: A Point Cloud Object Detection Network Based on Height and Channel Attention

Remote Sensing ◽

10.3390/rs13245071 ◽

2021 ◽

Vol 13 (24) ◽

pp. 5071

Author(s):

Jing Zhang ◽

Jiajun Wang ◽

Da Xu ◽

Yunsong Li

Keyword(s):

Object Detection ◽

Point Cloud ◽

Feature Fusion ◽

Three Dimensional ◽

Point Clouds ◽

Autonomous Driving ◽

Attention Mechanism ◽

Uneven Distribution ◽

Adaptive Adjustment ◽

High Level

The use of LiDAR point clouds for accurate three-dimensional perception is crucial for realizing high-level autonomous driving systems. Considering the drawbacks of current point cloud object-detection algorithms, this paper proposes HCNet, an algorithm that combines an attention mechanism with adaptive adjustment, starting from feature fusion and overcoming the sparse and uneven distribution of point clouds. Inspired by the basic idea of an attention mechanism, a feature-fusion structure, the HC module, with height attention and channel attention weighted in parallel, is proposed to perform feature fusion on multiple pseudo images. The use of several weighting mechanisms enhances the ability of feature-information expression. Additionally, we designed an adaptively adjusted detection head that also overcomes the sparsity of the point cloud from the perspective of original information fusion. It reduces the interference caused by the uneven distribution of the point cloud from the perspective of adaptive adjustment. The results show that our HCNet has better accuracy than other one-stage-network or even two-stage-network RCNNs under some evaluation detection metrics. Additionally, it has a detection rate of 30 FPS. Especially for hard samples, the algorithm in this paper has better detection performance than many existing algorithms.

ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6945 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12557-12564 ◽

Cited By ~ 4

Author(s):

Zhenbo Xu ◽

Wei Zhang ◽

Xiaoqing Ye ◽

Xiao Tan ◽

Wei Yang ◽

...

Keyword(s):

Object Detection ◽

Point Clouds ◽

Autonomous Driving ◽

Disparity Estimation ◽

3D Object ◽

Detection Model ◽

Occluded Objects ◽

Bounding Boxes ◽

Detection Quality ◽

3D Object Detection

3D object detection is an essential task in autonomous driving and robotics. Though great progress has been made, challenges remain in estimating 3D pose for distant and occluded objects. In this paper, we present a novel framework named ZoomNet for stereo imagery-based 3D detection. The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes. To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module, adaptive zooming, which simultaneously resizes 2D instance bounding boxes to a unified resolution and adjusts the camera intrinsic parameters accordingly. In this way, we are able to estimate higher-quality disparity maps from the resized box images and then construct dense point clouds for both nearby and distant objects. Moreover, we propose learning part locations as complementary features to improve the resistance against occlusion and put forward the 3D fitting score to better estimate the 3D detection quality. Extensive experiments on the popular KITTI 3D detection dataset indicate that ZoomNet surpasses all previous state-of-the-art methods by large margins (improved by 9.4% on APbv (IoU=0.7) over pseudo-LiDAR). An ablation study also demonstrates that our adaptive zooming strategy brings an improvement of over 10% on AP3d (IoU=0.7). In addition, since the official KITTI benchmark lacks fine-grained annotations like pixel-wise part locations, we also present our KFG dataset by augmenting KITTI with detailed instance-wise annotations including pixel-wise part location, pixel-wise disparity, etc. Both the KFG dataset and our code will be publicly available at https://github.com/detectRecog/ZoomNet.
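
The intrinsics adjustment that accompanies adaptive zooming can be illustrated with the standard pinhole relation: cropping a box shifts the principal point, and resizing the crop scales both the focal lengths and the shifted principal point. The sketch below is not taken from the ZoomNet code. The function names are hypothetical, and half-pixel sampling conventions are ignored.

```python
def zoom_intrinsics(fx, fy, cx, cy, crop_box, out_size):
    """Intrinsics of the image obtained by cropping `crop_box` and resizing it.

    crop_box: (x0, y0, x1, y1) in original pixel coordinates.
    out_size: (width, height) of the resized crop.
    """
    x0, y0, x1, y1 = crop_box
    out_w, out_h = out_size
    sx = out_w / (x1 - x0)  # horizontal zoom factor
    sy = out_h / (y1 - y0)  # vertical zoom factor
    return (fx * sx, fy * sy, (cx - x0) * sx, (cy - y0) * sy)


def project(point, fx, fy, cx, cy):
    """Pinhole projection of a 3D camera-frame point (X, Y, Z) to pixels."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


# Illustrative numbers only: a 200x100 crop zoomed to 400x200.
K = (700.0, 700.0, 600.0, 180.0)
crop, size = (500.0, 150.0, 700.0, 250.0), (400.0, 200.0)
K_zoom = zoom_intrinsics(*K, crop, size)
```

With consistent intrinsics, a pixel in the zoomed crop back-projects to the same 3D ray as the corresponding pixel in the original image, which is the property needed to turn disparities estimated on resized crops into point clouds.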
