Embedded YOLO: A Real-Time Object Detector for Small Intelligent Trajectory Cars

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
WenYu Feng ◽  
YuanFan Zhu ◽  
JunTai Zheng ◽  
Han Wang

YOLO-Tiny is a lightweight version of the original “You only look once” (YOLO) object detection model that simplifies the network structure and reduces the number of parameters, which makes it suitable for real-time applications. Although the YOLO-Tiny series, which includes YOLOv3-Tiny and YOLOv4-Tiny, can achieve real-time performance on a powerful GPU, it remains challenging to use this approach for real-time object detection on embedded computing devices, such as those in small intelligent trajectory cars. To obtain real-time and high-accuracy performance on these embedded devices, a novel lightweight object detection network called embedded YOLO is proposed in this paper. First, a new backbone network structure, the ASU-SPP network, is proposed to enhance the effectiveness of low-level features. Then, we design PANet-Tiny, a simplified version of the neck network module that reduces computational complexity. Finally, in the detection head module, we use depthwise separable convolution to reduce the number of convolution stacks. In addition, the number of channels is reduced to 96 so that the module can benefit from the parallel acceleration of most inference frameworks. With its lightweight design, the proposed embedded YOLO model has only 3.53M parameters, and the average processing speed reaches 155.1 frames per second, as verified on Baidu smart car target detection. At the same time, the detection accuracy is 6% higher than that of YOLOv3-Tiny and YOLOv4-Tiny.
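
To make the saving from depthwise separable convolution concrete, the sketch below counts weights for one layer in the standard and in the depthwise separable form. The 96-channel width comes from the abstract; the 3 × 3 kernel size, the omission of bias terms, and the function names are assumptions made for illustration, not details of the authors' detection head.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise convolution (one filter per input
    channel) followed by a 1 x 1 pointwise convolution (bias omitted)."""
    return k * k * c_in + c_in * c_out


channels = 96   # channel width stated in the abstract
kernel = 3      # assumed kernel size, not stated in the abstract

std = standard_conv_params(channels, channels, kernel)
dws = depthwise_separable_params(channels, channels, kernel)
print(std, dws, round(std / dws, 2))
```

Under these assumptions the separable layer needs roughly an eighth of the weights of the standard layer, which is the kind of reduction that lets a detection head stay small.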

Author(s):  
Mohammad Javad Shaifee ◽  
Brendan Chywl ◽  
Francis Li ◽  
Alexander Wong

Object detection is considered one of the most challenging problems in the field of computer vision, as it involves the combination of object classification and object localization within a scene. Recently, deep neural networks (DNNs) have been demonstrated to achieve superior object detection performance compared to other approaches, with YOLOv2 (an improved You Only Look Once model) being one of the state-of-the-art DNN-based object detection methods in terms of both speed and accuracy. Although YOLOv2 can achieve real-time performance on a powerful GPU, it remains very challenging to leverage this approach for real-time object detection in video on embedded computing devices with limited computational power and limited memory. In this paper, we propose a new framework called Fast YOLO, a fast You Only Look Once framework which accelerates YOLOv2 to be able to perform object detection in video on embedded devices in real time. First, we leverage the evolutionary deep intelligence framework to evolve the YOLOv2 network architecture and produce an optimized architecture (referred to as O-YOLOv2 here) that has 2.8X fewer parameters with just a 2% IOU drop. To further reduce power consumption on embedded devices while maintaining performance, a motion-adaptive inference method is introduced into the proposed Fast YOLO framework to reduce the frequency of deep inference with O-YOLOv2 based on temporal motion characteristics. Experimental results show that the proposed Fast YOLO framework can reduce the number of deep inferences by an average of 38.13% and achieve an average speedup of 3.3X for object detection in video compared to the original YOLOv2, leading Fast YOLO to run at an average of 18 FPS on an Nvidia Jetson TX1 embedded system.
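
The idea of running deep inference only when the scene has changed can be sketched as follows. The abstract does not say how motion is measured, so the mean absolute frame difference, the threshold, and the names `mean_abs_diff` and `motion_adaptive_detect` are assumptions for illustration rather than the Fast YOLO implementation. Frames are plain lists of pixel intensities to keep the example self-contained.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def motion_adaptive_detect(frames, detector, threshold):
    """Call `detector` only when a frame differs from the last inferred
    frame by more than `threshold`; otherwise reuse the previous result."""
    results = []
    reference = None   # last frame that went through the detector
    last_result = None
    deep_calls = 0
    for frame in frames:
        if reference is None or mean_abs_diff(frame, reference) > threshold:
            last_result = detector(frame)
            reference = frame
            deep_calls += 1
        results.append(last_result)
    return results, deep_calls


# Toy run: a stand-in "detector" that just sums the pixels.
frames = [[0, 0, 0, 0], [0, 0, 0, 1], [10, 10, 10, 10], [10, 10, 10, 11]]
results, deep_calls = motion_adaptive_detect(frames, detector=sum, threshold=1.0)
print(results, deep_calls)
```

In the toy run only two of the four frames reach the detector, which mirrors how skipping near-static frames lowers the number of deep inferences.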


2021 ◽  
Vol 11 (3) ◽  
pp. 1096
Author(s):  
Qing Li ◽  
Yingcheng Lin ◽  
Wei He

High computing and memory requirements are the biggest challenges in deploying existing object detection networks to embedded devices. Existing lightweight object detectors directly use lightweight neural network architectures such as MobileNet or ShuffleNet pre-trained on large-scale classification datasets, which results in poor network structure flexibility and is not suitable for some specific scenarios. In this paper, we propose a lightweight object detection network, Single-Shot MultiBox Detector (SSD)7-Feature Fusion and Attention Mechanism (FFAM), which saves storage space and reduces the amount of calculation by reducing the number of convolutional layers. We offer a novel Feature Fusion and Attention Mechanism (FFAM) method to improve detection accuracy. First, the FFAM method fuses feature maps rich in high-level semantic information with low-level feature maps to improve the detection accuracy for small objects. Then, a lightweight attention mechanism that cascades channel and spatial attention modules is employed to enhance the target’s contextual information and guide the network to focus on its easy-to-recognize features. The SSD7-FFAM achieves 83.7% mean Average Precision (mAP), 1.66 MB parameters, and 0.033 s average running time on the NWPU VHR-10 dataset. The results indicate that the proposed SSD7-FFAM is more suitable for deployment to embedded devices for real-time object detection.


2021 ◽  
Vol 11 (8) ◽  
pp. 3531
Author(s):  
Hesham M. Eraqi ◽  
Karim Soliman ◽  
Dalia Said ◽  
Omar R. Elezaby ◽  
Mohamed N. Moustafa ◽  
...  

Extensive research efforts have been devoted to identifying and improving roadway features that impact safety. Maintaining roadway safety features relies on costly manual operations of regular road surveying and data analysis. This paper introduces an automatic roadway safety features detection approach, which harnesses the potential of artificial intelligence (AI) computer vision to make the process more efficient and less costly. Given a front-facing camera and a global positioning system (GPS) sensor, the proposed system automatically evaluates ten roadway safety features. The system is composed of an oriented (or rotated) object detection model, which solves an orientation encoding discontinuity problem to improve detection accuracy, and a rule-based roadway safety evaluation module. To train and validate the proposed model, a fully annotated dataset for roadway safety features extraction was collected covering 473 km of roads. The baseline results of the proposed method are encouraging when compared to the state-of-the-art models. Different oriented object detection strategies are presented and discussed, and the developed model improves the mean average precision (mAP) by 16.9% when compared with the literature. The roadway safety feature average prediction accuracy is 84.39% and ranges between 63.12% and 91.11%. The introduced model can pervasively enable/disable autonomous driving (AD) based on safety features of the road, and empower connected vehicles (CV) to send and receive estimated safety features, alerting drivers about black spots or relatively less-safe segments or roads.
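
The abstract does not describe how the orientation encoding discontinuity is solved. As background only, the sketch below shows one generic workaround that is often used for rotated boxes: because a rectangle rotated by θ and by θ + π looks identical, the angle can be encoded as (cos 2θ, sin 2θ), which varies smoothly across the wrap-around point. This is an illustrative assumption, not the authors' method, and the function names are hypothetical.

```python
import math


def encode_angle(theta):
    """Map an orientation in radians (period pi) to a continuous 2-D vector."""
    return (math.cos(2.0 * theta), math.sin(2.0 * theta))


def decode_angle(vec):
    """Recover an orientation in (-pi/2, pi/2] from the encoded vector."""
    c, s = vec
    return 0.5 * math.atan2(s, c)


# Two nearly identical boxes whose raw angles sit on opposite sides of the
# wrap-around: the raw angle difference is large, the encoded gap is small.
theta_a = -math.pi / 2 + 0.01
theta_b = math.pi / 2 - 0.01
raw_gap = abs(theta_a - theta_b)
encoded_gap = math.dist(encode_angle(theta_a), encode_angle(theta_b))
print(raw_gap, encoded_gap)
```

A regression loss on the raw angle would penalize the two boxes heavily even though they almost coincide, whereas a loss on the encoded vector would not.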


2020 ◽  
Vol 13 (1) ◽  
pp. 23
Author(s):  
Wei Zhao ◽  
William Yamada ◽  
Tianxin Li ◽  
Matthew Digman ◽  
Troy Runge

In recent years, precision agriculture has been researched as a promising means to increase crop production with fewer inputs and meet the growing demand for agricultural products. Computer vision-based crop detection with unmanned aerial vehicle (UAV)-acquired images is a critical tool for precision agriculture. However, object detection using deep learning algorithms relies on a significant amount of manually prelabeled training data as ground truth. Detection of field objects, such as bales, is especially difficult because of (1) long-period image acquisitions under different illumination conditions and seasons; (2) limited existing prelabeled data; and (3) few pretrained models and research as references. This work increases the bale detection accuracy based on limited data collection and labeling by building an innovative algorithm pipeline. First, an object detection model is trained using 243 images captured under good illumination conditions in fall from the crop lands. In addition, domain adaptation (DA), a kind of transfer learning, is applied to synthesize training data under diverse environmental conditions with automatic labels. Finally, the object detection model is optimized with the synthesized datasets. The case study shows that the proposed method improves the bale detection performance, including the recall, mean average precision (mAP), and F measure (F1 score), from averages of 0.59, 0.7, and 0.7 (the object detection) to averages of 0.93, 0.94, and 0.89 (the object detection + DA), respectively. This approach could be easily scaled to many other crop field objects and will significantly contribute to precision agriculture.
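
For reference, recall and the F1 score reported above are derived from counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; the counts are made-up numbers for illustration and are not taken from the study.

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 bales found correctly, 2 false alarms, 2 bales missed.
precision, recall, f1 = detection_metrics(tp=8, fp=2, fn=2)
print(precision, recall, f1)
```

Because F1 is the harmonic mean of precision and recall, it rises only when both improve, which is why it is reported alongside recall.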


Author(s):  
Runze Liu ◽  
Guangwei Yan ◽  
Hui He ◽  
Yubin An ◽  
Ting Wang ◽  
...  

Background: Power line inspection is essential to ensure the safe and stable operation of the power system. Object detection for tower equipment can significantly improve inspection efficiency. However, because small targets have low resolution and limited features, their detection accuracy is difficult to improve. Objective: This study aimed to improve the resolution of tiny targets while making their texture and detailed features more prominent so that the detection model can perceive them. Methods: In this paper, we propose an algorithm that employs generative adversarial networks to improve the detection accuracy for small objects. First, the original image is converted into a super-resolution one by a super-resolution reconstruction network (SRGAN). Then the object detection framework Faster RCNN is used to detect objects in the super-resolution images. Result: The experimental results on two small object recognition datasets show that the model proposed in this paper has good robustness. In particular, it can detect targets missed by Faster RCNN, which indicates that SRGAN can effectively enhance the detailed information of small targets by improving the resolution. Conclusion: We found that higher-resolution data is conducive to obtaining more detailed information about small targets, which can help the detection algorithm achieve higher accuracy. The small object detection model based on the generative adversarial network proposed in this paper is feasible and more efficient. Compared with Faster RCNN, this model performs better on small object detection.


2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Zhaoli Wu ◽  
Xin Wang ◽  
Chao Chen

Owing to limits on energy and power consumption, embedded platforms cannot meet the real-time requirements of far-infrared image pedestrian detection algorithms. To solve this problem, this paper proposes a new real-time infrared pedestrian detection algorithm (RepVGG-YOLOv4, Rep-YOLO). It uses RepVGG to reconstruct the YOLOv4 backbone network, which reduces the number of model parameters and calculations and improves the speed of target detection; it uses spatial pyramid pooling (SPP) to obtain information from different receptive fields and improve detection accuracy; and it uses channel pruning to reduce redundant parameters, model size, and computational complexity. The experimental results show that, compared with the YOLOv4 target detection algorithm, the Rep-YOLO algorithm reduces the model size by 90% and the floating-point computation by 93.4%, increases the inference speed by 4 times, and reaches a detection accuracy of 93.25% after compression.
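
Spatial pyramid pooling in YOLO-style detectors is usually built from stride-1 max pooling at several kernel sizes, with the pooled maps concatenated with the input along the channel axis. The sketch below shows that pattern on a single-channel map in plain Python; the default kernel sizes (5, 9, 13) follow the common YOLOv4 configuration and are an assumption about Rep-YOLO, as are the function names.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling with an odd kernel size k; border windows are
    clipped to the map, so the output has the same size as the input."""
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [fmap[y][x]
                      for y in range(max(0, i - r), min(h, i + r + 1))
                      for x in range(max(0, j - r), min(w, j + r + 1))]
            row.append(max(window))
        out.append(row)
    return out


def spp(fmap, kernel_sizes=(5, 9, 13)):
    """Return the input map followed by one pooled map per kernel size,
    standing in for a channel-wise concatenation."""
    return [fmap] + [max_pool_same(fmap, k) for k in kernel_sizes]


fmap = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
pooled = spp(fmap, kernel_sizes=(3,))
print(pooled[1])
```

Each larger kernel lets every position see a wider neighbourhood, so the concatenated output mixes several receptive fields without changing the spatial resolution.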


Author(s):  
Vibhavari B Rao

Today's crime rates can put civilians' lives in danger. While consistent efforts are being made to reduce crime, there is also a dire need for a smart and proactive surveillance system. Our project implements a smart surveillance system that alerts the authorities in real time when a crime is being committed. During armed robberies and hostage situations, the police often cannot reach the scene in time to prevent the crime, owing to the lag in communication between informants at the crime scene and the police. We propose an object detection model that uses deep learning algorithms to detect objects of violence such as pistols, knives, and rifles in video surveillance footage and, in turn, sends real-time alerts to the authorities. A number of object detection algorithms are being developed, each evaluated with the performance metric mAP. On implementing Faster R-CNN with the ResNet 101 architecture, we found the mAP score to be about 91%. However, the downside is the excessive training and inference time it incurs. On the other hand, the YOLOv5 architecture resulted in a model that performed very well in terms of speed. Its speed during training was found to be 0.012 s per image, but, as expected, its accuracy was not as high as that of Faster R-CNN. With good computer architecture, it can run at about 40 fps. Thus, there is a tradeoff between speed and accuracy, and it is important to strike a balance. We use transfer learning to improve accuracy by training the model on our custom dataset. This project can be deployed with any generic CCTV camera by setting up a live RTSP (Real-Time Streaming Protocol) stream and streaming the footage to a laptop or desktop where the deep learning model is running.


Sensors ◽  
2020 ◽  
Vol 20 (23) ◽  
pp. 6779
Author(s):  
Byung-Gil Han ◽  
Joon-Goo Lee ◽  
Kil-Taek Lim ◽  
Doo-Hyun Choi

As convolutional neural network (CNN)-based object detection is applied in more research, studies on lightweight CNN models that can run in real time on edge-computing devices are also increasing. This paper proposes scalable convolutional blocks with which the CNN of a You Only Look Once (YOLO) detector can be easily designed; simply exchanging the proposed blocks yields networks whose processing speed and accuracy are balanced for target edge-computing devices with different performance levels. The maximum number of kernels of the convolutional layer was determined through simple but intuitive speed comparison tests on the three edge-computing devices considered. The scalable convolutional blocks were designed within this limit on the maximum number of kernels so that objects can be detected in real time on these edge-computing devices. Three scalable and fast YOLO detectors (SF-YOLO) designed using the proposed scalable convolutional blocks were compared with several conventional lightweight YOLO detectors in terms of processing speed and accuracy on the edge-computing devices. Compared with YOLOv3-tiny, SF-YOLO was 2 times faster with the same accuracy, and its processing speed was 48% higher than that of YOLOv3-tiny-PRN, a model designed to improve processing speed. Even the large SF-YOLO model, which focuses on accuracy, achieved a 10% faster processing speed than the YOLOv4-tiny model with a better accuracy of 40.4% mAP@0.5 on the MS COCO dataset.
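
Accuracy on MS COCO is commonly reported as mean average precision at an intersection-over-union (IoU) threshold of 0.5, meaning a predicted box counts as correct when it overlaps a ground-truth box of the same class with IoU of at least 0.5. The sketch below shows the IoU computation for axis-aligned boxes in (x1, y1, x2, y2) form; the box format and function names are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True when the prediction overlaps the ground truth enough to count."""
    return iou(pred_box, gt_box) >= threshold


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))       # partial overlap
print(is_match((0, 0, 2, 2), (0, 0, 2, 3)))  # large overlap
```

Average precision is then computed per class from the ranked list of matched and unmatched predictions, and the mean over classes gives the reported figure.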


Electronics ◽  
2020 ◽  
Vol 9 (3) ◽  
pp. 451 ◽  
Author(s):  
Limin Guan ◽  
Yi Chen ◽  
Guiping Wang ◽  
Xu Lei

Vehicle detection is essential for driverless systems. However, the current single-sensor detection mode is no longer sufficient in complex and changing traffic environments. Therefore, this paper combines a camera and light detection and ranging (LiDAR) to build a vehicle-detection framework characterized by multi-adaptability, high real-time performance, and robustness. First, a multi-adaptive high-precision depth-completion method is proposed to convert the 2D LiDAR sparse depth map into a dense depth map, so that the two sensors are aligned with each other at the data level. Then, the You Only Look Once Version 3 (YOLOv3) real-time object detection model is used to detect vehicles in the color image and the dense depth map. Finally, a decision-level fusion method based on bounding box fusion and improved Dempster–Shafer (D–S) evidence theory is proposed to merge the two results of the previous step and obtain the final vehicle position and distance information, which improves both the detection accuracy and the robustness of the whole framework. We evaluated our method using the KITTI dataset and the Waymo Open Dataset, and the results show the effectiveness of the proposed depth completion method and multi-sensor fusion strategy.
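
Decision-level fusion with Dempster–Shafer evidence theory combines the belief masses that each sensor assigns to sets of hypotheses. The sketch below implements the classic Dempster rule of combination, not the improved variant used in the paper, whose details the abstract does not give. The hypothesis labels and the mass values for the camera and LiDAR detections are made-up numbers for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions whose keys are frozensets of hypotheses."""
    combined = {}
    conflict = 0.0
    for (set_a, mass_a), (set_b, mass_b) in product(m1.items(), m2.items()):
        common = set_a & set_b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b   # mass assigned to contradictory sets
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to one.
    return {hyp: mass / (1.0 - conflict) for hyp, mass in combined.items()}


VEHICLE = frozenset({"vehicle"})
BACKGROUND = frozenset({"background"})
EITHER = VEHICLE | BACKGROUND            # mass left uncommitted

camera = {VEHICLE: 0.7, EITHER: 0.3}                     # hypothetical masses
lidar = {VEHICLE: 0.6, BACKGROUND: 0.1, EITHER: 0.3}     # hypothetical masses

fused = dempster_combine(camera, lidar)
print(fused[VEHICLE])
```

When both sensors lean toward the same hypothesis, the fused belief in it is higher than either sensor's alone, which is the effect a fusion stage relies on to improve robustness.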


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Zuopeng Zhao ◽  
Zhongxin Zhang ◽  
Xinzheng Xu ◽  
Yi Xu ◽  
Hualin Yan ◽  
...  

Object detection algorithms need lightweight improvements to perform well on resource-constrained embedded devices. To further improve the recognition accuracy for small target objects, this paper integrates a 5 × 5 depthwise separable convolution kernel into the MobileNetV2-SSDLite model, extracts features from two special convolutional layers in addition to detecting the target, and designs a new lightweight object detection network, the Lightweight Microscopic Detection Network (LMS-DN). The network can be implemented on embedded devices such as the NVIDIA Jetson TX2. The experimental results show that LMS-DN needs fewer parameters and lower computational cost to obtain higher identification accuracy and stronger anti-interference capability than other popular object detection models.

