Visual Sorting of Express Parcels Based on Multi-Task Deep Learning

Sensors ◽  
2020 ◽  
Vol 20 (23) ◽  
pp. 6785
Author(s):  
Song Han ◽  
Xiaoping Liu ◽  
Xing Han ◽  
Gang Wang ◽  
Shaobo Wu

Visual sorting of express parcels in complex scenes has always been a key issue in intelligent logistics sorting systems. With existing methods, it is still difficult to achieve fast and accurate sorting of disorderly stacked parcels. In order to achieve accurate detection and efficient sorting of disorderly stacked express parcels, we propose a robot sorting method based on multi-task deep learning. Firstly, a lightweight object detection network model is proposed to improve the real-time performance of the system. A scale variable and the joint weights of the network are used to sparsify the model and automatically identify unimportant channels. Pruning strategies are used to reduce the model size and increase the speed of detection without losing accuracy. Then, an optimal sorting position and pose estimation network model based on multi-task deep learning is proposed. Using an end-to-end network structure, the optimal sorting positions and poses of express parcels are estimated in real time by combining pose and position information for joint training. It is proved that this model can further improve the sorting accuracy. Finally, the accuracy and real-time performance of this method are verified by robotic sorting experiments.
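The channel-pruning step described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: scale factors sparsified during training are thresholded globally, and channels below the threshold are marked as prunable. The function name `prune_masks` and its inputs are hypothetical.

```python
def prune_masks(layer_scales, prune_ratio):
    """Keep-mask per layer from per-channel scale factors.

    Channels whose sparsified |scale| falls in the smallest
    `prune_ratio` fraction across the whole network are pruned.
    """
    all_scales = sorted(abs(s) for scales in layer_scales for s in scales)
    # Global threshold at the prune_ratio percentile over all channels.
    idx = min(int(len(all_scales) * prune_ratio), len(all_scales) - 1)
    threshold = all_scales[idx]
    return [[abs(s) >= threshold for s in scales] for scales in layer_scales]
```

Channels flagged `False` would then be removed and the network fine-tuned, which is how pruning shrinks the model and speeds up detection without losing accuracy.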

Sensors ◽  
2020 ◽  
Vol 20 (16) ◽  
pp. 4646 ◽  
Author(s):  
Jingwei Cao ◽  
Chuanxue Song ◽  
Shixin Song ◽  
Silun Peng ◽  
Da Wang ◽  
...  

Vehicle detection is an indispensable part of environmental perception technology for smart cars. To address the issues that conventional vehicle detection is easily restricted by environmental conditions and cannot achieve accuracy and real-time performance simultaneously, this article proposes a front vehicle detection algorithm for smart cars based on an improved SSD model. The single shot multibox detector (SSD) is one of the current mainstream deep learning-based object detection frameworks. This work first briefly introduces the SSD network model and analyzes and summarizes its problems and shortcomings in vehicle detection. Then, targeted improvements are made to the SSD network model, including major improvements to the basic structure of the SSD model, the use of a weighted mask in network training, and an enhanced loss function. Finally, vehicle detection experiments are carried out on the KITTI vision benchmark suite and a self-made vehicle dataset to observe the algorithm's performance in different complicated environments and weather conditions. The test results on the KITTI dataset show that the mAP value reaches 92.18% and the average processing time per frame is 15 ms. Compared with existing deep learning-based detection methods, the proposed algorithm achieves accuracy and real-time performance simultaneously. Meanwhile, the algorithm shows excellent robustness and environmental adaptability in complicated traffic environments and anti-jamming capability in bad weather conditions. These factors are of great significance for the accurate and efficient operation of smart cars in real traffic scenarios and can help reduce the incidence of traffic accidents and protect people's lives and property.
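The weighted-mask idea used in training can be illustrated with a per-sample weighted binary cross-entropy. This is a generic sketch, not the paper's exact loss, and all names are illustrative:

```python
import math

def weighted_bce(preds, labels, weights):
    """Binary cross-entropy where each sample's loss is scaled by a mask weight."""
    eps = 1e-7
    total = 0.0
    for p, y, w in zip(preds, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp predictions for numerical safety
        total += -w * (y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / sum(weights)
```

Raising the weight on hard regions (e.g., small or partially occluded vehicles) makes their errors dominate the gradient, which is the usual motivation for such a mask.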


Sensors ◽  
2019 ◽  
Vol 19 (14) ◽  
pp. 3166 ◽  
Author(s):  
Cao ◽  
Song ◽  
Song ◽  
Xiao ◽  
Peng

Lane detection is an important foundation in the development of intelligent vehicles. To address problems such as the low detection accuracy of traditional methods and the poor real-time performance of deep learning-based methodologies, a lane detection algorithm for intelligent vehicles in complex road conditions and dynamic environments was proposed. Firstly, the distorted image was corrected and edges were detected with a superposition threshold algorithm; an aerial view of the lane was then obtained via region-of-interest extraction and inverse perspective transformation. Secondly, the random sample consensus algorithm was adopted to fit the curves of lane lines based on a third-order B-spline curve model, and fitting evaluation and curvature radius calculation were then carried out on the fitted curve. Lastly, simulation experiments for the lane detection algorithm were performed using road driving video under complex road conditions and the TuSimple dataset. The experimental results show that the average detection accuracy on the road driving video reached 98.49% with an average processing time of 21.5 ms, and the average detection accuracy on the TuSimple dataset reached 98.42% with an average processing time of 22.2 ms. Compared with traditional methods and deep learning-based methodologies, this lane detection algorithm had excellent accuracy and real-time performance, high detection efficiency, and strong anti-interference ability, and both the recognition rate and the average processing time were significantly improved. The proposed algorithm is crucial to advancing intelligent-vehicle driving assistance and conducive to further improving the driving safety of intelligent vehicles.
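The curvature radius calculation mentioned above follows the standard formula R = (1 + x'(y)²)^(3/2) / |x''(y)|. The sketch below applies it to a quadratic lane fit x = a·y² + b·y + c rather than the paper's third-order B-spline, so it is an illustrative simplification:

```python
def curvature_radius(a, b, y):
    """Radius of curvature of the fitted lane x = a*y**2 + b*y + c at height y."""
    d1 = 2.0 * a * y + b  # first derivative dx/dy
    d2 = 2.0 * a          # second derivative d2x/dy2 (constant for a quadratic)
    return (1.0 + d1 * d1) ** 1.5 / abs(d2)
```

A nearly straight lane (small `a`) yields a very large radius, which is the expected sanity check on the fit.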


2011 ◽  
Vol 268-270 ◽  
pp. 1259-1264 ◽  
Author(s):  
Song Hai Fan ◽  
Dong Xue Xia

Robustness and real-time performance are of the greatest significance for the navigation of a patrol robot in a transformer substation. To meet this demand, a navigation and positioning approach based on color vision and RFID (Radio Frequency Identification) technology is presented in this paper. In the presented system, position information is provided by RFID tags and navigation is completed by extracting guidelines. Based on a deep analysis of the advantages and shortcomings of different color spaces, a new approach integrating the good real-time performance of grayscale image processing with the rich information of color image processing is presented to improve the robustness and real-time performance of navigation. A fast Hough transform is selected and combined with the least squares method to detect the navigation line. Experimental results show that the presented method can meet the real-time and robustness demands of patrol robot navigation.
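The least-squares refinement that follows the fast Hough transform can be sketched as a plain ordinary-least-squares line fit over the candidate guideline pixels. This is a generic illustration, not the authors' implementation:

```python
def least_squares_line(points):
    """Fit y = m*x + b to candidate line pixels by ordinary least squares."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
    b = (sy - m * sx) / n                          # intercept
    return m, b
```

In practice the Hough transform supplies the coarse line hypothesis, and this fit refines it against the actual edge pixels for a stable navigation line.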


2020 ◽  
Vol 8 (6) ◽  
pp. 3162-3165

Detecting and classifying objects in a single frame that contains several objects is a cumbersome task. With the advancement of deep learning techniques, detection accuracy has increased significantly. This paper aims to implement a state-of-the-art custom algorithm for the detection and classification of objects in a single frame, with the goal of attaining high accuracy at real-time performance. The proposed system utilizes an SSD architecture coupled with MobileNet to achieve maximum accuracy. The system is fast enough to detect and recognize multiple objects even at 30 FPS.
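A standard post-processing step in any SSD-style detector, including the SSD+MobileNet pipeline described here, is non-maximum suppression over the predicted boxes. The sketch below is generic greedy NMS, not the paper's specific code:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep highest-scoring boxes, drop heavily overlapping ones."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < thresh for j in keep):
            keep.append(i)
    return keep
```

Keeping NMS cheap matters here, since it runs on every frame of the 30 FPS stream after the network's forward pass.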


2021 ◽  
pp. 1-10
Author(s):  
Chen Li-quan ◽  
Li You ◽  
Fengjun Shen ◽  
Zhaoqimeng Shan ◽  
Jiaxuan Chen

Human skeleton extraction is a basic problem in the field of computer vision. With the rapid progress of science and technology, it has become a hot issue in target detection tasks such as pedestrian recognition, behavior monitoring, and pedestrian gesture recognition. In recent years, thanks to the development of deep neural networks, modeling of human joints in acquired images has made progress in skeleton extraction. However, most models suffer from low modeling accuracy, poor real-time performance, and poor usability. To address this human target detection problem, this paper studies a gesture recognition method based on a deep learning skeleton sequence model in sports scenes, aiming to provide a gesture recognition method with strong noise resistance, good real-time performance, and an accurate model. The article uses motion video frame images to train a VGG16 network; extracting skeleton information with the network strengthens the posture feature expression, HOG is used for feature extraction, and the Adam algorithm is used to optimize the network so that more posture features are extracted, thereby improving the network's posture recognition accuracy. The hyperparameters and structure of the base network are then adjusted according to the training results, and the key poses in the sports scene are obtained through the final classifier.
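The HOG feature extraction mentioned above bins gradient orientations per image cell, weighted by gradient magnitude. A minimal unsigned-orientation sketch (illustrative only, not the paper's configuration) is:

```python
import math

def hog_cell_histogram(cell, bins=9):
    """Orientation histogram (unsigned, 0-180 deg) for one grayscale cell."""
    h = [0.0] * bins
    rows, cols = len(cell), len(cell[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0
            h[int(ang // (180.0 / bins)) % bins] += mag
    return h
```

Concatenating and block-normalizing such histograms over the whole frame yields the posture feature vector fed to the classifier.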


2021 ◽  
Author(s):  
Muhammed Emir cakici ◽  
Feyza Yildirim Okay ◽  
Suat Ozdemir

2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Qingbo Ji ◽  
Chong Dai ◽  
Changbo Hou ◽  
Xun Li

With the increasing application of computer vision technology in autonomous driving, robots, and other mobile devices, more and more attention has been paid to implementing target detection and tracking algorithms on embedded platforms. The real-time performance and robustness of such algorithms are two hot research topics and challenges in this field. To solve the problems of poor real-time tracking performance of embedded systems using convolutional neural networks and the low robustness of tracking algorithms in complex scenes, this paper proposes a fast and accurate real-time video detection and tracking algorithm suitable for embedded systems. The algorithm combines the single-shot multibox detection object detection model of deep convolutional networks with the kernel correlation filters (KCF) tracking algorithm; moreover, it accelerates the single-shot multibox detection model using field-programmable gate arrays, which satisfies the algorithm's real-time requirement on the embedded platform. To solve the problem of model contamination after the KCF algorithm fails to track in complex scenes, an improved validity detection mechanism for tracking results is proposed, which solves the traditional KCF algorithm's inability to track robustly over a long time. To address the high missed-detection rate of the single-shot multibox detection model under motion blur or illumination variation, a strategy is proposed that effectively reduces missed detections. Experimental results on the embedded platform show that the algorithm can track the object in the video in real time and can automatically reposition the object to continue tracking after tracking fails.
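A common validity check for correlation-filter tracking is the peak-to-sidelobe ratio (PSR) of the response map: a low PSR suggests the tracker has drifted and the detector should re-initialize it. Whether the paper uses PSR specifically is not stated, so the following is only a plausible sketch of such a mechanism:

```python
def psr(response):
    """Peak-to-sidelobe ratio of a correlation response map (list of lists).

    A low PSR signals that the KCF tracker has likely lost the target,
    so the detector should be invoked to re-acquire it.
    """
    flat = [v for row in response for v in row]
    peak = max(flat)
    sidelobe = [v for v in flat if v != peak]
    mean = sum(sidelobe) / len(sidelobe)
    var = sum((v - mean) ** 2 for v in sidelobe) / len(sidelobe)
    return (peak - mean) / (var ** 0.5 + 1e-9)
```

Thresholding this score frame by frame is one way to trigger the "reposition and continue tracking" behavior the abstract describes.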


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Aamir Khan ◽  
Zhang Zhijiang ◽  
Yingjie Yu ◽  
Muhammad Amir Khan ◽  
Ketao Yan ◽  
...  

Recent developments in deep neural networks (DNNs) have opened an opportunity for a novel framework for holographic image reconstruction and phase recovery with real-time performance. Many deep learning-based techniques have been proposed for holographic image reconstruction, but these methods can still fall short in performance, time complexity, and accuracy. Due to iterative calculation, generating a computer-generated hologram (CGH) requires a long computation time. A novel deep generative adversarial network holography (GAN-Holo) framework is proposed for hologram reconstruction. This framework consists of two phases. In phase one, we used the Fresnel-based method to build the dataset. In the second phase, we trained on the raw input images and the holographic label images acquired in phase one. Our method is capable of noniterative generation of CGHs. The experimental results demonstrate that the proposed method outperforms existing methods.
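The Fresnel-based dataset generation in phase one presumably propagates each image numerically. A single sample of the textbook Fresnel transfer function, H(fx, fy) = exp(ikz)·exp(-iπλz(fx² + fy²)), can be sketched as below; this is the standard formula, not necessarily the authors' exact implementation:

```python
import cmath
import math

def fresnel_tf(fx, fy, wavelength, z):
    """One sample of the Fresnel transfer function at spatial frequency (fx, fy).

    Multiplying the field's spectrum by H propagates it a distance z.
    """
    k = 2.0 * math.pi / wavelength
    phase = -math.pi * wavelength * z * (fx * fx + fy * fy)
    return cmath.exp(1j * k * z) * cmath.exp(1j * phase)
```

Because H is a pure phase factor, |H| = 1 everywhere, so numerical propagation conserves energy, which is a useful sanity check when building such a dataset.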


2021 ◽  
Vol 18 (1) ◽  
pp. 172988142097836
Author(s):  
Cristian Vilar ◽  
Silvia Krug ◽  
Benny Thörnberg

3D object recognition has been a cutting-edge research topic since the popularization of depth cameras. These cameras enhance the perception of the environment and so are particularly suitable for autonomous robot navigation applications. Advanced deep learning approaches for 3D object recognition are based on complex algorithms and demand powerful hardware resources. However, autonomous robots and powered wheelchairs have limited resources, which affects the implementation of these algorithms for real-time performance. We propose to use instead a 3D voxel-based extension of the 2D histogram of oriented gradients (3DVHOG) as a handcrafted object descriptor for 3D object recognition in combination with a pose normalization method for rotational invariance and a supervised object classifier. The experimental goal is to reduce the overall complexity and the system hardware requirements, and thus enable a feasible real-time hardware implementation. This article compares the 3DVHOG object recognition rates with those of other 3D recognition approaches, using the ModelNet10 object data set as a reference. We analyze the recognition accuracy for 3DVHOG using a variety of voxel grid selections, different numbers of neurons (Nh) in the single hidden layer feedforward neural network, and feature dimensionality reduction using principal component analysis. The experimental results show that the 3DVHOG descriptor achieves a recognition accuracy of 84.91% with a total processing time of 21.4 ms. Despite the lower recognition accuracy, this is close to the current state-of-the-art approaches for deep learning while enabling real-time performance.
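The PCA-based feature dimensionality reduction mentioned above ranks directions by variance and keeps the leading ones. As a minimal stand-in (two dimensions only, with closed-form eigenvalues of the 2×2 covariance matrix; not the article's 3DVHOG pipeline):

```python
def pca_2d(points):
    """Eigenvalues of the 2x2 covariance matrix of 2-D data, largest first.

    A toy analogue of PCA: the eigenvalue sizes show how much variance
    each principal direction carries, which drives dimensionality reduction.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    tr, det = cxx + cyy, cxx * cyy - cxy * cxy
    disc = (tr * tr / 4.0 - det) ** 0.5
    return tr / 2.0 + disc, tr / 2.0 - disc
```

For real 3DVHOG features one would eigendecompose the full covariance matrix and project onto the leading eigenvectors before the feedforward classifier.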

