Visual Sorting of Express Parcels Based on Multi-Task Deep Learning

Sensors ◽  
2020 ◽  
Vol 20 (23) ◽  
pp. 6785
Author(s):  
Song Han ◽  
Xiaoping Liu ◽  
Xing Han ◽  
Gang Wang ◽  
Shaobo Wu

Visual sorting of express parcels in complex scenes has always been a key issue in intelligent logistics sorting systems. With existing methods, it is still difficult to achieve fast and accurate sorting of disorderly stacked parcels. In order to achieve accurate detection and efficient sorting of disorderly stacked express parcels, we propose a robot sorting method based on multi-task deep learning. Firstly, a lightweight object detection network model is proposed to improve the real-time performance of the system. A scale variable and the joint weights of the network are used to sparsify the model and automatically identify unimportant channels. Pruning strategies are used to reduce the model size and increase the speed of detection without losing accuracy. Then, an optimal sorting position and pose estimation network model based on multi-task deep learning is proposed. Using an end-to-end network structure, the optimal sorting positions and poses of express parcels are estimated in real time by combining pose and position information for joint training. It is proved that this model can further improve the sorting accuracy. Finally, the accuracy and real-time performance of this method are verified by robotic sorting experiments.
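The channel-pruning step described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: scale factors sparsified during training are thresholded globally, and channels below the threshold are marked as prunable. The function name `prune_masks` and its inputs are hypothetical.

```python
def prune_masks(layer_scales, prune_ratio):
    """Keep-mask per layer from per-channel scale factors.

    Channels whose sparsified |scale| falls in the smallest
    `prune_ratio` fraction across the whole network are pruned.
    """
    all_scales = sorted(abs(s) for scales in layer_scales for s in scales)
    # Global threshold at the prune_ratio percentile over all channels.
    idx = min(int(len(all_scales) * prune_ratio), len(all_scales) - 1)
    threshold = all_scales[idx]
    return [[abs(s) >= threshold for s in scales] for scales in layer_scales]
```

Channels flagged `False` would then be removed and the network fine-tuned, which is how pruning shrinks the model and speeds up detection without losing accuracy.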

Sensors ◽  
2020 ◽  
Vol 20 (16) ◽  
pp. 4646 ◽  
Author(s):  
Jingwei Cao ◽  
Chuanxue Song ◽  
Shixin Song ◽  
Silun Peng ◽  
Da Wang ◽  
...  

Vehicle detection is an indispensable part of environmental perception technology for smart cars. To address the issues that conventional vehicle detection is easily restricted by environmental conditions and cannot achieve accuracy and real-time performance simultaneously, this article proposes a front vehicle detection algorithm for smart cars based on an improved SSD model. The single shot multibox detector (SSD) is one of the current mainstream deep learning-based object detection frameworks. This work first briefly introduces the SSD network model and analyzes and summarizes its problems and shortcomings in vehicle detection. Then, targeted improvements are made to the SSD network model, including major improvements to the basic structure of the SSD model, the use of a weighted mask in network training, and an enhanced loss function. Finally, vehicle detection experiments are carried out on the KITTI vision benchmark suite and a self-made vehicle dataset to observe the algorithm's performance in different complicated environments and weather conditions. The test results on the KITTI dataset show that the mAP value reaches 92.18% and the average processing time per frame is 15 ms. Compared with existing deep learning-based detection methods, the proposed algorithm achieves accuracy and real-time performance simultaneously. Meanwhile, the algorithm shows excellent robustness and environmental adaptability in complicated traffic environments and anti-jamming capability in bad weather conditions. These factors are of great significance for the accurate and efficient operation of smart cars in real traffic scenarios and can help reduce the incidence of traffic accidents and protect people's lives and property.
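The weighted-mask idea used in training can be illustrated with a per-sample weighted binary cross-entropy. This is a generic sketch, not the paper's exact loss, and all names are illustrative:

```python
import math

def weighted_bce(preds, labels, weights):
    """Binary cross-entropy where each sample's loss is scaled by a mask weight."""
    eps = 1e-7
    total = 0.0
    for p, y, w in zip(preds, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp predictions for numerical safety
        total += -w * (y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / sum(weights)
```

Raising the weight on hard regions (e.g., small or partially occluded vehicles) makes their errors dominate the gradient, which is the usual motivation for such a mask.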


Sensors ◽  
2019 ◽  
Vol 19 (14) ◽  
pp. 3166 ◽  
Author(s):  
Cao ◽  
Song ◽  
Song ◽  
Xiao ◽  
Peng

Lane detection is an important foundation in the development of intelligent vehicles. To address problems such as the low detection accuracy of traditional methods and the poor real-time performance of deep learning-based methodologies, a lane detection algorithm for intelligent vehicles in complex road conditions and dynamic environments was proposed. Firstly, the distorted image was corrected and edges were detected with a superposition threshold algorithm; an aerial view of the lane was then obtained via region-of-interest extraction and inverse perspective transformation. Secondly, the random sample consensus algorithm was adopted to fit the curves of lane lines based on a third-order B-spline curve model, and fitting evaluation and curvature radius calculation were then carried out on the fitted curve. Lastly, simulation experiments for the lane detection algorithm were performed using road driving video under complex road conditions and the TuSimple dataset. The experimental results show that the average detection accuracy on the road driving video reached 98.49% with an average processing time of 21.5 ms, and the average detection accuracy on the TuSimple dataset reached 98.42% with an average processing time of 22.2 ms. Compared with traditional methods and deep learning-based methodologies, this lane detection algorithm had excellent accuracy and real-time performance, high detection efficiency, and strong anti-interference ability, and both the recognition rate and the average processing time were significantly improved. The proposed algorithm is crucial to advancing intelligent-vehicle driving assistance and conducive to further improving the driving safety of intelligent vehicles.
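The curvature radius calculation mentioned above follows the standard formula R = (1 + x'(y)²)^(3/2) / |x''(y)|. The sketch below applies it to a quadratic lane fit x = a·y² + b·y + c rather than the paper's third-order B-spline, so it is an illustrative simplification:

```python
def curvature_radius(a, b, y):
    """Radius of curvature of the fitted lane x = a*y**2 + b*y + c at height y."""
    d1 = 2.0 * a * y + b  # first derivative dx/dy
    d2 = 2.0 * a          # second derivative d2x/dy2 (constant for a quadratic)
    return (1.0 + d1 * d1) ** 1.5 / abs(d2)
```

A nearly straight lane (small `a`) yields a very large radius, which is the expected sanity check on the fit.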


2011 ◽  
Vol 268-270 ◽  
pp. 1259-1264 ◽  
Author(s):  
Song Hai Fan ◽  
Dong Xue Xia

Robustness and real-time performance are of the greatest significance for the navigation of a patrol robot in a transformer substation. To meet this demand, a navigation and positioning approach based on color vision and RFID (Radio Frequency Identification) technology is presented in this paper. In the presented system, position information is provided by RFID tags and navigation is completed by extracting guidelines. Based on a deep analysis of the advantages and shortcomings of different color spaces, a new approach integrating the good real-time performance of grayscale image processing with the rich information of color image processing is presented to improve the robustness and real-time performance of navigation. A fast Hough transform is selected and combined with the least squares method to detect the navigation line. Experimental results show that the presented method can meet the real-time and robustness demands of patrol robot navigation.
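The least-squares refinement that follows the fast Hough transform can be sketched as a plain ordinary-least-squares line fit over the candidate guideline pixels. This is a generic illustration, not the authors' implementation:

```python
def least_squares_line(points):
    """Fit y = m*x + b to candidate line pixels by ordinary least squares."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
    b = (sy - m * sx) / n                          # intercept
    return m, b
```

In practice the Hough transform supplies the coarse line hypothesis, and this fit refines it against the actual edge pixels for a stable navigation line.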


2020 ◽  
Vol 8 (6) ◽  
pp. 3162-3165

Detecting and classifying objects in a single frame that contains several objects is a cumbersome task. With the advancement of deep learning techniques, detection accuracy has increased significantly. This paper aims to implement a state-of-the-art custom algorithm for the detection and classification of objects in a single frame, with the goal of attaining high accuracy at real-time performance. The proposed system utilizes an SSD architecture coupled with MobileNet to achieve maximum accuracy. The system is fast enough to detect and recognize multiple objects even at 30 FPS.
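A standard post-processing step in any SSD-style detector, including the SSD+MobileNet pipeline described here, is non-maximum suppression over the predicted boxes. The sketch below is generic greedy NMS, not the paper's specific code:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep highest-scoring boxes, drop heavily overlapping ones."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < thresh for j in keep):
            keep.append(i)
    return keep
```

Keeping NMS cheap matters here, since it runs on every frame of the 30 FPS stream after the network's forward pass.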


2021 ◽  
pp. 1-10
Author(s):  
Chen Li-quan ◽  
Li You ◽  
Fengjun Shen ◽  
Zhaoqimeng Shan ◽  
Jiaxuan Chen

Human skeleton extraction is a basic problem in the field of computer vision. With the rapid progress of science and technology, it has become a hot issue in target detection tasks such as pedestrian recognition, behavior monitoring, and pedestrian gesture recognition. In recent years, thanks to the development of deep neural networks, modeling of human joints in acquired images has made progress in skeleton extraction. However, most models suffer from low modeling accuracy, poor real-time performance, and poor usability. To address this human target detection problem, this paper studies a gesture recognition method based on a deep learning skeleton sequence model in sports scenes, aiming to provide a gesture recognition method with strong noise resistance, good real-time performance, and an accurate model. The article uses motion video frame images to train a VGG16 network; extracting skeleton information with the network strengthens the posture feature expression, HOG is used for feature extraction, and the Adam algorithm is used to optimize the network so that more posture features are extracted, thereby improving the network's posture recognition accuracy. The hyperparameters and structure of the base network are then adjusted according to the training results, and the key poses in the sports scene are obtained through the final classifier.
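The HOG feature extraction mentioned above bins gradient orientations per image cell, weighted by gradient magnitude. A minimal unsigned-orientation sketch (illustrative only, not the paper's configuration) is:

```python
import math

def hog_cell_histogram(cell, bins=9):
    """Orientation histogram (unsigned, 0-180 deg) for one grayscale cell."""
    h = [0.0] * bins
    rows, cols = len(cell), len(cell[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0
            h[int(ang // (180.0 / bins)) % bins] += mag
    return h
```

Concatenating and block-normalizing such histograms over the whole frame yields the posture feature vector fed to the classifier.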


2021 ◽  
Author(s):  
Muhammed Emir cakici ◽  
Feyza Yildirim Okay ◽  
Suat Ozdemir

2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Qingbo Ji ◽  
Chong Dai ◽  
Changbo Hou ◽  
Xun Li

With the increasing application of computer vision technology in autonomous driving, robots, and other mobile devices, more and more attention has been paid to implementing target detection and tracking algorithms on embedded platforms. The real-time performance and robustness of such algorithms are two hot research topics and challenges in this field. To solve the problems of poor real-time tracking performance of embedded systems using convolutional neural networks and the low robustness of tracking algorithms in complex scenes, this paper proposes a fast and accurate real-time video detection and tracking algorithm suitable for embedded systems. The algorithm combines the single-shot multibox detection object detection model of deep convolutional networks with the kernel correlation filters (KCF) tracking algorithm; moreover, it accelerates the single-shot multibox detection model using field-programmable gate arrays, which satisfies the algorithm's real-time requirement on the embedded platform. To solve the problem of model contamination after the KCF algorithm fails to track in complex scenes, an improved validity detection mechanism for tracking results is proposed, which solves the traditional KCF algorithm's inability to track robustly over a long time. To address the high missed-detection rate of the single-shot multibox detection model under motion blur or illumination variation, a strategy is proposed that effectively reduces missed detections. Experimental results on the embedded platform show that the algorithm can track the object in the video in real time and can automatically reposition the object to continue tracking after tracking fails.
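A common validity check for correlation-filter tracking is the peak-to-sidelobe ratio (PSR) of the response map: a low PSR suggests the tracker has drifted and the detector should re-initialize it. Whether the paper uses PSR specifically is not stated, so the following is only a plausible sketch of such a mechanism:

```python
def psr(response):
    """Peak-to-sidelobe ratio of a correlation response map (list of lists).

    A low PSR signals that the KCF tracker has likely lost the target,
    so the detector should be invoked to re-acquire it.
    """
    flat = [v for row in response for v in row]
    peak = max(flat)
    sidelobe = [v for v in flat if v != peak]
    mean = sum(sidelobe) / len(sidelobe)
    var = sum((v - mean) ** 2 for v in sidelobe) / len(sidelobe)
    return (peak - mean) / (var ** 0.5 + 1e-9)
```

Thresholding this score frame by frame is one way to trigger the "reposition and continue tracking" behavior the abstract describes.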


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Aamir Khan ◽  
Zhang Zhijiang ◽  
Yingjie Yu ◽  
Muhammad Amir Khan ◽  
Ketao Yan ◽  
...  

Recent developments in deep neural networks (DNNs) have opened an opportunity for a novel framework for holographic image reconstruction and phase recovery with real-time performance. Many deep learning-based techniques have been proposed for holographic image reconstruction, but these methods can still fall short in performance, time complexity, and accuracy. Due to iterative calculation, generating a computer-generated hologram (CGH) requires a long computation time. A novel deep generative adversarial network holography (GAN-Holo) framework is proposed for hologram reconstruction. This framework consists of two phases. In phase one, we used the Fresnel-based method to build the dataset. In the second phase, we trained on the raw input images and the holographic label images acquired in phase one. Our method is capable of noniterative generation of CGHs. The experimental results demonstrate that the proposed method outperforms existing methods.
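The Fresnel-based dataset generation in phase one presumably propagates each image numerically. A single sample of the textbook Fresnel transfer function, H(fx, fy) = exp(ikz)·exp(-iπλz(fx² + fy²)), can be sketched as below; this is the standard formula, not necessarily the authors' exact implementation:

```python
import cmath
import math

def fresnel_tf(fx, fy, wavelength, z):
    """One sample of the Fresnel transfer function at spatial frequency (fx, fy).

    Multiplying the field's spectrum by H propagates it a distance z.
    """
    k = 2.0 * math.pi / wavelength
    phase = -math.pi * wavelength * z * (fx * fx + fy * fy)
    return cmath.exp(1j * k * z) * cmath.exp(1j * phase)
```

Because H is a pure phase factor, |H| = 1 everywhere, so numerical propagation conserves energy, which is a useful sanity check when building such a dataset.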


2021 ◽  
Vol 18 (1) ◽  
pp. 172988142097836
Author(s):  
Cristian Vilar ◽  
Silvia Krug ◽  
Benny Thörnberg

3D object recognition has been a cutting-edge research topic since the popularization of depth cameras. These cameras enhance the perception of the environment and so are particularly suitable for autonomous robot navigation applications. Advanced deep learning approaches for 3D object recognition are based on complex algorithms and demand powerful hardware resources. However, autonomous robots and powered wheelchairs have limited resources, which affects the implementation of these algorithms for real-time performance. We propose to use instead a 3D voxel-based extension of the 2D histogram of oriented gradients (3DVHOG) as a handcrafted object descriptor for 3D object recognition in combination with a pose normalization method for rotational invariance and a supervised object classifier. The experimental goal is to reduce the overall complexity and the system hardware requirements, and thus enable a feasible real-time hardware implementation. This article compares the 3DVHOG object recognition rates with those of other 3D recognition approaches, using the ModelNet10 object data set as a reference. We analyze the recognition accuracy for 3DVHOG using a variety of voxel grid selections, different numbers of neurons (Nh) in the single hidden layer feedforward neural network, and feature dimensionality reduction using principal component analysis. The experimental results show that the 3DVHOG descriptor achieves a recognition accuracy of 84.91% with a total processing time of 21.4 ms. Despite the lower recognition accuracy, this is close to the current state-of-the-art approaches for deep learning while enabling real-time performance.
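The PCA-based feature dimensionality reduction mentioned above ranks directions by variance and keeps the leading ones. As a minimal stand-in (two dimensions only, with closed-form eigenvalues of the 2×2 covariance matrix; not the article's 3DVHOG pipeline):

```python
def pca_2d(points):
    """Eigenvalues of the 2x2 covariance matrix of 2-D data, largest first.

    A toy analogue of PCA: the eigenvalue sizes show how much variance
    each principal direction carries, which drives dimensionality reduction.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    tr, det = cxx + cyy, cxx * cyy - cxy * cxy
    disc = (tr * tr / 4.0 - det) ** 0.5
    return tr / 2.0 + disc, tr / 2.0 - disc
```

For real 3DVHOG features one would eigendecompose the full covariance matrix and project onto the leading eigenvectors before the feedforward classifier.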

