Vehicle Pedestrian Detection Method Based on Spatial Pyramid Pooling and Attention Mechanism

Information ◽  
2020 ◽  
Vol 11 (12) ◽  
pp. 583
Author(s):  
Mingtao Guo ◽  
Donghui Xue ◽  
Peng Li ◽  
He Xu

Object detection for vehicles and pedestrians is extremely difficult to achieve in autopilot applications for the Internet of vehicles, and it is a task that requires the ability to locate and identify smaller targets even in complex environments. This paper proposes a single-stage object detection network (YOLOv3-promote) for the detection of vehicles and pedestrians in complex environments in cities, which improves on the traditional You Only Look Once version 3 (YOLOv3). First, spatial pyramid pooling is used to fuse local and global features in an image to better enrich the expression ability of the feature map and to more effectively detect targets with large size differences in the image; second, an attention mechanism is added to the feature map to weight each channel, thereby enhancing key features and removing redundant features, which allows for strengthening the ability of the feature network to discriminate between target objects and backgrounds; lastly, the anchor box derived from the K-means clustering algorithm is fitted to the final prediction box to complete the positioning and identification of target vehicles and pedestrians. The experimental results show that the proposed method achieved 91.4 mAP (mean average precision), 83.2 F1 score, and 43.7 frames per second (FPS) on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset, and the detection performance was superior to the conventional YOLOv3 algorithm in terms of both accuracy and speed.
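As a rough illustration of the anchor-box step mentioned in the abstract above, the sketch below clusters ground-truth box sizes with K-means. It assumes the 1 - IoU distance between (width, height) pairs that is common in the YOLO family; the abstract does not state which distance the authors used. The function names, the toy boxes, and the value of k are hypothetical.

```python
import random


def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors (assumed distance: 1 - IoU)."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # start from k distinct training boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            best = max(range(k), key=lambda i: iou_wh(b, centroids[i]))
            clusters[best].append(b)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return sorted(centroids, key=lambda wh: wh[0] * wh[1])


# Toy data: three small (pedestrian-like) and three large (vehicle-like) boxes.
toy_boxes = [(10, 20), (12, 22), (11, 21), (100, 200), (110, 210), (105, 205)]
anchors = kmeans_anchors(toy_boxes, k=2)
```

On this toy data the two anchors settle on the mean size of each group. A real detector would cluster the box sizes of the whole training set and typically use more anchors.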

2021 ◽  
Vol 1 (1) ◽  
pp. 29-31
Author(s):  
Mahmood Haithami ◽  
Amr Ahmed ◽  
Iman Yi Liao ◽  
Hamid Jalab

In this paper, we aim to enhance the segmentation capabilities of DeeplabV3 by employing a Gated Recurrent Unit (GRU). The 1-by-1 convolution that follows the Atrous Spatial Pyramid Pooling (ASPP) layer in DeeplabV3 was replaced by a GRU to combine the input feature maps. Both the convolution and the GRU have shareable parameters, but the latter has gates that enable or disable the contribution of each input feature map. Experiments on unseen test sets demonstrate that employing a GRU instead of the convolution produces better segmentation results. The datasets used are public datasets provided by the MedAI competition.
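To make the gating idea concrete, here is a minimal scalar GRU sketch. It is not the authors' layer: it assumes the feature maps are fed one after another as a sequence at a single pixel, uses the convention h = (1 - z) * h_prev + z * candidate, and all parameter names and values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One scalar GRU update; p holds hypothetical weights and biases."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    # z near 0 keeps the old state (input ignored); z near 1 adopts the candidate.
    return (1.0 - z) * h + z * cand


def fuse(features, p):
    """Fold a sequence of per-pixel feature values into one state."""
    h = 0.0
    for x in features:
        h = gru_step(x, h, p)
    return h


base = {"wz": 0.0, "uz": 0.0, "wr": 0.0, "ur": 0.0, "br": 0.0,
        "wh": 1.0, "uh": 0.0, "bh": 0.0}
closed = dict(base, bz=-50.0)  # update gate shut: inputs contribute nothing
opened = dict(base, bz=50.0)   # update gate open: the state follows the input

fused_closed = fuse([1.0, 2.0, 3.0], closed)
fused_opened = fuse([1.0, 2.0, 3.0], opened)
```

With the gate shut the fused value stays at essentially zero, while with the gate open it tracks the last input. A plain 1-by-1 convolution has no such switch, since it always mixes every input map with fixed weights.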


2018 ◽  
Vol 8 (9) ◽  
pp. 1590 ◽  
Author(s):  
Jia Li ◽  
Yujuan Si ◽  
Liuqi Lang ◽  
Lixun Liu ◽  
Tao Xu

Accurate electrocardiogram (ECG) beat classification can benefit the diagnosis of cardiovascular disease. Deep convolutional neural networks (CNNs) can automatically extract valid features from data, which makes them an effective tool for classifying ECG beats. However, the fully-connected layer in a CNN requires a fixed input dimension, which restricts CNNs to fixed-scale inputs. Signals of different scales are generally brought to the same size by segmentation and downsampling. If information is lost during this resizing, the classification accuracy ultimately suffers. To solve this problem, this paper constructs a new CNN framework based on the spatial pyramid pooling (SPP) method, which removes the limitation imposed by the size of the input data. The Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) arrhythmia database is employed as the training and testing data for classifying heartbeat signals into six categories. Compared with the traditional method, which may lose a large amount of important information and is prone to over-fitting, the proposed method gains robustness by extracting data features at different sizes. Experimental results show that the proposed network architecture can extract more high-quality features and exhibits higher classification accuracy (94%) than traditional deep CNNs (90.4%).
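The property that matters here is that SPP turns inputs of any length into a vector of fixed length. The sketch below shows this for a one-dimensional signal with max pooling; the pyramid levels (1, 2, 4) and the function name are assumptions for illustration, not the configuration used in the paper.

```python
def spp_1d(signal, levels=(1, 2, 4)):
    """Max-pool a 1-D signal into sum(levels) values, whatever its length."""
    length = len(signal)
    if length == 0:
        raise ValueError("signal must not be empty")
    pooled = []
    for n in levels:
        for i in range(n):
            start = (i * length) // n                # floor(i * L / n)
            end = ((i + 1) * length + n - 1) // n    # ceil((i + 1) * L / n)
            pooled.append(max(signal[start:end]))    # each bin is non-empty
    return pooled


short_beat = spp_1d([1, 5, 2, 8], levels=(1, 2))
print(short_beat)  # [8, 5, 8]

# Two signals of different lengths yield vectors of the same size.
vec_a = spp_1d(list(range(10)))
vec_b = spp_1d(list(range(37)))
```

Because the output size depends only on the pyramid levels, a fully-connected layer placed after this step always sees the same input dimension.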


Diagnostics ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 1497
Author(s):  
Mohd Asyraf Zulkifley ◽  
Siti Raihanah Abdani ◽  
Nuraisyah Hani Zulkifley ◽  
Mohamad Ibrani Shahrimin

Since the start of the COVID-19 pandemic at the end of 2019, more than 170 million patients have been infected with the virus, which has resulted in more than 3.8 million deaths all over the world. This disease spreads easily from one person to another even with minimal contact, even more so for the latest mutations, which are more deadly than their predecessors. Hence, COVID-19 needs to be diagnosed as early as possible to minimize the risk of spreading among the community. However, the laboratory results of the diagnosis method approved by the World Health Organization, the reverse transcription-polymerase chain reaction test, take around a day to be processed, and a longer period is observed in developing countries. Therefore, a fast screening method that is based on existing facilities should be developed to complement this diagnosis test, so that a suspected patient can be isolated in a quarantine center. In line with this motivation, deep learning techniques were explored to provide an automated COVID-19 screening system based on X-ray imaging. This imaging modality was chosen because of its low-cost procedures, which are widely available even in many small clinics. A new convolutional neural network (CNN) model is proposed instead of utilizing pre-trained networks of existing models. The proposed network, Residual-Shuffle-Net, comprises four stacks of the residual-shuffle unit followed by a spatial pyramid pooling (SPP) unit. The architecture of the residual-shuffle unit follows an hourglass design with a reduced convolution filter size in the middle layer, where a shuffle operation is performed right after the split branches have been concatenated back. The shuffle operation forces the network to learn multiple sets of feature relationships across various channels instead of a single set of global features. The SPP unit, which is placed at the end of the network, allows the model to learn multi-scale features that are crucial to distinguish between COVID-19 and other types of pneumonia. The proposed network is benchmarked against 12 other state-of-the-art CNN models that have been designed and tuned specially for COVID-19 detection. The experimental results show that the Residual-Shuffle-Net produced the best performance in terms of accuracy and specificity, with 0.97390 and 0.98695, respectively. The model is also considered lightweight, with slightly more than 2 million parameters, which makes it suitable for mobile-based applications. For future work, an attention mechanism can be integrated to target certain regions of interest in the X-ray images that are deemed to be more informative for COVID-19 diagnosis.
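The shuffle operation described above can be sketched on channel indices alone. The sketch assumes a ShuffleNet-style shuffle (view the channels as groups, transpose, flatten); the abstract does not spell out the exact variant, and the function name and group count are hypothetical.

```python
def channel_shuffle(channels, groups):
    """Interleave channels so each output neighborhood mixes all groups."""
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    # Read the (groups x per_group) layout column by column.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Channels 0-2 come from one branch, channels 3-5 from the other.
shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)
print(shuffled)  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, adjacent channels come from different branches, so a following grouped or split convolution sees features from both.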


2022 ◽  
pp. 1-1
Author(s):  
Yu Qiu ◽  
Yun Liu ◽  
Yanan Chen ◽  
Jianwen Zhang ◽  
Jinchao Zhu ◽  
...  

2020 ◽  
Vol 2020 ◽  
pp. 1-10 ◽  
Author(s):  
Tanvir Ahmad ◽  
Yinglong Ma ◽  
Muhammad Yahya ◽  
Belal Ahmad ◽  
Shah Nazir ◽  
...  

Object detection has recently achieved tremendous success, but detecting and identifying objects both accurately and quickly remains a very challenging task. Human beings can detect and recognize multiple objects in images or videos with ease regardless of the object’s appearance, but for computers it is challenging to identify and distinguish between things. In this paper, a modified YOLOv1-based neural network is proposed for object detection. The new neural network model has been improved in the following ways. Firstly, the loss function of the YOLOv1 network is modified: the improved model replaces the margin style with a proportion style. Compared to the old loss function, the new one is more flexible and more reasonable in optimizing the network error. Secondly, a spatial pyramid pooling layer is added. Thirdly, an inception model with a 1 × 1 convolution kernel is added, which reduces the number of weight parameters of the layers. Extensive experiments on the Pascal VOC 2007/2012 datasets showed that the proposed method achieved better performance.
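The parameter saving from a 1 × 1 convolution can be checked with simple arithmetic. The sketch below counts weights (biases ignored) for a direct 3 × 3 layer and for a 1 × 1 reduction followed by a 3 × 3 layer; the channel counts 256 and 64 are hypothetical and are not taken from the paper.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in a square convolution layer, biases ignored."""
    return kernel * kernel * in_channels * out_channels


# Direct 3 x 3 convolution from 256 to 256 channels.
direct = conv_weights(3, 256, 256)

# 1 x 1 reduction to 64 channels, then 3 x 3 back up to 256 channels.
reduced = conv_weights(1, 256, 64) + conv_weights(3, 64, 256)

print(direct, reduced)  # 589824 163840
```

In this example the reduced path needs about 28% of the weights of the direct layer, which is the kind of saving the abstract attributes to the 1 × 1 kernel.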


2020 ◽  
Vol 522 ◽  
pp. 241-258 ◽  
Author(s):  
Zhanchao Huang ◽  
Jianlin Wang ◽  
Xuesong Fu ◽  
Tao Yu ◽  
Yongqi Guo ◽  
...  
