Vehicle Pedestrian Detection Method Based on Spatial Pyramid Pooling and Attention Mechanism

Mingtao Guo; Donghui Xue; Peng Li; He Xu

doi:10.3390/info11120583

Vehicle Pedestrian Detection Method Based on Spatial Pyramid Pooling and Attention Mechanism

Information ◽

10.3390/info11120583 ◽

2020 ◽

Vol 11 (12) ◽

pp. 583

Author(s):

Mingtao Guo ◽

Donghui Xue ◽

Peng Li ◽

He Xu

Keyword(s):

Object Detection ◽

Clustering Algorithm ◽

Pedestrian Detection ◽

Attention Mechanism ◽

Complex Environments ◽

Global Features ◽

Feature Map ◽

Spatial Pyramid Pooling ◽

Institute Of Technology ◽

Spatial Pyramid

Detecting vehicles and pedestrians is extremely difficult in autopilot applications for the Internet of Vehicles, because the detector must locate and identify small targets even in complex environments. This paper proposes a single-stage object detection network (YOLOv3-promote) for detecting vehicles and pedestrians in complex urban environments; it improves on the conventional You Only Look Once version 3 (YOLOv3). First, spatial pyramid pooling fuses local and global image features, which enriches the expressive power of the feature map and makes targets of widely differing sizes easier to detect. Second, an attention mechanism weights each channel of the feature map, enhancing key features and suppressing redundant ones, which strengthens the network's ability to discriminate between target objects and background. Lastly, anchor boxes derived with the K-means clustering algorithm are fitted to the final prediction boxes to complete the localization and identification of target vehicles and pedestrians. Experimental results show that the proposed method achieved 91.4 mAP (mean average precision), an F1 score of 83.2, and 43.7 frames per second (FPS) on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset, outperforming the conventional YOLOv3 algorithm in both accuracy and speed.
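The anchor-box step can be illustrated with a minimal sketch of K-means clustering over ground-truth box sizes. The abstract does not say which distance the authors use; the sketch assumes the 1 - IoU criterion popularized by the YOLO family, and the names `iou_wh` and `kmeans_anchors` are hypothetical helpers, not taken from the paper.

```python
import random


def iou_wh(box, centroid):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) tuples into k anchors.

    Assumption: a box belongs to the centroid with the highest IoU
    (equivalently the lowest 1 - IoU distance).
    """
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_centroids.append((w, h))
            else:
                # Keep the old centroid if no box was assigned to it.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Return anchors ordered from smallest to largest area.
    return sorted(centroids, key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    # Illustrative box sizes in pixels, not KITTI data.
    sizes = [(10, 20), (12, 22), (11, 21), (100, 50), (104, 54), (102, 52)]
    print(kmeans_anchors(sizes, k=2))
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is why it is a common choice for anchor selection; whether the paper does the same is not stated in the abstract.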


Employing GRU to combine feature maps in DeeplabV3 for a better segmentation model

Nordic Machine Intelligence ◽

10.5617/nmi.9131 ◽

2021 ◽

Vol 1 (1) ◽

pp. 29-31

Author(s):

Mahmood Haithami ◽

Amr Ahmed ◽

Iman Yi Liao ◽

Hamid Jalab

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Feature Maps ◽

Feature Map ◽

Input Feature ◽

Spatial Pyramid Pooling ◽

Test Sets ◽

Public Datasets ◽

Spatial Pyramid

In this paper, we aim to enhance the segmentation capabilities of DeeplabV3 by employing a gated recurrent unit (GRU). The 1-by-1 convolution that follows the Atrous Spatial Pyramid Pooling (ASPP) layer in DeeplabV3 was replaced by a GRU to combine the input feature maps. Both the convolution and the GRU have shareable parameters, but the GRU also has gates that enable or disable the contribution of each input feature map. Experiments on unseen test sets demonstrate that employing a GRU instead of the convolution produces better segmentation results. The datasets used are public datasets provided by the MedAI competition.
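To make the contrast concrete, the sketch below compares the two ways of combining feature maps at a single pixel. It is an illustration only: the abstract does not describe how the GRU is wired into DeeplabV3, so the scalar parameters, the dictionary `p`, and the function names are assumptions. A 1-by-1 convolution is a fixed weighted sum over the input maps, whereas a GRU steps through the maps as a sequence and its update gate `z` decides how much each one changes the running state.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1x1_pixel(values, weights, bias=0.0):
    """1-by-1 convolution at one pixel: a weighted sum over input feature maps."""
    return sum(w * v for w, v in zip(weights, values)) + bias


def gru_pixel(values, p):
    """Scalar GRU over the feature-map values at one pixel.

    `p` holds hypothetical scalar weights; the same weights are reused for
    every feature map (parameter sharing across the sequence).
    """
    h = 0.0
    for x in values:
        z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
        r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
        cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
        h = (1.0 - z) * h + z * cand  # z near 0 ignores this feature map
    return h


if __name__ == "__main__":
    maps = [1.0, 2.0, 3.0]  # values of three feature maps at one pixel
    base = {"wz": 0.0, "uz": 0.0, "wr": 0.0, "ur": 0.0, "br": 0.0,
            "wh": 1.0, "uh": 0.0, "bh": 0.0}
    print(conv1x1_pixel(maps, [0.5, 0.5, 0.5]))
    print(gru_pixel(maps, dict(base, bz=-50.0)))  # gate closed
    print(gru_pixel(maps, dict(base, bz=50.0)))   # gate open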


A Spatial Pyramid Pooling-Based Deep Convolutional Neural Network for the Classification of Electrocardiogram Beats

Applied Sciences ◽

10.3390/app8091590 ◽

2018 ◽

Vol 8 (9) ◽

pp. 1590 ◽

Cited By ~ 2

Author(s):

Jia Li ◽

Yujuan Si ◽

Liuqi Lang ◽

Lixun Liu ◽

Tao Xu

Keyword(s):

Classification Accuracy ◽

Massachusetts Institute Of Technology ◽

Deep Convolutional Neural Networks ◽

Beat Classification ◽

Spatial Pyramid Pooling ◽

Institute Of Technology ◽

Fixed Input ◽

Fully Connected ◽

Spatial Pyramid

Accurate electrocardiogram (ECG) beat classification can benefit the diagnosis of cardiovascular disease. Deep convolutional neural networks (CNNs) can automatically extract valid features from data, which makes them an effective tool for classifying ECG beats. However, the fully connected layer in a CNN requires a fixed input dimension, which restricts CNNs to fixed-scale inputs. Signals of different scales are generally brought to the same size by segmentation and downsampling. If information is lost during this resizing, the classification accuracy ultimately suffers. To solve this problem, this paper constructs a new CNN framework that uses spatial pyramid pooling (SPP), which removes the limitation imposed by the size of the input data. The Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) arrhythmia database is employed as the training and testing data for classifying heartbeat signals into six categories. Compared with the traditional method, which may lose a large amount of important information and is prone to over-fitting, the proposed method gains robustness by extracting data features at different sizes. Experimental results show that the proposed architecture extracts more high-quality features and achieves higher classification accuracy (94%) than traditional deep CNNs (90.4%).
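The reason SPP removes the fixed-input constraint can be shown with a short sketch for a one-dimensional signal. The pyramid levels `(1, 2, 4)` and the function name `spp_1d` are assumptions chosen for illustration, not the configuration used in the paper: each level splits the sequence into a fixed number of bins and max-pools each bin, so the output length depends only on the levels, never on the input length.

```python
def spp_1d(signal, levels=(1, 2, 4)):
    """Max-pool a 1-D sequence into sum(levels) values, whatever its length.

    Assumption: bins at each level may overlap by one sample when the
    length is not divisible by the bin count (floor start, ceil end).
    """
    n = len(signal)
    if n < max(levels):
        raise ValueError("signal is shorter than the finest pyramid level")
    pooled = []
    for bins in levels:
        for i in range(bins):
            start = (i * n) // bins
            end = ((i + 1) * n + bins - 1) // bins  # integer ceiling
            pooled.append(max(signal[start:end]))
    return pooled


if __name__ == "__main__":
    short_beat = [1, 5, 2, 8, 3]
    long_beat = list(range(12))
    # Both outputs have 1 + 2 + 4 = 7 entries.
    print(len(spp_1d(short_beat)), len(spp_1d(long_beat)))
```

Because both outputs have the same length, a fully connected layer can follow directly, without segmenting or downsampling the beats to a common size first.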


Residual-Shuffle Network with Spatial Pyramid Pooling Module for COVID-19 Screening

Diagnostics ◽

10.3390/diagnostics11081497 ◽

2021 ◽

Vol 11 (8) ◽

pp. 1497

Author(s):

Mohd Asyraf Zulkifley ◽

Siti Raihanah Abdani ◽

Nuraisyah Hani Zulkifley ◽

Mohamad Ibrani Shahrimin

Keyword(s):

Imaging Modality ◽

Screening Method ◽

Middle Layer ◽

World Health ◽

Polymerase Chain Reaction Test ◽

Global Features ◽

X Ray ◽

The World ◽

Spatial Pyramid Pooling ◽

Spatial Pyramid

Since the start of the COVID-19 pandemic at the end of 2019, more than 170 million patients have been infected with the virus, which has resulted in more than 3.8 million deaths worldwide. The disease spreads easily from one person to another even with minimal contact, and even more so for the latest mutations, which are deadlier than their predecessor. Hence, COVID-19 needs to be diagnosed as early as possible to minimize the risk of spread in the community. However, laboratory results from the diagnosis method approved by the World Health Organization, the reverse transcription-polymerase chain reaction test, take around a day to be processed, and longer in developing countries. Therefore, a fast screening method based on existing facilities should be developed to complement this diagnosis test, so that a suspected patient can be isolated in a quarantine center. In line with this motivation, deep learning techniques were explored to provide an automated COVID-19 screening system based on X-ray imaging. This imaging modality was chosen because its low-cost procedures are widely available, even in many small clinics. A new convolutional neural network (CNN) model is proposed instead of utilizing pre-trained networks from existing models. The proposed network, Residual-Shuffle-Net, comprises four stacks of the residual-shuffle unit followed by a spatial pyramid pooling (SPP) unit. The architecture of the residual-shuffle unit follows an hourglass design with a reduced convolution filter size in the middle layer, and a shuffle operation is performed right after the split branches have been concatenated back. The shuffle operation forces the network to learn multiple sets of feature relationships across various channels instead of a single set of global features.
The SPP unit, which is placed at the end of the network, allows the model to learn multi-scale features that are crucial for distinguishing COVID-19 from other types of pneumonia. The proposed network is benchmarked against 12 other state-of-the-art CNN models that have been designed and tuned specifically for COVID-19 detection. The experimental results show that Residual-Shuffle-Net produced the best performance in terms of accuracy and specificity, with 0.97390 and 0.98695, respectively. With slightly more than 2 million parameters, the model is also lightweight, which makes it suitable for mobile-based applications. For future work, an attention mechanism can be integrated to target regions of interest in the X-ray images that are deemed more informative for COVID-19 diagnosis.
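The shuffle operation performed after the split branches are concatenated can be sketched as a reordering of channel indices. The abstract does not spell out the exact permutation; the sketch below assumes the ShuffleNet-style interleaving (view the channels as a groups-by-channels-per-group grid, transpose, flatten), and `channel_shuffle` is a hypothetical helper operating on channel labels rather than tensors.

```python
def channel_shuffle(channels, groups):
    """Interleave channels so that each group contributes in turn.

    Equivalent to reshaping to (groups, per_group), transposing, and
    flattening, which is the assumed ShuffleNet-style permutation.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    # Two concatenated branches, 'a' and 'b', with three channels each.
    concatenated = ["a0", "a1", "a2", "b0", "b1", "b2"]
    print(channel_shuffle(concatenated, groups=2))
```

After the shuffle, any contiguous slice of channels mixes both branches, so a following grouped or split operation sees features from every branch rather than from one alone.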


A2SPPNet: Attentive Atrous Spatial Pyramid Pooling Network for Salient Object Detection

IEEE Transactions on Multimedia ◽

10.1109/tmm.2022.3141933 ◽

2022 ◽

pp. 1-1

Author(s):

Yu Qiu ◽

Yun Liu ◽

Yanan Chen ◽

Jianwen Zhang ◽

Jinchao Zhu ◽

...

Keyword(s):

Object Detection ◽

Salient Object Detection ◽

Salient Object ◽

Spatial Pyramid Pooling ◽

Spatial Pyramid


Thermal Object Detection Using Yolov3 and Spatial Pyramid Pooling

Algorithms for Intelligent Systems - Proceedings of International Conference on Machine Intelligence and Data Science Applications ◽

10.1007/978-981-33-4087-9_46 ◽

2021 ◽

pp. 553-565

Author(s):

Sachin Kumar ◽

Deepak Gaur

Keyword(s):

Object Detection ◽

Spatial Pyramid Pooling ◽

Thermal Object ◽

Spatial Pyramid


Object Detection through Modified YOLO Neural Network

Scientific Programming ◽

10.1155/2020/8403262 ◽

2020 ◽

Vol 2020 ◽

pp. 1-10 ◽

Cited By ~ 1

Author(s):

Tanvir Ahmad ◽

Yinglong Ma ◽

Muhammad Yahya ◽

Belal Ahmad ◽

Shah Nazir ◽

...

Keyword(s):

Neural Network ◽

Object Detection ◽

Loss Function ◽

Convolution Kernel ◽

Human Beings ◽

Multiple Objects ◽

Fast Speed ◽

Spatial Pyramid Pooling ◽

Improved Model ◽

Spatial Pyramid

Object detection has recently achieved tremendous success, but detecting and identifying objects both accurately and quickly remains very challenging. Human beings can detect and recognize multiple objects in images or videos with ease regardless of an object's appearance, but for computers it is challenging to identify and distinguish between things. In this paper, a modified YOLOv1-based neural network is proposed for object detection. The new model improves on the original in three ways. Firstly, the loss function of the YOLOv1 network is modified: the improved model replaces the margin style with a proportion style, and the new loss function is more flexible and more reasonable than the old one in optimizing the network error. Secondly, a spatial pyramid pooling layer is added. Thirdly, an inception model with a 1 × 1 convolution kernel is added, which reduces the number of weight parameters in the layers. Extensive experiments on the Pascal VOC 2007/2012 datasets showed that the proposed method achieved better performance.
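The parameter saving from a 1 × 1 convolution can be checked with simple arithmetic. The channel counts below (256 in, 64 in the bottleneck, 256 out) are illustrative assumptions and do not come from the paper; `conv_weights` is a hypothetical helper that counts the weights of a k × k convolution while ignoring biases.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_channels * out_channels


if __name__ == "__main__":
    # Direct 3x3 convolution, 256 -> 256 channels.
    direct = conv_weights(3, 256, 256)
    # 1x1 reduction to 64 channels, then a 3x3 convolution back to 256.
    bottleneck = conv_weights(1, 256, 64) + conv_weights(3, 64, 256)
    print(direct, bottleneck)  # 589824 163840
```

Under these assumed sizes, inserting the 1 × 1 layer cuts the weight count from 589,824 to 163,840, a reduction of roughly 72%.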


DC-SPP-YOLO: Dense connection and spatial pyramid pooling based YOLO for object detection

Information Sciences ◽

10.1016/j.ins.2020.02.067 ◽

2020 ◽

Vol 522 ◽

pp. 241-258 ◽

Cited By ~ 13

Author(s):

Zhanchao Huang ◽

Jianlin Wang ◽

Xuesong Fu ◽

Tao Yu ◽

Yongqi Guo ◽

...

Keyword(s):

Object Detection ◽

Spatial Pyramid Pooling ◽

Spatial Pyramid


Backtracking Spatial Pyramid Pooling-Based Image Classifier for Weakly Supervised Top–Down Salient Object Detection

IEEE Transactions on Image Processing ◽

10.1109/tip.2018.2864891 ◽

2018 ◽

Vol 27 (12) ◽

pp. 6064-6078 ◽

Cited By ~ 6

Author(s):

Hisham Cholakkal ◽

Jubin Johnson ◽

Deepu Rajan

Keyword(s):

Object Detection ◽

Salient Object Detection ◽

Salient Object ◽

Top Down ◽

Spatial Pyramid Pooling ◽

Weakly Supervised ◽

Spatial Pyramid


Efficient final output feature map processing method supporting real-time object detection and recognition

2020 International SoC Design Conference (ISOCC) ◽

10.1109/isocc50952.2020.9333051 ◽

2020 ◽

Author(s):

Seong Bin Choi ◽

Sang-Seol Lee ◽

Jonghee Park ◽

Sung-Joon Jang ◽

Byung-Ho Choi

Keyword(s):

Object Detection ◽

Real Time ◽

Processing Method ◽

Feature Map ◽

Map Processing ◽

Final Output ◽

Detection And Recognition


YOLOv4 Object Detection Algorithm with Efficient Channel Attention Mechanism

2020 5th International Conference on Mechanical, Control and Computer Engineering (ICMCCE) ◽

10.1109/icmcce51767.2020.00387 ◽

2020 ◽

Author(s):

Cui Gao ◽

Qiang Cai ◽

Shaofeng Ming

Keyword(s):

Object Detection ◽

Detection Algorithm ◽

Attention Mechanism
