CP-SSD: Context Information Scene Perception Object Detection Based on SSD

2019 ◽  
Vol 9 (14) ◽  
pp. 2785 ◽  
Author(s):  
Yun Jiang ◽  
Tingting Peng ◽  
Ning Tan

Single Shot MultiBox Detector (SSD) has achieved good results in object detection, but it suffers from problems such as insufficient understanding of context information and loss of features in deep layers. To alleviate these problems, we propose a single-shot object detection network, Context Perception-SSD (CP-SSD). CP-SSD improves the network’s understanding of context information by using context information scene perception modules to capture context information for objects of different scales. The deep-layer feature maps are passed through a semantic activation module, which uses self-supervised learning to adjust the context feature information and channel interdependence and to enhance useful semantic information. CP-SSD was validated on the benchmark dataset PASCAL VOC 2007. The experimental results show that the mean Average Precision (mAP) of the CP-SSD detection method reaches 77.8%, which is 0.6 percentage points higher than that of SSD, and that detection is significantly improved in images where the object is difficult to distinguish from the background.

Author(s):  
Aofeng Li ◽  
Xufang Zhu ◽  
Shuo He ◽  
Jiawei Xia

Abstract: Traditional visual water surface object detection has deficiencies such as non-detection zones and failure to acquire global information, and the single-shot multibox detector (SSD) object detection algorithm has deficiencies such as poor remote detection and low detection precision for small objects. To address these, this study proposes a water surface object detection algorithm for panoramic vision based on an improved SSD. We reconstruct the backbone network of the SSD algorithm, replacing VGG16 with a ResNet-50 network, and add five layers of feature extraction. More abundant semantic information for the shallow feature maps is obtained through a feature pyramid network structure with deconvolution. An experiment is conducted by building a water surface object dataset. The results show that the mean Average Precision (mAP) of the improved algorithm is 4.03% higher than that of the existing SSD detection algorithm. The improved algorithm can effectively improve the overall detection precision of water surface objects and enhance the detection of remote objects.


2019 ◽  
Vol 9 (9) ◽  
pp. 1829 ◽  
Author(s):  
Jie Jiang ◽  
Hui Xu ◽  
Shichang Zhang ◽  
Yujie Fang

This study proposes a multiheaded object detection algorithm referred to as MANet. The main purpose of the study is to integrate feature layers of different scales based on the attention mechanism and to enhance contextual connections. To achieve this, we first replaced the feed-forward base network of the single-shot detector with ResNet–101 (inspired by the Deconvolutional Single-Shot Detector) and then applied linear interpolation and the attention mechanism. The information of the feature layers at different scales was fused to improve the accuracy of target detection. The primary contributions of this study are (a) a fusion attention mechanism and (b) a multiheaded attention fusion method. Our final MANet detector model effectively unifies the feature information among the feature layers at different scales, enabling it to detect objects of different sizes with higher precision. Using MANet with a 512 × 512 input and a ResNet–101 backbone, we obtained a mean accuracy of 82.7% on the PASCAL Visual Object Classes (VOC) 2007 test set. These results demonstrate that the proposed method yields better accuracy than the conventional single-shot detector (SSD) and other advanced detectors.


Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6530
Author(s):  
Ruihong Yin ◽  
Wei Zhao ◽  
Xudong Fan ◽  
Yongfeng Yin

There are a large number of studies on geospatial object detection. However, many existing methods only focus on either accuracy or speed. Methods with both fast speed and high accuracy are of great importance in some scenes, like search and rescue, and military information acquisition. In remote sensing images, there are some targets that are small and have few textures and low contrast compared with the background, which impose challenges on object detection. In this paper, we propose an accurate and fast single shot detector (AF-SSD) for high spatial remote sensing imagery to solve these problems. Firstly, we design a lightweight backbone to reduce the number of trainable parameters of the network. In this lightweight backbone, we also use some wide and deep convolutional blocks to extract more semantic information and keep the high detection precision. Secondly, a novel encoding–decoding module is employed to detect small targets accurately. With up-sampling and summation operations, the encoding–decoding module can add strong high-level semantic information to low-level features. Thirdly, we design a cascade structure with spatial and channel attention modules for targets with low contrast (named low-contrast targets) and few textures (named few-texture targets). The spatial attention module can extract long-range features for few-texture targets. By weighting each channel of a feature map, the channel attention module can guide the network to concentrate on easily identifiable features for low-contrast and few-texture targets. The experimental results on the NWPU VHR-10 dataset show that our proposed AF-SSD achieves superior detection performance: parameters 5.7 M, mAP 88.7%, and 0.035 s per image on average on an NVIDIA GTX-1080Ti GPU.
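The channel-weighting idea described above can be illustrated with a minimal sketch. It is not the AF-SSD module: the function name `channel_attention` is hypothetical, and a single scalar weight and bias per channel stand in for the learned layers a real attention module would use. It only shows the pattern of pooling each channel, turning the pooled value into a gate, and scaling the channel by that gate.

```python
import math

def channel_attention(feature_map, weights, biases):
    """Reweight each channel of a C x H x W feature map given as nested lists.

    Simplified sketch: one scalar weight and bias per channel replace the
    learned layers of a real attention module (an assumption for illustration).
    """
    reweighted = []
    gates = []
    for channel, w, b in zip(feature_map, weights, biases):
        # Squeeze: global average pooling reduces the channel to one number.
        values = [v for row in channel for v in row]
        pooled = sum(values) / len(values)
        # Gate: a sigmoid maps the pooled statistic to a weight in (0, 1).
        gate = 1.0 / (1.0 + math.exp(-(w * pooled + b)))
        gates.append(gate)
        # Scale: every activation in the channel is multiplied by its gate.
        reweighted.append([[v * gate for v in row] for row in channel])
    return reweighted, gates

features = [
    [[1.0, 1.0], [1.0, 1.0]],  # channel 0, mean 1.0
    [[0.0, 0.0], [0.0, 0.0]],  # channel 1, mean 0.0
]
scaled, gates = channel_attention(features, weights=[1.0, 1.0], biases=[0.0, 0.0])
```

In this toy input the channel with the larger mean activation receives the larger gate, so it contributes more strongly to later layers.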


2021 ◽  
Author(s):  
Lu Tan ◽  
Tianran Huangfu ◽  
Liyao Wu ◽  
Wenying Chen

Abstract Background: The correct identification of pills is very important to ensure the safe administration of drugs to patients. We used three mainstream object detection models, Faster R-CNN, Single Shot Multi-Box Detector (SSD), and You Only Look Once v3 (YOLO v3), to identify pills and compared their performance. Methods: In this paper, we introduce the basic principles of the three object detection models. We trained each algorithm on a pill image dataset and analyzed the performance of the three models to determine the best pill recognition model. Finally, the models were used to detect difficult samples, and the results were compared. Results: The mean average precision (MAP) of Faster R-CNN reached 87.69%, but YOLO v3 had a significant advantage in detection speed, with a frames per second (FPS) rate more than eight times that of Faster R-CNN. This means that YOLO v3 can operate in real time with a high MAP of 80.17%. The YOLO v3 algorithm also performed better in the comparison of difficult sample detection results. In contrast, SSD did not achieve the highest score in terms of either MAP or FPS. Conclusion: Our study shows that YOLO v3 has an advantage in detection speed while maintaining a reasonable MAP and thus can be applied to real-time pill identification in a hospital pharmacy environment.


Sensors ◽  
2020 ◽  
Vol 20 (13) ◽  
pp. 3630 ◽  
Author(s):  
Young-Joon Hwang ◽  
Jin-Gu Lee ◽  
Un-Chul Moon ◽  
Ho-Hyun Park

The single shot multi-box detector (SSD) exhibits low accuracy in small-object detection; this is because it does not consider the scale contextual information between its layers, and the shallow layers lack adequate semantic information. To improve the accuracy of the original SSD, this paper proposes a new single shot multi-box detector using trident feature and squeeze and extraction feature fusion (SSD-TSEFFM); this detector employs the trident network and the squeeze and excitation feature fusion module. Furthermore, a trident feature module (TFM) is developed, inspired by the trident network, to consider the scale contextual information. The use of this module makes the proposed model robust to scale changes owing to the application of dilated convolution. Further, the squeeze and excitation block feature fusion module (SEFFM) is used to provide more semantic information to the model. The SSD-TSEFFM is compared with the faster regions with convolution neural network features (RCNN) (2015), SSD (2016), and DF-SSD (2020) on the PASCAL VOC 2007 and 2012 datasets. The experimental results demonstrate the high accuracy of the proposed model in small-object detection, in addition to a good overall accuracy. The SSD-TSEFFM achieved 80.4% mAP and 80.2% mAP on the 2007 and 2012 datasets, respectively. This indicates an average improvement of approximately 2% over other models.
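To make the role of dilated convolution concrete, here is a small one-dimensional sketch. It is not the TFM itself: the function names and the dilation rates 1, 2, and 3 are illustrative assumptions. It shows that a kernel of size k with dilation d spans k + (k - 1)(d - 1) input samples, so the same three weights see progressively wider context.

```python
def effective_kernel_size(kernel_size, dilation):
    """Input span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def dilated_conv1d(signal, kernel, dilation):
    """Valid (no padding) 1-D dilated cross-correlation over a list of numbers."""
    span = effective_kernel_size(len(kernel), dilation)
    output = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart.
        total = sum(kernel[j] * signal[start + j * dilation]
                    for j in range(len(kernel)))
        output.append(total)
    return output

signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]  # the same weights are reused by every branch
spans = [effective_kernel_size(len(kernel), d) for d in (1, 2, 3)]
branch_outputs = {d: dilated_conv1d(signal, kernel, d) for d in (1, 2, 3)}
```

Each branch uses the same number of parameters, which is why dilation is a cheap way to vary the receptive field across scales.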


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Jinling Li ◽  
Qingshan Hou ◽  
Jinsheng Xing

Multiobject detection in complex scenes has become an important research topic and is the basis of other computer vision tasks. The traditional single shot multibox detector (SSD) algorithm has defects such as poor small-object detection, reliance on manual settings for default box generation, and insufficient semantic information in the low detection layers, so its detection performance in complex scenes is not ideal. To address these shortcomings, an improved algorithm based on the adaptive default box (ADB) mechanism is proposed. The adaptive default box mechanism reduces the imbalance between positive and negative samples and avoids manually setting default box hyperparameters. Experimental results show that, compared with the traditional SSD algorithm, the improved algorithm achieves better detection and higher accuracy in complex scenes.
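The link between default boxes and sample imbalance can be sketched as follows. This is not the ADB mechanism, which the abstract does not specify. It assumes the common SSD-style rule that a default box counts as a positive sample when its intersection over union (IoU) with a ground-truth box exceeds a threshold. The function names, the 0.5 threshold, and the box coordinates are all illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

def label_default_boxes(default_boxes, ground_truths, threshold=0.5):
    """Mark a default box positive if its best IoU with any ground truth exceeds the threshold."""
    labels = []
    for box in default_boxes:
        best = max((iou(box, gt) for gt in ground_truths), default=0.0)
        labels.append(best > threshold)
    return labels

ground_truths = [(0, 0, 10, 10)]
default_boxes = [(0, 0, 10, 10), (5, 5, 15, 15), (0, 0, 10, 8), (20, 20, 30, 30)]
labels = label_default_boxes(default_boxes, ground_truths)
positives = sum(labels)
negatives = len(labels) - positives
```

Default boxes whose shapes fit the objects poorly produce few positives, which is one reason hand-set box parameters can worsen the positive-to-negative ratio.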


2020 ◽  
Vol 2020 (13) ◽  
pp. 607-614
Author(s):  
Fei Rong ◽  
Li Shasha ◽  
Xu Qingzheng ◽  
Liu Kun

2019 ◽  
Vol 9 (22) ◽  
pp. 4836 ◽  
Author(s):  
Wenxu Shi ◽  
Jinhong Jiang ◽  
Shengli Bao ◽  
Dailun Tan

The ability to detect small targets and the speed of the target detector are very important for remote sensing image detection. In this paper, we propose an effective and efficient method (named CISPNet) with high detection accuracy and a compact architecture. In particular, according to the characteristics of the data, we apply a context information scene perception (CISP) module to obtain contextual information for targets of different scales and use k-means clustering to set the aspect ratios and sizes of the default boxes. The proposed method inherits the network structure of Single Shot MultiBox Detector (SSD) and introduces the CISP module into it. We create a dataset in the Pascal Visual Object Classes (VOC) format, annotated with three types of detection targets: aircraft, ship, and oil tanker. Experimental results on our remote sensing image dataset as well as the Northwestern Polytechnical University very-high-resolution (NWPU VHR-10) dataset demonstrate that the proposed CISPNet performs much better than the original SSD and other detectors, especially for small objects. Specifically, our network achieves 80.34% mean average precision (mAP) at 50.7 frames per second (FPS) with an input size of 300 × 300 pixels on the remote sensing image dataset. In extended experiments, CISPNet also outperforms SSD in fuzzy target detection in remote sensing images.
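The k-means step can be sketched roughly as below. The abstract does not state the distance metric or initialisation, so this sketch assumes plain Euclidean distance on (width, height) pairs and a deterministic initialisation. Some detectors cluster with an IoU-based distance instead. The function name `kmeans_boxes` and the sample box sizes are hypothetical.

```python
def kmeans_boxes(sizes, k, iterations=20):
    """Cluster (width, height) pairs with plain k-means (Euclidean distance)."""
    # Deterministic initialisation: k evenly spaced samples from the sorted list.
    ordered = sorted(sizes)
    step = len(ordered) / k
    centers = [ordered[int(i * step)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for w, h in sizes:
            distances = [(w - cw) ** 2 + (h - ch) ** 2 for cw, ch in centers]
            clusters[distances.index(min(distances))].append((w, h))
        # Each center becomes the mean of its members; empty clusters stay put.
        centers = [
            (sum(w for w, _ in members) / len(members),
             sum(h for _, h in members) / len(members)) if members else center
            for members, center in zip(clusters, centers)
        ]
    return centers

# Hypothetical annotated box sizes in pixels: three small and three wide objects.
sizes = [(10, 10), (12, 10), (11, 12), (40, 20), (42, 22), (38, 18)]
centers = kmeans_boxes(sizes, k=2)
# Convert each cluster center into a default-box size and aspect ratio.
default_boxes = [{"size": (w * h) ** 0.5, "aspect_ratio": w / h} for w, h in centers]
```

The resulting centers give data-driven sizes and aspect ratios that could replace hand-picked default box settings.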


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Xiaoguo Zhang ◽  
Ye Gao ◽  
Fei Ye ◽  
Qihan Liu ◽  
Kaixin Zhang

SSD (Single Shot MultiBox Detector) is one of the best object detection algorithms and provides highly accurate object detection in real time. However, SSD performs relatively poorly on small objects because its shallow prediction layer, which is responsible for detecting small objects, lacks sufficient semantic information. To overcome this problem, this paper proposes SKIPSSD, an improved SSD with a novel skip connection of multiscale feature maps, which enhances the semantic information and the details of the prediction layers by fusing high-level and low-level feature maps through skip connections. For the fusion itself, we design two feature fusion modules and multiple fusion strategies to improve the SSD detector’s sensitivity and perception ability. Experimental results on the PASCAL VOC2007 test set demonstrate that SKIPSSD significantly improves detection performance and outperforms many state-of-the-art object detectors. With an input size of 300 × 300, SKIPSSD achieves 79.0% mAP (mean average precision) at 38.7 FPS (frames per second) on a single 1080 GPU, 1.8 percentage points higher than the mAP of SSD, while retaining real-time detection speed.
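The abstract does not describe the two fusion modules, so the following is only a generic sketch of fusing a high-level map into a low-level one. It assumes nearest-neighbour upsampling followed by element-wise addition, which is one common choice and not necessarily the one SKIPSSD uses. The function names and toy feature maps are hypothetical.

```python
def upsample_nearest(feature_map, factor=2):
    """Nearest-neighbour upsampling of a 2-D map given as a list of rows."""
    upsampled = []
    for row in feature_map:
        # Repeat each value horizontally, then repeat the widened row vertically.
        wide_row = [value for value in row for _ in range(factor)]
        upsampled.extend([list(wide_row) for _ in range(factor)])
    return upsampled

def fuse(low_level, high_level, factor=2):
    """Add an upsampled high-level map to a low-level map of matching size."""
    upsampled = upsample_nearest(high_level, factor)
    if len(upsampled) != len(low_level) or len(upsampled[0]) != len(low_level[0]):
        raise ValueError("feature maps do not align after upsampling")
    return [[a + b for a, b in zip(low_row, up_row)]
            for low_row, up_row in zip(low_level, upsampled)]

low = [[1, 1, 1, 1] for _ in range(4)]  # 4 x 4 shallow map with fine detail
high = [[10, 20], [30, 40]]             # 2 x 2 deep map with coarse semantics
fused = fuse(low, high)
```

After fusion, every position in the shallow map carries both its own detail and the coarse value from the deeper layer that covers it.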

