Object detection method on station logo with single shot multi-box detector

2020 ◽  
Vol 2020 (13) ◽  
pp. 607-614
Author(s):  
Fei Rong ◽  
Li Shasha ◽  
Xu Qingzheng ◽  
Liu Kun
2019 ◽  
Vol 9 (14) ◽  
pp. 2785 ◽  
Author(s):  
Yun Jiang ◽  
Tingting Peng ◽  
Ning Tan

The Single Shot MultiBox Detector (SSD) has achieved good results in object detection, but it suffers from problems such as insufficient understanding of context information and loss of features in deep layers. To alleviate these problems, we propose a single-shot object detection network, Context Perception-SSD (CP-SSD). CP-SSD improves the network’s understanding of context information by using context information scene perception modules, which capture context information for objects of different scales. On the deep-layer feature maps, a semantic activation module uses self-supervised learning to adjust the context feature information and channel interdependence and to enhance useful semantic information. CP-SSD was validated on the benchmark dataset PASCAL VOC 2007. The experimental results show that the mean Average Precision (mAP) of CP-SSD reaches 77.8%, which is 0.6 percentage points higher than that of SSD, and that detection is significantly improved in images where the object is difficult to distinguish from the background.
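Adjusting channel interdependence, as described above, is often done with squeeze-and-excitation style gating. The following is a minimal NumPy sketch under that assumption. The abstract does not specify the internals of the semantic activation module, so this is not necessarily what CP-SSD does; `channel_gate`, `w1`, and `w2` are hypothetical names used only for illustration.

```python
import numpy as np


def channel_gate(feature_map, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    Assumption: squeeze-and-excitation style gating, not necessarily the
    module used in CP-SSD. w1 has shape (C // r, C); w2 has shape (C, C // r).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = feature_map.mean(axis=(1, 2))
    # Excite: a small bottleneck models interdependence between channels.
    hidden = np.maximum(w1 @ squeezed, 0.0)  # ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid, each gate in (0, 1)
    # Scale every channel by its gate.
    return feature_map * gates[:, None, None], gates


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 4))  # toy deep-layer feature map
w1 = rng.standard_normal((2, 8)) * 0.1     # hypothetical learned weights
w2 = rng.standard_normal((8, 2)) * 0.1
gated, gates = channel_gate(features, w1, w2)
print(gated.shape, gates.round(3))
```

In a trained network the weights would be learned, so that informative channels receive gates close to 1 and less useful ones are suppressed.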


Symmetry ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1718
Author(s):  
Chien-Hsing Chou ◽  
Yu-Sheng Su ◽  
Che-Ju Hsu ◽  
Kong-Chang Lee ◽  
Ping-Hsuan Han

In this study, we designed a four-dimensional (4D) audiovisual entertainment system called Sense. This system comprises a scene recognition system and hardware modules that provide haptic sensations for users when they watch movies and animations at home. In the scene recognition system, we used Google Cloud Vision to detect common scene elements in a video, such as fire, explosions, wind, and rain, and to further determine whether the scene depicts hot weather, rain, or snow. Additionally, for animated videos, we applied deep learning with a single shot multibox detector to detect whether the video contained scenes with fire-related objects. The hardware module was designed to provide six types of haptic sensations, arranged with line symmetry to provide a better user experience. Based on the object detection results from the scene recognition system, the system generates the corresponding haptic sensations. The system integrates deep learning, auditory signals, and haptic sensations to provide an enhanced viewing experience.


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1240
Author(s):  
Yang Liu ◽  
Hailong Su ◽  
Cao Zeng ◽  
Xiaoli Li

In complex scenes, it is a huge challenge to accurately detect motion-blurred, tiny, and dense objects in thermal infrared images. To solve this problem, a robust thermal infrared vehicle and pedestrian detection method is proposed in this paper. First, a weight parameter β is introduced to reconstruct the loss function of the feature selective anchor-free (FSAF) module in its online feature selection process, and the FSAF module is optimized to enhance the detection performance on motion-blurred objects. The parameter β provides an effective solution to the challenge of motion-blurred object detection. Then, the optimized anchor-free branches of the FSAF module are plugged into the YOLOv3 single-shot detector and work jointly with the anchor-based branches of the YOLOv3 detector in both training and inference, which efficiently improves the detection precision of the detector for tiny and dense objects. Experimental results show that the proposed method outperforms other typical thermal infrared vehicle and pedestrian detection algorithms, achieving a mean average precision (mAP) of 72.2%.
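In the FSAF module, online feature selection assigns each object to the feature-pyramid level on which its combined loss is smallest. The sketch below illustrates that selection step in plain Python. The abstract does not say how β enters the loss, so using it as a weight on the regression term is purely an illustrative assumption, and `select_pyramid_level` and the loss values are hypothetical.

```python
def select_pyramid_level(cls_losses, reg_losses, beta=1.0):
    """Return the index of the pyramid level with the smallest combined loss.

    Assumption: beta scales the regression term. The paper's actual
    formulation of the beta-weighted loss may differ.
    """
    combined = [c + beta * r for c, r in zip(cls_losses, reg_losses)]
    return min(range(len(combined)), key=combined.__getitem__)


# Hypothetical per-level losses for one object instance.
cls_losses = [0.9, 0.5, 0.8]  # classification loss on each level
reg_losses = [0.1, 1.0, 0.4]  # box regression loss on each level

level_cls_only = select_pyramid_level(cls_losses, reg_losses, beta=0.0)
level_weighted = select_pyramid_level(cls_losses, reg_losses, beta=1.0)
print(level_cls_only, level_weighted)
```

With these toy numbers the selected level changes once the regression term is weighted in, which shows how a single weight can steer the level an object is trained on.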


2021 ◽  
Vol 11 (9) ◽  
pp. 3782
Author(s):  
Chu-Hui Lee ◽  
Chen-Wei Lin

Object detection is one of the important technologies in the field of computer vision. In the area of fashion apparel, object detection technology has various applications, such as apparel recognition, apparel detection, fashion recommendation, and online search. The recognition task is difficult for a computer because fashion apparel images vary widely in clothing appearance and material. Currently, fast and accurate object detection is the most important goal in this field. In this study, we propose a two-phase fashion apparel detection method named YOLOv4-TPD (YOLOv4 Two-Phase Detection), based on the YOLOv4 algorithm, to address this challenge. The detection targets were divided into five categories: jacket, top, pants, skirt, and bag. Following the definition of inductive transfer learning, the aim was to transfer knowledge from the source domain to the target domain in order to improve task performance in the target domain. Therefore, we used a two-phase training method to implement the transfer learning. Finally, the experimental results showed that, with two-phase transfer learning, the mAP of our model was better than that of the original YOLOv4 model. The proposed model has multiple potential applications, such as automatic labeling systems, style retrieval, and similarity detection.


2021 ◽  
Vol 7 (4) ◽  
pp. 64
Author(s):  
Tanguy Ophoff ◽  
Cédric Gullentops ◽  
Kristof Van Beeck ◽  
Toon Goedemé

Object detection models are usually trained and evaluated on highly complicated, challenging academic datasets, which results in deep networks that require a large amount of computation. However, many operational use cases are more constrained: they have a limited number of classes to detect, less intra-class variance, less lighting and background variance, constrained or even fixed camera viewpoints, etc. In these cases, we hypothesize that smaller networks could be used without deteriorating the accuracy. However, there are multiple reasons why this does not happen in practice. Firstly, overparameterized networks tend to learn better, and secondly, transfer learning is usually used to reduce the necessary amount of training data. In this paper, we investigate how much we can reduce the computational complexity of a standard object detection network in such constrained object detection problems. As a case study, we focus on a well-known single-shot object detector, YoloV2, and combine three different techniques to reduce the computational complexity of the model without reducing its accuracy on our target dataset. To investigate the influence of the problem complexity, we compare two datasets: a prototypical academic dataset (Pascal VOC) and a real-life operational dataset (LWIR person detection). The three optimization steps we applied are swapping all convolutions for depth-wise separable convolutions, pruning, and weight quantization. The results of our case study substantiate our hypothesis that the more constrained a problem is, the more the network can be optimized. On the constrained operational dataset, combining these optimization techniques allowed us to reduce the computational complexity by a factor of 349, compared with only a factor of 9.8 on the academic dataset. When running a benchmark on an Nvidia Jetson AGX Xavier, our fastest model runs more than 15 times faster than the original YoloV2 model, whilst increasing the accuracy by 5% Average Precision (AP).
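To make the first optimization step concrete, the sketch below counts multiply-accumulate operations (MACs) for a standard convolution and for its depth-wise separable replacement. The layer dimensions are hypothetical and not taken from the paper, and the count assumes stride 1, equal input and output resolution, and no bias terms.

```python
def standard_conv_macs(height, width, c_in, c_out, kernel):
    """MACs of a standard convolution producing a height x width output."""
    return height * width * c_in * c_out * kernel * kernel


def separable_conv_macs(height, width, c_in, c_out, kernel):
    """MACs of a depth-wise convolution followed by a 1x1 point-wise one."""
    depthwise = height * width * c_in * kernel * kernel  # one filter per channel
    pointwise = height * width * c_in * c_out            # 1x1 channel mixing
    return depthwise + pointwise


# Hypothetical layer: 13x13 output, 512 -> 1024 channels, 3x3 kernel.
layer = dict(height=13, width=13, c_in=512, c_out=1024, kernel=3)
reduction = standard_conv_macs(**layer) / separable_conv_macs(**layer)
print(f"{reduction:.2f}x fewer multiply-accumulates")
```

The reduction equals 1 / (1 / c_out + 1 / kernel²), so for a 3x3 kernel it stays just under 9 per layer. The much larger overall factors reported in the abstract come from combining this step with pruning and weight quantization.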


2021 ◽  
Author(s):  
Yu Wang ◽  
Ye Zhang ◽  
Shaohua Zhai ◽  
Hao Chen ◽  
Shaoqi Shi ◽  
...  
