Multi-Scale Vehicle Detection for Foreground-Background Class Imbalance with Improved YOLOv2

Sensors ◽  
2019 ◽  
Vol 19 (15) ◽  
pp. 3336 ◽  
Author(s):  
Zhongyuan Wu ◽  
Jun Sang ◽  
Qian Zhang ◽  
Hong Xiang ◽  
Bin Cai ◽  
...  

Vehicle detection is a challenging task in computer vision, and numerous vehicle detection methods have been proposed in recent years. Detection performance suffers because vehicles in a scene vary in size and because vehicles and background are imbalanced. To improve vehicle detection performance, this paper proposes a multi-scale vehicle detection method based on an improved YOLOv2. The main contributions are: (1) a new anchor box generation method, Rk-means++, which adapts better to vehicles of varying sizes and enables multi-scale detection; (2) the introduction of Focal Loss into YOLOv2 for vehicle detection, which reduces the negative influence on training caused by the imbalance between vehicles and background. Experimental results on the Beijing Institute of Technology (BIT)-Vehicle public dataset demonstrate that the proposed method achieves better vehicle localization and recognition performance than other existing methods.
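As a rough illustration of why Focal Loss helps with an imbalance between vehicles and background, the sketch below implements the commonly used binary form FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The function name, the defaults gamma = 2 and alpha = 0.25, and the single-probability interface are assumptions made for illustration; the abstract does not state how the loss was integrated into YOLOv2.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one prediction (illustrative sketch, not the paper's code).

    p     -- predicted probability of the foreground (vehicle) class, in [0, 1]
    y     -- ground-truth label: 1 for vehicle, 0 for background
    gamma -- focusing parameter; gamma = 0 reduces to weighted cross-entropy
    alpha -- weight of the foreground class; background gets 1 - alpha
    """
    # p_t is the probability the model assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0) for fully confident wrong predictions.
    p_t = min(max(p_t, 1e-12), 1.0)
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy background example (confidently rejected) versus a hard one
# (background mistaken for a vehicle).
easy_background = binary_focal_loss(p=0.01, y=0)
hard_background = binary_focal_loss(p=0.90, y=0)
print(easy_background, hard_background)
```

With these settings the easy background example contributes several orders of magnitude less loss than the hard one, so the many trivially classified background regions no longer dominate training.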

2019 ◽  
Vol 11 (5) ◽  
pp. 531 ◽  
Author(s):  
Yuanyuan Wang ◽  
Chao Wang ◽  
Hong Zhang ◽  
Yingbo Dong ◽  
Sisi Wei

Because it is independent of daylight and weather conditions, synthetic aperture radar (SAR) imagery is widely applied to ship detection in marine surveillance. Ships appear at multiple scales in SAR imagery because of multi-resolution imaging modes and the ships' varied shapes. Conventional ship detection methods depend heavily on statistical models of sea clutter or on extracted features, and their robustness needs to be strengthened. The RetinaNet object detector, a deep learning model that learns representations automatically, is proposed to overcome this obstacle. First, feature pyramid networks (FPN) are used to extract multi-scale features for both ship classification and localization. Then, focal loss is used to address the class imbalance and to increase the importance of hard examples during training. Eighty-six scenes of Chinese Gaofen-3 imagery at four resolutions (3 m, 5 m, 8 m, and 10 m) are used to evaluate the approach. Two Gaofen-3 images and one Constellation of Small Satellite for Mediterranean basin Observation (Cosmo-SkyMed) image are used to evaluate robustness. The experimental results reveal that (1) RetinaNet detects multi-scale ships efficiently and with high accuracy; (2) compared with other object detectors, RetinaNet achieves a mean average precision (mAP) of more than 96%. These results demonstrate the effectiveness of the proposed method.


2021 ◽  
Vol 13 (5) ◽  
pp. 847
Author(s):  
Wei Huang ◽  
Guanyi Li ◽  
Qiqiang Chen ◽  
Ming Ju ◽  
Jiantao Qu

With the development of remote sensing, target detection in remote sensing images is attracting increasing interest. Unlike natural image processing, remote sensing image processing must deal with large variations in object size, which poses a great challenge to researchers. Although traditional multi-scale detection networks have been successful in handling such variations, they still have certain limitations: (1) Traditional multi-scale detection methods consider the scale of features but ignore the correlation between feature levels. Each feature map is represented by a single layer of the backbone network, so the extracted features are not comprehensive enough. For example, the SSD network uses the features extracted from the backbone network at different scales directly for detection, which loses a large amount of contextual information. (2) These methods rely on existing classification backbones to perform detection. RetinaNet, for instance, simply combines the ResNet-101 classification network with an FPN to perform detection; however, object classification and object detection are different tasks. To address these issues, a cross-scale feature fusion pyramid network (CF2PN) is proposed. First, a cross-scale fusion module (CSFM) is introduced to extract sufficiently comprehensive semantic information from the features for multi-scale fusion. Next, a feature pyramid for target detection built from thinning U-shaped modules (TUMs) performs multi-level fusion of the features. Finally, a focal loss in the prediction section is used to control the large number of negative samples generated during feature fusion. The proposed network architecture is verified on the DIOR and RSOD datasets. The experimental results show that the method improves performance by 2–12% on the DIOR and RSOD datasets compared with current SOTA target detection methods.


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1906
Author(s):  
Jia-Zheng Jian ◽  
Tzong-Rong Ger ◽  
Han-Hua Lai ◽  
Chi-Ming Ku ◽  
Chiung-An Chen ◽  
...  

Diverse computer-aided diagnosis systems based on convolutional neural networks have been applied to automate the detection of myocardial infarction (MI) in electrocardiograms (ECG) for early diagnosis and prevention. However, some issues, particularly overfitting and underfitting, have not been taken into account. In other words, it is unclear whether the network structure is too simple or too complex. To address this, the proposed models were developed by starting with the simplest structure: a multi-lead features-concatenate narrow network (N-Net) in which each lead branch includes only two convolutional layers. Additionally, multi-scale features-concatenate networks (MSN-Net) were implemented, in which larger features are extracted by pooling the signals. The best structure was obtained by tuning both the number of filters in the convolutional layers and the number of input signal scales. As a result, the N-Net reached an accuracy of 95.76% in the MI detection task, whereas the MSN-Net reached an accuracy of 61.82% in the MI locating task. Both networks give a higher average accuracy than the state of the art, with a statistically significant difference (p < 0.001, U test). The models are also smaller and are thus suitable for wearable devices used for offline monitoring. In conclusion, testing across both simple and complex network structures is indispensable. However, how to deal with the class imbalance problem and the quality of the extracted features remain to be discussed.


AMB Express ◽  
2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Marcelo dos Santos Barbosa ◽  
Iara Beatriz Andrade de Sousa ◽  
Simone Simionatto ◽  
Sibele Borsuk ◽  
Silvana Beutinger Marchioro

Abstract. Current methods for preventing the transmission of Mycobacterium leprae, the causative agent of leprosy, are inadequate, as suggested by the rate of new leprosy cases reported. Simple large-scale detection methods for M. leprae infection are crucial for early detection of leprosy and for disease control. The present study investigates the production and seroreactivity of a recombinant polypeptide composed of various M. leprae protein epitopes. The structural and physicochemical parameters of this construct were assessed using in silico tools. Parameters such as subcellular localization, presence of a signal peptide, primary, secondary, and tertiary structures, and the 3D model were determined using several bioinformatics tools. The resulting purified recombinant polypeptide, designated rMLP15, is composed of 15 peptides from six selected M. leprae proteins (ML1358, ML2055, ML0885, ML1811, ML1812, and ML1214) that induce T cell reactivity in leprosy patients from different hyperendemic regions. Using rMLP15 as the antigen, sera from 24 positive patients and 14 healthy controls were evaluated for reactivity via ELISA. ELISA-rMLP15 was able to diagnose 79.17% of leprosy patients with a specificity of 92.86%. rMLP15 also detected multibacillary and paucibacillary patients in the same proportions, a desirable property for leprosy diagnosis. In summary, these results indicate the utility of the recombinant protein rMLP15 in the diagnosis of leprosy and in the future development of a viable screening test.
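The reported percentages can be related to the sample sizes with a short calculation. The sketch below computes sensitivity and specificity from confusion-matrix counts. The counts of 19 true positives and 13 true negatives are not stated in the abstract; they are inferred here as the only whole-number counts out of 24 patients and 14 controls that round to 79.17% and 92.86%, and the function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased subjects that the test flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of healthy subjects that the test flags as negative."""
    return true_neg / (true_neg + false_pos)


# Sample sizes from the abstract: 24 leprosy patients, 14 healthy controls.
# The split into correct and incorrect calls is an inference, not a reported figure.
patients, controls = 24, 14
assumed_true_pos = 19   # assumption: 19 of 24 patients reactive
assumed_true_neg = 13   # assumption: 13 of 14 controls non-reactive

sens_pct = round(100 * sensitivity(assumed_true_pos, patients - assumed_true_pos), 2)
spec_pct = round(100 * specificity(assumed_true_neg, controls - assumed_true_neg), 2)
print(sens_pct, spec_pct)
```

Under these assumed counts, a single additional misclassified subject would shift sensitivity by about 4 percentage points and specificity by about 7, which shows how coarse the estimates are at this sample size.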


2021 ◽  
Vol 21 (S2) ◽  
Author(s):  
Daobin Huang ◽  
Minghui Wang ◽  
Ling Zhang ◽  
Haichun Li ◽  
Minquan Ye ◽  
...  

Abstract. Background: Accurately segmenting the tumor region in MRI images is important for brain tumor diagnosis and radiotherapy planning. At present, manual segmentation is widely adopted in clinical practice, and there is a strong need for an automatic and objective system to alleviate the workload of radiologists. Methods: We propose a parallel multi-scale feature fusing architecture to generate rich feature representations for accurate brain tumor segmentation. It comprises two parts: (1) a Feature Extraction Network (FEN) that extracts brain tumor features at different levels and (2) a Multi-scale Feature Fusing Network (MSFFN) that merges the features of all scales in parallel. In addition, we use two hybrid loss functions to optimize the proposed network for the class imbalance issue. Results: We validate our method on BRATS 2015, obtaining Dice scores of 0.86, 0.73 and 0.61 for the three tumor regions (complete, core and enhancing), with a model parameter size of only 6.3 MB. Without any post-processing operations, our method still outperforms published state-of-the-art methods on the segmentation of the complete tumor region and obtains competitive performance in the other two regions. Conclusions: The proposed parallel structure can effectively fuse multi-level features to generate rich feature representations for high-resolution results. Moreover, the hybrid loss functions can alleviate the class imbalance issue and guide the training process. The proposed method can be used in other medical segmentation tasks.
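For readers unfamiliar with the Dice score quoted in the results, the following minimal sketch computes it for two binary masks as 2|A ∩ B| / (|A| + |B|). The flat-list mask representation, the function name, and the convention of returning 1.0 when both masks are empty are assumptions for illustration; the abstract does not describe the evaluation code.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as tumor in both the prediction and the ground truth.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total tumor voxels in each mask, added together.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Toy example: the prediction marks two voxels, only one of which is tumor.
predicted_mask = [1, 1, 0, 0]
reference_mask = [1, 0, 0, 0]
score = dice_coefficient(predicted_mask, reference_mask)
print(score)
```

Because the score depends only on tumor voxels, it is not inflated by the large number of correctly labelled background voxels, which is why it is a common choice for class-imbalanced segmentation tasks.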


2018 ◽  
Vol 159 ◽  
pp. 742-753 ◽  
Author(s):  
Li Wenhan ◽  
Lu Kailiang ◽  
Li He ◽  
Cui Hongliang ◽  
Li Xiu
