IoU Regression with H+L-Sampling for Accurate Detection Confidence

Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4433
Author(s):  
Dong Wang ◽  
Huaming Wu

It is a common paradigm in object detection frameworks that the training and testing samples have consistent distributions for the two main tasks: classification and bounding box regression. This paradigm is popular in sampling strategies for training object detectors because it is intuitive and practical. For the task of localization quality estimation, there are two ways of sampling: reusing the sampling of the main tasks, or uniform sampling by manually augmenting the ground truth. The first method is simple but inconsistent for the task of quality estimation. The second method covers all IoU levels but is more complex and harder to train. In this paper, we propose an H+L-Sampling strategy, which selects high and low IoU samples simultaneously, to train the quality estimation branch simply and effectively. This strategy inherits the effectiveness of consistent sampling and reduces the training difficulty of uniform sampling. Finally, we introduce accurate detection confidence, which combines the classification probability and the localization accuracy, as the ranking key of NMS. Extensive experiments show the effectiveness of our method in solving the misalignment between classification confidence and localization accuracy and in improving detection performance.
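As a rough illustration of ranking NMS by a combined confidence, the sketch below multiplies a classification probability by a predicted IoU and uses the result as the sort key of a plain greedy NMS. The product rule, the function names, and the sample numbers are assumptions made for this example; the abstract does not state how the two terms are combined.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def detection_confidence(cls_prob: float, pred_iou: float) -> float:
    """Hypothetical combination rule: a simple product (an assumption)."""
    return cls_prob * pred_iou


def nms(boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score, keep those not overlapping a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Made-up detections: box 0 has the higher class score, box 1 the better predicted IoU.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
cls_probs = [0.9, 0.8, 0.7]
pred_ious = [0.5, 0.9, 0.8]
confidences = [detection_confidence(c, q) for c, q in zip(cls_probs, pred_ious)]

print(nms(boxes, cls_probs))    # [0, 2]
print(nms(boxes, confidences))  # [1, 2]
```

With the classification score alone, the better-localized box 1 is suppressed by box 0; with the combined key the order flips, which is the kind of misalignment the abstract describes.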

2019 ◽  
Vol 11 (3) ◽  
pp. 286 ◽  
Author(s):  
Jiangqiao Yan ◽  
Hongqi Wang ◽  
Menglong Yan ◽  
Wenhui Diao ◽  
Xian Sun ◽  
...  

Recently, methods based on the Faster region-based convolutional neural network (R-CNN) have been popular in multi-class object detection in remote sensing images due to their outstanding detection performance. These methods generally propose candidate regions of interest (ROIs) through a region proposal network (RPN), and the regions with high enough intersection-over-union (IoU) values against ground truth are treated as positive samples for training. In this paper, we find that the detection results of such methods are sensitive to the choice of IoU threshold. Specifically, detection performance on small objects is poor when a typical, higher threshold is chosen, while a lower threshold results in poor localization accuracy caused by a large number of false positives. To address these issues, we propose a novel IoU-Adaptive Deformable R-CNN framework for multi-class object detection. Specifically, by analyzing the different roles that IoU can play in different parts of the network, we propose an IoU-guided detection framework to reduce the loss of small object information during training. In addition, an IoU-based weighted loss is designed, which learns the IoU information of positive ROIs to improve detection accuracy effectively. Finally, class aspect ratio constrained non-maximum suppression (CARC-NMS) is proposed, which further improves the precision of the results. Extensive experiments validate the effectiveness of our approach, and we achieve state-of-the-art detection performance on the DOTA dataset.
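The threshold sensitivity can be seen with a small numeric sketch: the same 3-pixel offset costs a small box far more IoU than a large one, so a fixed positive threshold drops the small-object sample. The box sizes, the offset, and the helper `label_rois` are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_rois(rois: Sequence[Box], gt_boxes: Sequence[Box], pos_threshold: float) -> List[bool]:
    """Mark an ROI positive if its best IoU with any ground-truth box reaches the threshold."""
    return [
        max((iou(roi, gt) for gt in gt_boxes), default=0.0) >= pos_threshold
        for roi in rois
    ]


# One small (10x10) and one large (100x100) object; each ROI is shifted by 3 pixels.
gt_boxes = [(0, 0, 10, 10), (200, 200, 300, 300)]
rois = [(3, 3, 13, 13), (203, 203, 303, 303)]

print(label_rois(rois, gt_boxes, pos_threshold=0.5))  # [False, True]
print(label_rois(rois, gt_boxes, pos_threshold=0.3))  # [True, True]
```

Lowering the threshold recovers the small-object sample, but in practice it also admits more loosely localized proposals, which is the trade-off the abstract points to.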


2020 ◽  
Vol 28 (1) ◽  
pp. 81-96
Author(s):  
José Miguel Buenaposada ◽  
Luis Baumela

In recent years we have witnessed significant progress in the performance of object detection in images. This advance stems from the use of rich discriminative features produced by deep models and the adoption of new training techniques. Although these techniques have been extensively used in the mainstream deep learning-based models, it is still an open issue to analyze their impact in alternative, and computationally more efficient, ensemble-based approaches. In this paper we evaluate the impact of the adoption of data augmentation, bounding box refinement and multi-scale processing in the context of multi-class Boosting-based object detection. In our experiments we show that use of these training advancements significantly improves the object detection performance.


Author(s):  
K. Kamiya ◽  
T. Fuse ◽  
M. Takahashi

Since satellite and aerial imagery has recently become widespread and is frequently acquired, combining the two is expected to let them complement each other in spatial and temporal resolution. One prospective application is traffic monitoring, where the objects of interest, namely vehicles, need to be recognized automatically. Techniques that employ object detection before object recognition can save computational time and cost, and thus play a significant role. However, little is known about whether object detection methods perform well on satellite and aerial imagery. In addition, it remains to be studied how the characteristics of satellite and aerial imagery affect object detection performance. This study employs the binarized normed gradients (BING) method, which runs very fast and is robust to rotation and noise. For our experiments, 11-bit BGR-IR satellite imagery from WorldView-3 and BGR color aerial imagery are used, and we create thousands of ground truth samples. We conducted several experiments to compare performance across different images, to verify whether combining images of different resolutions improves performance, and to analyze the applicability of mixing satellite and aerial imagery. The results showed that the infrared band had little effect on the detection rate, that 11-bit images performed worse than 8-bit images, and that better spatial resolution brought better performance. Another result suggests that mixing higher and lower resolution images in the training dataset could help detection performance. Furthermore, we found that aerial images improved the detection performance on satellite images.


Electronics ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 76
Author(s):  
Jongsub Yu ◽  
Hyukdoo Choi

This paper presents an object detector with depth estimation using monocular camera images. Previous detection studies have typically focused on detecting objects with 2D or 3D bounding boxes. A 3D bounding box consists of the center point, its size parameters, and heading information. However, predicting such a complex output generally lowers a model's performance, and it is not necessary for risk assessment in autonomous driving. We focus on predicting a single depth per object, which is essential for risk assessment in autonomous driving. Our network architecture is based on YOLO v4, a fast and accurate one-stage object detector. We added a channel to the output layer for depth estimation. To train depth prediction, we extract the closest depth from the 3D bounding box coordinates of the ground truth labels in the dataset. Our model is compared with the latest studies on 3D object detection using the KITTI object detection benchmark. We show that our model achieves higher detection performance and detection speed than existing models, with comparable depth accuracy.
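One way to derive a single depth target from a 3D box label is sketched below. It assumes a KITTI-style camera frame (x right, y down, z forward) with a heading angle about the vertical axis, and it reads "closest depth" as the smallest z over the box corners. Both the interpretation and the function name `closest_depth` are assumptions for this example, not details confirmed by the abstract.

```python
import math


def closest_depth(center_z: float, length: float, width: float, rotation_y: float) -> float:
    """Smallest forward distance (z) over the corners of a 3D box.

    Assumes the box rotates only about the vertical axis, so the box height
    and the lateral position of the center do not change any corner's z.
    """
    sin_r, cos_r = math.sin(rotation_y), math.cos(rotation_y)
    depths = []
    for dx in (-length / 2.0, length / 2.0):      # offsets along the box length
        for dz in (-width / 2.0, width / 2.0):    # offsets along the box width
            # z component of the rotated corner offset, added to the center depth
            depths.append(center_z - sin_r * dx + cos_r * dz)
    return min(depths)


# A 4 m x 2 m box centered 10 m ahead, aligned with the assumed x axis.
print(closest_depth(center_z=10.0, length=4.0, width=2.0, rotation_y=0.0))  # 9.0
```

Rotating the same box by 90 degrees brings its long side toward the camera, so the closest depth drops to about 8 m.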


Author(s):  
Konstantin A. Elshin ◽  
Elena I. Molchanova ◽  
Marina V. Usoltseva ◽  
Yelena V. Likhoshway

Using the TensorFlow Object Detection API, an approach to identifying and registering the Baikal diatom species Synedra acus subsp. radians has been tested. A set of images was assembled and training was conducted. After 15,000 training iterations, the total loss was 0.04. The classification accuracy was 95%, and the bounding box accuracy was also 95%.


2021 ◽  
Vol 18 (1) ◽  
pp. 172988142199332
Author(s):  
Xintao Ding ◽  
Boquan Li ◽  
Jinbao Wang

Indoor object detection is a demanding and important task for robot applications. Object knowledge, such as two-dimensional (2D) shape and depth information, may be helpful for detection. In this article, we focus on region-based convolutional neural network (CNN) detectors and propose a geometric property-based Faster R-CNN method (GP-Faster) for indoor object detection. GP-Faster incorporates geometric properties into Faster R-CNN to improve detection performance. In detail, we first use mesh grids formed by the intersections of direct and inverse proportion functions to generate appropriate anchors for indoor objects. After the anchors are regressed to the regions of interest produced by a region proposal network (RPN-RoIs), we use 2D geometric constraints to refine the RPN-RoIs, where the 2D constraint of each class is a convex hull region enclosing the width and height coordinates of the ground-truth boxes in the training set. Comparison experiments are conducted on two indoor datasets, SUN2012 and NYUv2. Since depth information is available in NYUv2, we add depth constraints to GP-Faster and propose a 3D geometric property-based Faster R-CNN (DGP-Faster) on NYUv2. The experimental results show that both GP-Faster and DGP-Faster increase the mean average precision.
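The 2D constraint can be pictured as a convex hull in (width, height) space. The sketch below builds the hull with the standard monotone-chain algorithm and keeps only RoIs whose size falls inside it. Treating "refine" as a simple accept/reject filter, the helper names, and the sample sizes are all assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (width, height) of a box


def cross(o: Point, a: Point, b: Point) -> float:
    """Z component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Monotone-chain convex hull, returned in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull: Sequence[Point], p: Point) -> bool:
    """True if p lies inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    if n < 3:  # degenerate hull: fall back to exact membership
        return p in hull
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


# Hypothetical (width, height) pairs of ground-truth boxes for one class.
gt_sizes = [(20, 30), (40, 30), (40, 80), (20, 80), (30, 50)]
hull = convex_hull(gt_sizes)

roi_sizes = [(30, 60), (100, 10)]
print([inside_hull(hull, s) for s in roi_sizes])  # [True, False]
```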


2021 ◽  
Vol 13 (9) ◽  
pp. 1854
Author(s):  
Syed Muhammad Arsalan Bashir ◽  
Yi Wang

This paper addresses the detection of small objects in remote sensing images from satellites or aerial vehicles by applying image super-resolution for resolution enhancement ahead of a deep-learning-based detection method. It provides a rationale for image super-resolution for small objects and improves the current super-resolution (SR) framework by incorporating a cyclic generative adversarial network (GAN) and residual feature aggregation (RFA) to improve detection performance. The novelty of the method is threefold: first, a framework is proposed that is independent of the final object detector used, i.e., YOLOv3 could be replaced with Faster R-CNN or any other object detector; second, a residual feature aggregation network was used in the generator, which significantly improved the detection performance as the RFA network detected complex features; and third, the whole network was transformed into a cyclic GAN. The image super-resolution cyclic GAN with RFA and YOLO as the detection network is termed SRCGAN-RFA-YOLO, and its detection accuracy is compared with that of other methods. Rigorous experiments on both satellite images and aerial images (ISPRS Potsdam, VAID, and Draper Satellite Image Chronology datasets) were performed, and the results showed that the detection performance increased by using super-resolution methods for spatial resolution enhancement; for an IoU of 0.10, an AP of 0.7867 was achieved for a scale factor of 16.


Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 678
Author(s):  
Vladimir Tadic ◽  
Tatjana Loncar-Turukalo ◽  
Akos Odry ◽  
Zeljen Trpovski ◽  
Attila Toth ◽  
...  

This note presents a fuzzy optimization of Gabor filter-based object and text detection. The derivation of a 2D Gabor filter and the guidelines for the fuzzification of the filter parameters are described. The fuzzy Gabor filter proved to be a robust text and object detection method for low-quality input images, as extensively evaluated on the problem of license plate localization. The extended set of examples confirmed that the fuzzy optimized Gabor filter with adequately fuzzified parameters detected the desired license plate texture components and considerably improved object detection compared to the classic Gabor filter. The robustness of the proposed approach was further demonstrated on other images of various origins containing text and different textures, captured using low-cost or modest-quality acquisition procedures. The possibility of fine-tuning the fuzzification procedure to better suit certain applications offers the potential to further boost detection performance.
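For orientation, the sketch below builds the real part of a classic 2D Gabor kernel in its common textbook form: a Gaussian envelope multiplied by a cosine carrier along a rotated axis. The parameter names (`sigma`, `theta`, `lambd`, `gamma`, `psi`) follow widespread convention and are assumptions here; the paper's own parametrization may differ, and its fuzzification step is not shown.

```python
import math
from typing import List


def gabor_kernel(size: int, sigma: float, theta: float, lambd: float,
                 gamma: float = 1.0, psi: float = 0.0) -> List[List[float]]:
    """Real part of a 2D Gabor filter sampled on a size x size grid (size must be odd).

    sigma: standard deviation of the Gaussian envelope
    theta: orientation of the carrier, in radians
    lambd: wavelength of the cosine carrier, in pixels
    gamma: spatial aspect ratio of the envelope
    psi:   phase offset of the carrier
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates so the carrier runs along the theta direction.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


kernel = gabor_kernel(size=5, sigma=2.0, theta=0.0, lambd=4.0)
print(kernel[2][2])  # 1.0 at the center when psi is 0
```

Convolving an image with a bank of such kernels at several orientations highlights stripe-like textures, which is the general mechanism behind Gabor-based text and plate localization.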

