scholarly journals Anchor Box Parameters and Bounding Box Overlap Ratios for the Faster R-CNN Detector in Detecting a Single Object by the Masking Background

2018 ◽  
Vol 21 ◽  
pp. 15-23
Author(s):  
Vadim Romanuke

Anchor box parameters and bounding box overlap ratios are studied in order to set them appropriately for the Faster R-CNN detector. The benchmark detection is based on monochrome images whose background may mask a small dark object. Three object detection tasks are generated, where every image either contains a small black square/rectangle or does not contain the object, representing thus class “background”. The ratios are recommended to be tried at 0.7 if this class is represented. The ratio for positive training samples is tried at a less value but greater than 0.4 for the task every image of which contains an object. The minimum anchor box size is better to try at a lesser value from a range of object sizes. The anchor box pyramid scale factor and the number of levels are better to try at 2 and 8, respectively. Subsequently, these parameters may be corrected as their influence is fuzzier than that of the ratios.

Author(s):  
Кonstantin А. Elshin ◽  
Еlena I. Molchanova ◽  
Мarina V. Usoltseva ◽  
Yelena V. Likhoshway

Using the TensorFlow Object Detection API, an approach to identifying and registering Baikal diatom species Synedra acus subsp. radians has been tested. As a result, a set of images was formed and training was conducted. It is shown that аfter 15000 training iterations, the total value of the loss function was obtained equal to 0,04. At the same time, the classification accuracy is equal to 95%, and the accuracy of construction of the bounding box is also equal to 95%.


Author(s):  
Donggeun Kim ◽  
San Kim ◽  
Siheon Jeong ◽  
Ji‐Wan Ham ◽  
Seho Son ◽  
...  

Author(s):  
Hui-Shen Yuan ◽  
Si-Bao Chen ◽  
Bin Luo ◽  
Hao Huang ◽  
Qiang Li

2021 ◽  
Vol 13 (22) ◽  
pp. 4517
Author(s):  
Falin Wu ◽  
Jiaqi He ◽  
Guopeng Zhou ◽  
Haolun Li ◽  
Yushuang Liu ◽  
...  

Object detection in remote sensing images plays an important role in both military and civilian remote sensing applications. Objects in remote sensing images are different from those in natural images. They have the characteristics of scale diversity, arbitrary directivity, and dense arrangement, which causes difficulties in object detection. For objects with a large aspect ratio and that are oblique and densely arranged, using an oriented bounding box can help to avoid deleting some correct detection bounding boxes by mistake. The classic rotational region convolutional neural network (R2CNN) has advantages for text detection. However, R2CNN has poor performance in the detection of slender objects with arbitrary directivity in remote sensing images, and its fault tolerance rate is low. In order to solve this problem, this paper proposes an improved R2CNN based on a double detection head structure and a three-point regression method, namely, TPR-R2CNN. The proposed network modifies the original R2CNN network structure by applying a double fully connected (2-fc) detection head and classification fusion. One detection head is for classification and horizontal bounding box regression, the other is for classification and oriented bounding box regression. The three-point regression method (TPR) is proposed for oriented bounding box regression, which determines the positions of the oriented bounding box by regressing the coordinates of the center point and the first two vertices. The proposed network was validated on the DOTA-v1.5 and HRSC2016 datasets, and it achieved a mean average precision (mAP) of 3.90% and 15.27%, respectively, from feature pyramid network (FPN) baselines with a ResNet-50 backbone.


2020 ◽  
Vol 12 (21) ◽  
pp. 3630
Author(s):  
Jin Liu ◽  
Haokun Zheng

Object detection and recognition in aerial and remote sensing images has become a hot topic in the field of computer vision in recent years. As these images are usually taken from a bird’s-eye view, the targets often have different shapes and are densely arranged. Therefore, using an oriented bounding box to mark the target is a mainstream choice. However, this general method is designed based on horizontal box annotation, while the improved method for detecting an oriented bounding box has a high computational complexity. In this paper, we propose a method called ellipse field network (EFN) to organically integrate semantic segmentation and object detection. It predicts the probability distribution of the target and obtains accurate oriented bounding boxes through a post-processing step. We tested our method on the HRSC2016 and DOTA data sets, achieving mAP values of 0.863 and 0.701, respectively. At the same time, we also tested the performance of EFN on natural images and obtained a mAP of 84.7 in the VOC2012 data set. These extensive experiments demonstrate that EFN can achieve state-of-the-art results in aerial image tests and can obtain a good score when considering natural images.


2006 ◽  
Vol 18 (6) ◽  
pp. 744-750
Author(s):  
Ryouta Nakano ◽  
◽  
Kazuhiro Hotta ◽  
Haruhisa Takahashi

This paper presents an object detection method using independent local feature extractor. Since objects are composed of a combination of characteristic parts, a good object detector could be developed if local parts specialized for a detection target are derived automatically from training samples. To do this, we use Independent Component Analysis (ICA) which decomposes a signal into independent elementary signals. We then used the basis vectors derived by ICA as independent local feature extractors specialized for a detection target. These feature extractors are applied to a candidate area, and their outputs are used in classification. However, the number of dimension of extracted independent local features is very high. To reduce the extracted independent local features efficiently, we use Higher-order Local AutoCorrelation (HLAC) features to extract the information that relates neighboring features. This may be more effective for object detection than simple independent local features. To classify detection targets and non-targets, we use a Support Vector Machine (SVM). The proposed method is applied to a car detection problem. Superior performance is obtained by comparison with Principal Component Analysis (PCA).


Author(s):  
Kuang-Wen Hsieh ◽  
Bo-Yu Huang ◽  
Kai-Ze Hsiao ◽  
Yu-Hao Tuan ◽  
Fu-Pang Shih ◽  
...  

AbstractThe objective of this study was to identify the maturity and position of tomatoes in greenhouse. Three parts have been included in this study: building the model of image capturing and object detection, position identification of mature fruits and prediction of the size of the mature fruits. For the first part, image capturing in different time and object detection will be conducted in the greenhouse for identification of mature fruits. For the second part, the relative 3D position of the mature fruits calculated by the binocular vision was compared with the actual measured position. For the third part, the size of the bounding box from the object detection was compared with the actual size of the mature fruit, and the correlation was calculated in order to pre-adjust the width of the gripper for plucking operation in the future. The precision and the recall of the mature fruits of this study are over 95%. The average error of the 3D position is 0.5 cm. The actual size of the fruits and the R-squared of the size of the bounding box are over 0.9.


Sign in / Sign up

Export Citation Format

Share Document