Rotational multipyramid network with bounding‐box transformation for object detection

Author(s):  
Donggeun Kim ◽  
San Kim ◽  
Siheon Jeong ◽  
Ji‐Wan Ham ◽  
Seho Son ◽  
...  
Author(s):  
Konstantin A. Elshin ◽  
Elena I. Molchanova ◽  
Marina V. Usoltseva ◽  
Yelena V. Likhoshway

Using the TensorFlow Object Detection API, an approach to identifying and registering the Baikal diatom species Synedra acus subsp. radians was tested. A set of images was assembled and a model was trained on it. After 15,000 training iterations, the total loss reached 0.04. Classification accuracy was 95%, and the accuracy of bounding box construction was also 95%.


Author(s):  
Hui-Shen Yuan ◽  
Si-Bao Chen ◽  
Bin Luo ◽  
Hao Huang ◽  
Qiang Li

2021 ◽  
Vol 13 (22) ◽  
pp. 4517
Author(s):  
Falin Wu ◽  
Jiaqi He ◽  
Guopeng Zhou ◽  
Haolun Li ◽  
Yushuang Liu ◽  
...  

Object detection in remote sensing images plays an important role in both military and civilian remote sensing applications. Objects in remote sensing images differ from those in natural images: they vary widely in scale, appear in arbitrary orientations, and are often densely arranged, all of which make detection difficult. For oblique, densely arranged objects with a large aspect ratio, using an oriented bounding box helps avoid deleting correct detection boxes by mistake. The classic rotational region convolutional neural network (R2CNN) has advantages for text detection. However, R2CNN performs poorly when detecting slender objects with arbitrary orientation in remote sensing images, and its fault tolerance is low. To solve this problem, this paper proposes TPR-R2CNN, an improved R2CNN based on a double detection head structure and a three-point regression method. The proposed network modifies the original R2CNN structure by applying a double fully connected (2-fc) detection head and classification fusion. One detection head performs classification and horizontal bounding box regression; the other performs classification and oriented bounding box regression. The three-point regression (TPR) method is proposed for oriented bounding box regression; it determines the position of the oriented bounding box by regressing the coordinates of the center point and the first two vertices. The proposed network was validated on the DOTA-v1.5 and HRSC2016 datasets, where it improved mean average precision (mAP) by 3.90% and 15.27%, respectively, over feature pyramid network (FPN) baselines with a ResNet-50 backbone.
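The three-point idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the remaining two vertices are recovered, so the point reflection through the center used here is an assumption, and the function name `box_from_three_points` is hypothetical rather than taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def box_from_three_points(center: Point, v1: Point, v2: Point) -> List[Point]:
    """Recover four vertices from a center point and two adjacent vertices.

    Assumption: the box is centrally symmetric, so each missing vertex is
    the reflection of a known vertex through the center.
    """
    cx, cy = center
    v3 = (2 * cx - v1[0], 2 * cy - v1[1])  # vertex opposite v1
    v4 = (2 * cx - v2[0], 2 * cy - v2[1])  # vertex opposite v2
    return [v1, v2, v3, v4]


# Illustrative values only: a 4 x 2 box centered at (2, 1).
corners = box_from_three_points((2.0, 1.0), (0.0, 0.0), (4.0, 0.0))
print(corners)
```

Under this assumption the result is a parallelogram in general; it is a rectangle only when both regressed vertices lie at the same distance from the center.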


2020 ◽  
Vol 12 (21) ◽  
pp. 3630
Author(s):  
Jin Liu ◽  
Haokun Zheng

Object detection and recognition in aerial and remote sensing images have become a hot topic in computer vision in recent years. Because these images are usually taken from a bird’s-eye view, the targets often have different shapes and are densely arranged. Marking targets with oriented bounding boxes is therefore a mainstream choice. However, general detection methods are designed around horizontal box annotation, while the improved methods for detecting oriented bounding boxes have high computational complexity. In this paper, we propose a method called ellipse field network (EFN) that integrates semantic segmentation and object detection. It predicts the probability distribution of the target and obtains accurate oriented bounding boxes through a post-processing step. We tested our method on the HRSC2016 and DOTA data sets, achieving mAP values of 0.863 and 0.701, respectively. We also tested EFN on natural images and obtained a mAP of 84.7 on the VOC2012 data set. These experiments demonstrate that EFN achieves state-of-the-art results on aerial images and scores well on natural images.


Author(s):  
Kuang-Wen Hsieh ◽  
Bo-Yu Huang ◽  
Kai-Ze Hsiao ◽  
Yu-Hao Tuan ◽  
Fu-Pang Shih ◽  
...  

The objective of this study was to identify the maturity and position of tomatoes in a greenhouse. The study comprises three parts: building the image capture and object detection model, identifying the position of mature fruits, and predicting the size of mature fruits. In the first part, images were captured at different times in the greenhouse and object detection was used to identify mature fruits. In the second part, the relative 3D position of the mature fruits, calculated by binocular vision, was compared with the actual measured position. In the third part, the size of the bounding box from object detection was compared with the actual size of the mature fruit, and the correlation was calculated so that the gripper width can be pre-adjusted for future plucking operations. Precision and recall for mature fruits both exceed 95%. The average error of the 3D position is 0.5 cm. The R-squared between the actual fruit size and the bounding box size is over 0.9.
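As a rough illustration of how binocular vision can yield a 3D position, the sketch below applies the standard pinhole stereo relation (depth = focal length × baseline / disparity) to a rectified image pair. The abstract does not describe the authors' camera model or calibration, so the function name `triangulate` and every parameter value here are hypothetical.

```python
from typing import Tuple


def triangulate(
    u_left: float,
    u_right: float,
    v: float,
    focal_px: float,
    baseline_cm: float,
    cx: float,
    cy: float,
) -> Tuple[float, float, float]:
    """Return (x, y, z) in cm, relative to the left camera.

    Assumes a rectified stereo pair, so a point appears on the same image
    row v in both views and differs only in its column (u_left, u_right).
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_cm / disparity  # depth along the optical axis
    x = (u_left - cx) * z / focal_px        # horizontal offset
    y = (v - cy) * z / focal_px             # vertical offset
    return x, y, z


# Hypothetical camera: 800 px focal length, 6 cm baseline, 640 x 480 image.
position = triangulate(400.0, 320.0, 240.0, 800.0, 6.0, 320.0, 240.0)
print(position)
```

In a setup like the one described, the pixel coordinates would typically come from the center of the detected bounding box in each view.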


2021 ◽  
Author(s):  
Sixian Chan ◽  
Jingcheng Zheng ◽  
Lina Wang ◽  
Tingting Wang ◽  
Xiaolong Zhou ◽  
...  

Deep learning models have become the mainstream approach to computer vision tasks. In object detection, the detection box is usually a rectangle aligned with the coordinate axes that fully encloses the object. However, for objects with a large aspect ratio and a large rotation angle, the bounding box must grow, so it contains a large amount of useless background. This study takes a different approach based on YOLOv5: an angle dimension is added to the head, angle regression is performed alongside box regression, and CIoU and smooth L1 are combined to compute the bounding box loss, so that the resulting box fits the actual object more closely. The original dataset labels are also preprocessed to compute the angle information of interest. These improvements aim to enable angle-aware object detection in remote-sensing images, especially for objects with large aspect ratios, such as ships, airplanes, and automobiles. Experimental results show that, compared with traditional deep-learning object detection models, the proposed method is particularly effective at detecting rotated objects.
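The background problem described above can be made concrete with a small sketch that compares the area of a rotated box with the area of the axis-aligned box needed to enclose it. This is plain geometry, not the paper's model; the function names and the 10 × 2 example box are assumptions chosen for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def rotated_corners(cx: float, cy: float, w: float, h: float, angle_rad: float) -> List[Point]:
    """Corners of a w x h box centered at (cx, cy), rotated by angle_rad."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


def enclosing_axis_aligned_area(corners: List[Point]) -> float:
    """Area of the smallest axis-aligned rectangle containing all corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# A slender 10 x 2 object rotated by 45 degrees.
obb_area = 10.0 * 2.0
aabb_area = enclosing_axis_aligned_area(rotated_corners(0.0, 0.0, 10.0, 2.0, math.pi / 4))
print(f"oriented box area: {obb_area:.1f}, axis-aligned box area: {aabb_area:.1f}")
```

For this example the axis-aligned box covers roughly 3.6 times the area of the object, which is the extra background an angle-aware box avoids.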

