Mapping Utility Poles in Aerial Orthoimages Using ATSS Deep Learning Method

Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6070
Author(s):  
Matheus Gomes ◽  
Jonathan Silva ◽  
Diogo Gonçalves ◽  
Pedro Zamboni ◽  
Jader Perez ◽  
...  

Mapping utility poles using side-view images acquired with car-mounted cameras is a time-consuming task, especially in larger areas, due to the need for street-by-street surveying. Aerial images cover larger areas and can be a feasible alternative, although detecting and mapping utility poles in urban environments using top-view images is challenging. Thus, we propose the use of Adaptive Training Sample Selection (ATSS) for detecting utility poles in urban areas, since it is a novel method that has not yet been investigated in remote sensing applications. Here, we compared ATSS with Faster Region-based Convolutional Neural Networks (Faster R-CNN) and Focal Loss for Dense Object Detection (RetinaNet), both currently used in remote sensing applications, to assess the performance of the proposed methodology. We used 99,473 patches of 256 × 256 pixels with a ground sample distance (GSD) of 10 cm. The patches were divided into training, validation and test datasets in approximate proportions of 60%, 20% and 20%, respectively. As the utility pole labels are point coordinates and the object detection methods require a bounding box, we assessed the influence of the bounding box size on the ATSS method by varying the dimensions from 30×30 to 70×70 pixels. For the proposed task, our findings show that ATSS is, on average, 5% more accurate than Faster R-CNN and RetinaNet. For a bounding box size of 40×40, we achieved an Average Precision at an intersection over union of 50% (AP50) of 0.913 for ATSS, 0.875 for Faster R-CNN and 0.874 for RetinaNet. Regarding the influence of the bounding box size on ATSS, our results indicate that AP50 is about 6.5% higher for 60×60 than for 30×30. For AP75, this margin reaches 23.1% in favor of the 60×60 bounding box size. In terms of computational cost, all the methods tested remain at the same level, with an average processing time of around 0.048 s per patch.
Our findings show that ATSS outperforms the other methodologies and is suitable for developing operational tools that can automatically detect and map utility poles.
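The point-to-box conversion used in this study can be sketched as follows. The square side length matches the sizes evaluated in the abstract (30×30 to 70×70 pixels); `patch_dim` and the edge-clipping behavior are assumptions, since the abstract does not state how boxes at patch borders are handled:

```python
def point_to_box(cx, cy, size, patch_dim=256):
    """Expand a point label (cx, cy) into a square bounding box of the
    given side length, clipped to the patch boundaries (assumed here)."""
    half = size / 2
    x_min = max(0, cx - half)
    y_min = max(0, cy - half)
    x_max = min(patch_dim, cx + half)
    y_max = min(patch_dim, cy + half)
    return x_min, y_min, x_max, y_max

# A pole annotated near the left patch edge: the 40x40 box is clipped.
box = point_to_box(20, 128, 40)
```

The resulting boxes can then be fed to any of the three detectors compared in the study, since all of them consume axis-aligned box annotations.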

2021 ◽  
Vol 13 (22) ◽  
pp. 4517
Author(s):  
Falin Wu ◽  
Jiaqi He ◽  
Guopeng Zhou ◽  
Haolun Li ◽  
Yushuang Liu ◽  
...  

Object detection in remote sensing images plays an important role in both military and civilian remote sensing applications. Objects in remote sensing images differ from those in natural images: they exhibit scale diversity, arbitrary directivity, and dense arrangement, which causes difficulties in object detection. For objects that have a large aspect ratio and are oblique and densely arranged, using an oriented bounding box helps to avoid mistakenly deleting correct detection bounding boxes. The classic rotational region convolutional neural network (R2CNN) has advantages for text detection. However, R2CNN performs poorly when detecting slender objects with arbitrary directivity in remote sensing images, and its fault tolerance is low. To solve this problem, this paper proposes an improved R2CNN based on a double detection head structure and a three-point regression method, namely TPR-R2CNN. The proposed network modifies the original R2CNN structure by applying a double fully connected (2-fc) detection head and classification fusion: one detection head performs classification and horizontal bounding box regression, while the other performs classification and oriented bounding box regression. The three-point regression method (TPR) is proposed for oriented bounding box regression; it determines the position of the oriented bounding box by regressing the coordinates of the center point and the first two vertices. The proposed network was validated on the DOTA-v1.5 and HRSC2016 datasets, where it achieved mean average precision (mAP) improvements of 3.90% and 15.27%, respectively, over feature pyramid network (FPN) baselines with a ResNet-50 backbone.
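The three-point parameterization can be illustrated with elementary geometry: given the regressed center point and first two vertices, the remaining two corners are the point reflections of those vertices through the center. This is an illustrative sketch of the geometry, not the authors' network code:

```python
import numpy as np

def box_from_three_points(center, v1, v2):
    """Recover all four corners of an oriented bounding box from its
    center and its first two vertices (the quantities regressed by the
    three-point method): the remaining corners are point reflections
    of v1 and v2 through the center."""
    center, v1, v2 = map(np.asarray, (center, v1, v2))
    v3 = 2 * center - v1   # corner opposite v1
    v4 = 2 * center - v2   # corner opposite v2
    return np.stack([v1, v2, v3, v4])

# An axis-aligned 4x2 box centered at the origin.
corners = box_from_three_points((0, 0), (-2, -1), (2, -1))
```

Because the two free vertices can lie anywhere, this parameterization avoids committing to a fixed angle convention in the regression targets.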


2021 ◽  
Vol 11 (11) ◽  
pp. 4878
Author(s):  
Ivan Racetin ◽  
Andrija Krtalić

Hyperspectral sensors are passive instruments that record reflected electromagnetic radiation in tens or hundreds of narrow, consecutive spectral bands. In the last two decades, the availability of hyperspectral data has sharply increased, propelling the development of a plethora of hyperspectral classification and target detection algorithms. Anomaly detection methods in hyperspectral images refer to a class of target detection methods that do not require any a priori knowledge about a hyperspectral scene or target spectrum. They are unsupervised learning techniques that automatically discover rare features in hyperspectral images. This review paper is organized into two parts. Part A provides a bibliographic analysis of hyperspectral image processing for anomaly detection in remote sensing applications: the development of the subject field is discussed, and key authors and journals are highlighted. In part B, an overview of the topic is presented, starting from the mathematical framework for anomaly detection. The anomaly detection methods are broadly categorized as techniques that implement structured or unstructured background models, and are then organized into appropriate sub-categories. Specific anomaly detection methods are presented with their corresponding detection statistics, and their properties are discussed. This paper represents the first review of hyperspectral image processing for anomaly detection in remote sensing applications.
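The abstract does not name individual detectors, but the canonical detection statistic for an unstructured background model is the global Reed-Xiaoli (RX) detector, which scores each pixel by its Mahalanobis distance from scene-wide background statistics. A minimal NumPy sketch, intended only to illustrate the kind of statistic the review categorizes:

```python
import numpy as np

def rx_detector(cube):
    """Global RX anomaly detector: score each pixel by its squared
    Mahalanobis distance from the scene-wide mean and covariance.

    cube: (rows, cols, bands) hyperspectral image.
    Returns a (rows, cols) anomaly score map.
    """
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(float)
    mu = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse for numerical stability
    centered = pixels - mu
    # Per-pixel quadratic form (x - mu)^T C^-1 (x - mu)
    scores = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return scores.reshape(rows, cols)

# A flat synthetic scene with one spectrally anomalous pixel.
rng = np.random.default_rng(0)
cube = rng.normal(0.0, 1.0, size=(16, 16, 8))
cube[8, 8] += 10.0
scores = rx_detector(cube)
```

No target spectrum or training labels are needed, which is exactly the a priori knowledge-free property the review highlights.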


2021 ◽  
Vol 13 (21) ◽  
pp. 4291
Author(s):  
Luyang Zhang ◽  
Haitao Wang ◽  
Lingfeng Wang ◽  
Chunhong Pan ◽  
Qiang Liu ◽  
...  

Rotated object detection is an extension of object detection that uses an oriented bounding box instead of a general horizontal bounding box to define the object position. It is widely used in remote sensing imagery, scene text, and license plate recognition. Existing rotated object detection methods usually add an angle prediction channel to the bounding box prediction branch and use smooth L1 loss as the regression loss function. However, we argue that smooth L1 loss causes a sudden change in loss and slow convergence due to the angle-solving mechanism of OpenCV (the angle between the horizontal line and the first side of the bounding box, measured counter-clockwise, is defined as the rotation angle), and this problem exists in most existing regression loss functions. To solve these problems, we propose a decoupling modulation mechanism to overcome sudden changes in loss. On this basis, we also propose a constraint mechanism whose purpose is to accelerate the convergence of the network and ensure optimization toward the ideal direction. In addition, the proposed decoupling modulation and constraint mechanisms can be integrated into popular regression loss functions individually or together, which further improves the performance of the model and makes it converge faster. The experimental results show that our method achieves 75.2% performance on the aerial image dataset DOTA (OBB task) while saving more than 30% of computing resources. The method also achieves state-of-the-art performance on HRSC2016 while saving more than 40% of computing resources, which confirms the applicability of the approach.
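The boundary discontinuity the authors describe can be reproduced numerically. Under an angle convention confined to [-90, 0), a box at -89.5 degrees is almost the same physical rectangle as one at -0.5 degrees with width and height exchanged, yet the parameter-space smooth L1 loss between them is large. The box dimensions below are illustrative, not from the paper:

```python
import numpy as np

def smooth_l1(pred, target, beta=1.0):
    """Standard smooth L1 (Huber-style) regression loss, summed over
    the (w, h, angle) parameters."""
    diff = np.abs(pred - target)
    return np.where(diff < beta,
                    0.5 * diff ** 2 / beta,
                    diff - 0.5 * beta).sum()

target = np.array([50.0, 20.0, -89.0])          # (w, h, angle in degrees)

# Small angular perturbation within the valid range: small loss.
near = smooth_l1(np.array([50.0, 20.0, -89.5]), target)

# A nearly identical physical box expressed across the angle boundary
# (w and h swapped, angle near 0): the loss jumps sharply.
boundary = smooth_l1(np.array([20.0, 50.0, -0.5]), target)
```

Here `near` is tiny while `boundary` is two orders of magnitude larger for essentially the same rectangle, which is the sudden loss change the decoupling modulation mechanism is designed to remove.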


2021 ◽  
Vol 13 (24) ◽  
pp. 4962
Author(s):  
Maximilian Bernhard ◽  
Matthias Schubert

Object detection on aerial and satellite imagery is an important tool for image analysis in remote sensing and has many areas of application. As modern object detectors require accurate annotations for training, manual and labor-intensive labeling is necessary. In situations where GPS coordinates for the objects of interest are already available, there is potential to avoid the cumbersome annotation process. Unfortunately, GPS coordinates are often not well-aligned with georectified imagery. These spatial errors can be seen as noise regarding the object locations, which may critically harm the training of object detectors and, ultimately, limit their practical applicability. To overcome this issue, we propose a co-correction technique that allows us to robustly train a neural network with noisy object locations and to transform them toward the true locations. When applied as a preprocessing step on noisy annotations, our method greatly improves the performance of existing object detectors. Our method is applicable in scenarios where the images are only annotated with points roughly indicating object locations, instead of entire bounding boxes providing precise information on the object locations and extents. We test our method on three datasets and achieve a substantial improvement (e.g., 29.6% mAP on the COWC dataset) over existing methods for noise-robust object detection.
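The misalignment between GPS coordinates and georectified imagery can be made concrete with the standard affine geotransform that maps projected world coordinates to pixel indices: any residual GPS error survives this mapping as a pixel offset on the point annotation, which is the noise the co-correction technique must absorb. The origin and pixel size below are hypothetical:

```python
def world_to_pixel(x, y, origin_x, origin_y, pixel_size):
    """Map a projected world coordinate (x, y) to fractional (col, row)
    in a north-up georectified image with square pixels. A GPS error of
    e meters becomes an offset of e / pixel_size pixels on the label."""
    col = (x - origin_x) / pixel_size
    row = (origin_y - y) / pixel_size   # image rows grow southward
    return col, row

# Hypothetical 0.3 m GSD imagery with its top-left corner at
# (500000, 4600000) in a projected CRS; an object 30 m east and
# 30 m south of the corner lands at pixel (100, 100).
col, row = world_to_pixel(500030.0, 4599970.0, 500000.0, 4600000.0, 0.3)
```

At sub-meter GSD even a few meters of GPS drift displaces the point label by tens of pixels, which explains why such noise can critically harm detector training.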


2020 ◽  
Vol 16 (3) ◽  
pp. 227-243
Author(s):  
Shahid Karim ◽  
Ye Zhang ◽  
Shoulin Yin ◽  
Irfana Bibi ◽  
Ali Anwar Brohi

Traditional object detection algorithms and strategies struggle to meet the requirements of data processing efficiency, performance, speed, and intelligence in object detection. By studying and imitating the cognitive ability of the brain, deep learning can analyze and process data features; it has strong representational ability and has become the mainstream algorithm in current object detection applications. Firstly, we discuss the development of traditional object detection methods. Secondly, the object detection frameworks that combine region proposals and convolutional neural networks (CNNs), e.g. Region-based CNN (R-CNN), Spatial Pyramid Pooling Network (SPP-NET), Fast R-CNN and Faster R-CNN, are briefly characterized for optical remote sensing applications. The You Only Look Once (YOLO) algorithm is representative of the frameworks, e.g. YOLO and the Single Shot MultiBox Detector (SSD), that transform object detection into a regression problem. The limitations of remote sensing images and object detectors are highlighted and discussed. The feasibility and limitations of these approaches should lead researchers to prudently select appropriate image enhancements. Finally, the problems of deep learning object detection algorithms are summarized and future recommendations are offered.


2021 ◽  
Vol 13 (7) ◽  
pp. 1246
Author(s):  
Kyle B. Larson ◽  
Aaron R. Tuor

Cheatgrass (Bromus tectorum) invasion is driving an emerging cycle of increased fire frequency and irreversible loss of wildlife habitat in the western US. Yet detailed spatial information about its occurrence is still lacking for much of its presumably invaded range. Deep learning (DL) has demonstrated success in remote sensing applications but is less tested on more challenging tasks, such as identifying biological invasions from sub-pixel phenomena. We compare two DL architectures and the more conventional Random Forest and Logistic Regression methods to improve upon a previous effort to map cheatgrass occurrence at >2% canopy cover. High-dimensional sets of biophysical, MODIS, and Landsat-7 ETM+ predictor variables are also compared to evaluate different multi-modal data strategies. All model configurations improved results relative to the case study, and accuracy generally improved by combining data from both sensors with biophysical data. Cheatgrass occurrence is mapped at 30 m ground sample distance (GSD) with an estimated 78.1% accuracy, compared to 250 m GSD and 71% map accuracy in the case study. Furthermore, DL is shown to be competitive with well-established machine learning methods in a limited-data regime, suggesting it can be an effective tool for mapping biological invasions and, more broadly, for multi-modal remote sensing applications.
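The multi-modal strategy of combining both sensors with biophysical data can be sketched as simple per-pixel early fusion, where each modality contributes a block of predictor columns to one feature matrix consumed by any of the compared models. The band counts below are illustrative placeholders, not the study's actual predictor sets:

```python
import numpy as np

rng = np.random.default_rng(42)
n_pixels = 1000

# Hypothetical per-pixel predictor stacks for one tile (column counts
# are assumptions for illustration only).
modis = rng.random((n_pixels, 7))      # MODIS-derived predictors
landsat = rng.random((n_pixels, 6))    # Landsat-7 ETM+ bands
biophys = rng.random((n_pixels, 4))    # terrain / climate variables

# Early fusion: concatenate modalities so a single model (RF, Logistic
# Regression, or a DL network) sees all predictors at once.
features = np.concatenate([modis, landsat, biophys], axis=1)
```

Ablating a modality then amounts to dropping its column block, which is how the different data strategies in the abstract can be compared under one modeling pipeline.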


2015 ◽  
Vol 2015 ◽  
pp. 1-11
Author(s):  
Pengwei Li ◽  
Wenying Ge

Shadows limit many remote sensing applications, such as classification, target detection, and change detection. Most current shadow detection methods apply a histogram threshold to spectral characteristics to distinguish shadows from non-shadows directly, producing a so-called "hard binary shadow." Obviously, the performance of such threshold-based methods relies heavily on the selected threshold. Moreover, these methods do not take any spatial information into account. To overcome these shortcomings, a soft shadow description method is developed by introducing the concept of opacity into shadow detection, and an MRF-based shadow detection method is proposed to make use of neighborhood information. Experiments on remote sensing images show that the proposed method obtains more accurate detection results.
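The "hard binary shadow" baseline the abstract criticizes can be illustrated with a histogram threshold such as Otsu's method, which picks the cut maximizing between-class variance and then labels every darker pixel as shadow, ignoring all spatial context. This sketch is a generic baseline, not the paper's method:

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's histogram threshold: choose the intensity cut that
    maximizes between-class variance. Used here as the basis of a
    'hard binary shadow' mask."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    total = gray.size
    sum_all = float(np.dot(np.arange(256), hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Synthetic grayscale values: a dark "shadow" cluster near 40 and a
# lit cluster near 200.
rng = np.random.default_rng(1)
img = np.clip(np.concatenate([rng.normal(40, 5, 500),
                              rng.normal(200, 5, 500)]), 0, 255)
t = otsu_threshold(img)
shadow_mask = img <= t   # the hard binary decision
```

Every pixel is forced into one of two classes based on intensity alone; the paper's opacity-based soft description and MRF neighborhood term are precisely what this baseline lacks.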

