Building Corner Detection in Aerial Images with Fully Convolutional Networks

Sensors ◽  
2019 ◽  
Vol 19 (8) ◽  
pp. 1915 ◽  
Author(s):  
Weigang Song ◽  
Baojiang Zhong ◽  
Xun Sun

In aerial images, corner points can be detected to describe the structural information of buildings for city modeling, geo-localization, and other applications. For this specific vision task, existing generic corner detectors perform poorly, as they cannot distinguish corner points on buildings from those on other objects such as trees and shadows. Recently, fully convolutional networks (FCNs) have been developed for semantic image segmentation; they are able to recognize a designated kind of object through training on a manually labeled dataset. Motivated by this achievement, an FCN-based approach is proposed in the present work to detect building corners in aerial images. First, a DeepLab model composed of improved FCNs and fully connected conditional random fields (CRFs) is trained end-to-end for building region segmentation. The segmentation accuracy is then further improved with a morphological opening operation. Corner points are finally detected on the contour curves of building regions using a scale-space detector. Experimental results show that the proposed building corner detection approach achieves an F-measure of 0.83 on the test image set and outperforms a number of state-of-the-art corner detectors by a large margin.
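The mask-cleaning step of such a pipeline can be illustrated with a minimal NumPy/SciPy sketch. This is not the paper's implementation; the mask, structuring-element size, and shapes are illustrative only:

```python
import numpy as np
from scipy import ndimage

def clean_segmentation(mask, size=3):
    """Morphological opening (erosion then dilation): removes speckle
    smaller than the structuring element while leaving larger building
    regions essentially intact."""
    structure = np.ones((size, size), dtype=bool)
    return ndimage.binary_opening(mask, structure=structure)

# Synthetic segmentation: one 6x6 "building" plus a spurious pixel.
mask = np.zeros((12, 12), dtype=bool)
mask[3:9, 3:9] = True   # building region
mask[0, 11] = True      # isolated false positive (e.g. a shadow fragment)

cleaned = clean_segmentation(mask)
```

The isolated pixel is erased by the erosion pass, while the dilation pass restores the building region to its original footprint; corner detection then runs on the contours of the cleaned regions.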

Sensors ◽  
2020 ◽  
Vol 20 (2) ◽  
pp. 563 ◽  
Author(s):  
Daliana Lobo Torres ◽  
Raul Queiroz Feitosa ◽  
Patrick Nigri Happ ◽  
Laura Elena Cué La Rosa ◽  
José Marcato Junior ◽  
...  

This study proposes and evaluates five deep fully convolutional networks (FCNs) for the semantic segmentation of a single tree species: SegNet, U-Net, FC-DenseNet, and two DeepLabv3+ variants. The performance of the FCN designs is evaluated experimentally in terms of classification accuracy and computational load. We also verify the benefits of fully connected conditional random fields (CRFs) as a post-processing step to improve the segmentation maps. The analysis is conducted on a set of images captured by an RGB camera aboard a UAV flying over an urban area. The dataset also contains a mask that indicates the occurrence of an endangered species called Dipteryx alata Vogel, also known as cumbaru, taken as the species to be identified. The experimental analysis shows the effectiveness of each design and reports average overall accuracy ranging from 88.9% to 96.7%, an F1-score between 87.0% and 96.1%, and IoU from 77.1% to 92.5%. We also find that the CRF consistently improves performance, but at a high computational cost.
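The overall-accuracy and IoU figures quoted above follow the standard definitions for binary masks; as a reminder, they can be computed as below (toy arrays for illustration, not the study's data):

```python
import numpy as np

def overall_accuracy(pred, ref):
    """Fraction of pixels whose predicted label matches the reference."""
    return np.mean(pred == ref)

def iou(pred, ref):
    """Intersection over union of two binary masks."""
    inter = np.sum(pred & ref)
    union = np.sum(pred | ref)
    return inter / union if union else 0.0

pred = np.array([True, True, False, False])
ref  = np.array([True, False, True, False])
```

For these toy masks, accuracy counts two matching pixels out of four, while IoU counts one overlapping positive out of three positives in the union.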


Electronics ◽  
2020 ◽  
Vol 9 (4) ◽  
pp. 583 ◽  
Author(s):  
Khang Nguyen ◽  
Nhut T. Huynh ◽  
Phat C. Nguyen ◽  
Khanh-Duy Nguyen ◽  
Nguyen D. Vo ◽  
...  

Unmanned aircraft systems, or drones, enable us to record or capture many scenes from a bird's-eye view, and they have been rapidly deployed to a wide range of practical domains, e.g., agriculture, aerial photography, fast delivery, and surveillance. Object detection is one of the core steps in understanding videos collected from drones. However, this task is very challenging due to the unconstrained viewpoints and low resolution of the captured videos. While modern deep-learning object detectors have recently achieved great success on general benchmarks, e.g., PASCAL VOC and MS COCO, the robustness of these detectors on aerial images captured by drones is not well studied. In this paper, we present an evaluation of state-of-the-art deep-learning detectors, including Faster R-CNN (Faster Region-based CNN), R-FCN (Region-based Fully Convolutional Networks), SNIPER (Scale Normalization for Image Pyramids with Efficient Resampling), Single-Shot Detector (SSD), YOLO (You Only Look Once), RetinaNet, and CenterNet, for object detection in videos captured by drones. We conduct experiments on the VisDrone2019 dataset, which contains 96 videos with 39,988 annotated frames, and provide insights into efficient object detectors for aerial images.
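Detector benchmarks of this kind match predictions to ground truth by bounding-box intersection over union (IoU); a minimal sketch of that matching criterion, with corner-format boxes chosen for illustration:

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

A prediction is typically counted as a true positive when its IoU with an unmatched ground-truth box exceeds a threshold such as 0.5; averaging precision over recall levels then yields the usual mAP score.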


2019 ◽  
Vol 11 (23) ◽  
pp. 2844 ◽  
Author(s):  
Ruoyun Liu ◽  
Monika Kuffer ◽  
Claudio Persello

Along with rapid urbanization, the growth and persistence of slums is a global challenge. While remote sensing imagery is increasingly used for producing slum maps, only a few studies have analyzed their temporal dynamics. This study explores the potential of fully convolutional networks (FCNs) to analyze the temporal dynamics of small clusters of temporary slums using very high resolution (VHR) imagery in Bangalore, India. The study develops two approaches based on FCNs. The first uses post-classification change detection, and the second trains FCNs to directly classify the dynamics of slums. For both approaches, networks with 3 × 3 kernels were compared against networks with 5 × 5 kernels. While classification results for individual years exhibit a relatively high average F1-score (3 × 3 kernel) of 88.4%, the change accuracies are lower: the post-classification results obtained an F1-score of 53.8% and the change-detection networks obtained an F1-score of 53.7%. According to the trajectory error matrix (TEM), the post-classification results scored higher for overall accuracy but lower for the accuracy difference of change trajectories than the change-detection networks. Although the two methods did not differ significantly in accuracy, the change-detection network was less noisy. Within our study area, the areas of slums show a small overall decrease; the annual growth of slums (between 2012 and 2016) was 7173 m2, against an annual decline of 8390 m2. However, these numbers hide the spatial dynamics, which were much larger. Interestingly, areas where slums disappeared commonly changed into green areas, not into built-up areas. The proposed change-detection network provides a robust map of the locations of changes, with lower confidence about the exact boundaries. This shows the potential of FCNs for detecting the dynamics of slums in VHR imagery.
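The post-classification idea is simple to state in code: classify each date independently, then flag the pixels whose label flips, and score the flagged change map with an F1-score. A minimal sketch with toy maps (not the study's data or networks):

```python
import numpy as np

def post_classification_change(slum_t1, slum_t2):
    """Post-classification change detection: compare two per-date
    classifications and flag pixels whose slum label flips."""
    return slum_t1 != slum_t2

def f1(pred, ref):
    """F1-score of a predicted binary change map against a reference."""
    tp = np.sum(pred & ref)
    fp = np.sum(pred & ~ref)
    fn = np.sum(~pred & ref)
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# Toy slum maps for two dates (True = slum pixel).
t1 = np.array([True, True, False, False])
t2 = np.array([True, False, True, False])
change = post_classification_change(t1, t2)
```

A weakness visible even in this sketch is that per-date classification errors compound in the difference map, which is one reason the study also trains networks to classify change directly.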


Author(s):  
Ceyda Nur Ozturk ◽  
Songul Albayrak

Corner points in three-dimensional (3-D) volumetric images can be detected more effectively by extending the Harris corner detection algorithm, which operates on two-dimensional (2-D) images, into the third dimension. In this study, the standard 2-D Harris algorithm, applied slice by slice, and its 3-D extension were implemented in scale-space to determine the corner points of volumetric object images. The results obtained on sample object images with the 2-D and 3-D methods, which used different approaches for scale-space construction, were assessed qualitatively.
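The core of the 3-D extension is replacing the 2 × 2 Harris structure tensor with a 3 × 3 one built from all pairwise gradient products. The sketch below is illustrative, not the authors' implementation: it scores each voxel by the tensor's smallest eigenvalue (a Shi-Tomasi-style cornerness measure; the classic Harris response det(M) − k·trace(M)² generalizes along the same lines), which is large only where intensity varies in all three directions:

```python
import numpy as np
from scipy import ndimage

def cornerness_3d(volume, sigma=1.0):
    """3-D structure-tensor cornerness: smooth the volume, take
    gradients along all three axes, form the windowed 3x3 tensor of
    gradient products per voxel, and score by its smallest eigenvalue."""
    g = np.gradient(ndimage.gaussian_filter(volume, sigma))
    M = np.empty(volume.shape + (3, 3))
    for i in range(3):
        for j in range(3):
            # Gaussian windowing of the gradient products, as in Harris
            M[..., i, j] = ndimage.gaussian_filter(g[i] * g[j], sigma)
    return np.linalg.eigvalsh(M)[..., 0]  # eigenvalues ascend; take min

# A bright cube in an empty volume: its vertices are 3-D corner points.
vol = np.zeros((16, 16, 16))
vol[4:12, 4:12, 4:12] = 1.0
score = cornerness_3d(vol)
```

At a cube vertex the windowed gradients span all three axes, so the smallest eigenvalue is clearly positive; at a face center the gradient is one-directional and the score stays near zero, as in flat background.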


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 1983
Author(s):  
Weipeng Shi ◽  
Wenhu Qin ◽  
Zhonghua Yun ◽  
Peng Ping ◽  
Kaiyang Wu ◽  
...  

It is essential for researchers to interpret remote sensing images (RSIs) properly and to label their component parts with precise semantics. Although FCN (Fully Convolutional Network)-like deep convolutional architectures have been widely applied in the perception of autonomous cars, two challenges remain in the semantic segmentation of RSIs. The first is to identify details in high-resolution images with complex scenes and to solve class-mismatch issues; the second is to capture object edges finely without being confused by the surroundings. HRNet maintains high-resolution representations by fusing feature information across parallel multi-resolution convolution branches. We adopt HRNet as a backbone and propose to incorporate the Class-Oriented Region Attention Module (CRAM) and Class-Oriented Context Fusion Module (CCFM) to analyze the relationships between classes and patch regions and between classes and local or global pixels, respectively. Thus, the model's perception of fine details in aerial images is enhanced. We leverage these modules to develop an end-to-end semantic segmentation model for aerial images and validate it on the ISPRS Potsdam and Vaihingen datasets. The experimental results show that our model improves on the baseline accuracy and outperforms several commonly used CNN architectures.
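The multi-resolution fusion idea behind the HRNet backbone can be sketched in a few lines of NumPy. This is a shape-level illustration only: real HRNet branches are learned feature maps, and its fusion uses learned 1 × 1 convolutions with summation rather than nearest-neighbor upsampling and concatenation:

```python
import numpy as np

def upsample_nearest(feat, factor):
    """(C, H, W) -> (C, H*factor, W*factor) by pixel repetition."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_branches(branches):
    """HRNet-style fusion sketch: bring every parallel branch up to the
    finest resolution, then stack along the channel axis so a
    segmentation head sees fine detail and coarse context together."""
    top_h = max(b.shape[1] for b in branches)
    aligned = [upsample_nearest(b, top_h // b.shape[1]) for b in branches]
    return np.concatenate(aligned, axis=0)

# Illustrative parallel branches at 1x, 1/2x, and 1/4x resolution.
branches = [np.random.rand(8, 16, 16),
            np.random.rand(16, 8, 8),
            np.random.rand(32, 4, 4)]
fused = fuse_branches(branches)
```

Keeping a full-resolution branch alive throughout, instead of recovering resolution only at the end as in encoder-decoder designs, is what lets the backbone preserve the fine edges that the abstract identifies as a key challenge.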


2011 ◽  
Vol 22 (8) ◽  
pp. 1897-1910 ◽  
Author(s):  
Yun LIU ◽  
Zhi-Ping CAI ◽  
Ping ZHONG ◽  
Jian-Ping YIN ◽  
Jie-Ren CHENG
