Patch Matching and Dense CRF-Based Co-Refinement for Building Change Detection from Bi-Temporal Aerial Images

Sensors ◽

10.3390/s19071557 ◽

2019 ◽

Vol 19 (7) ◽

pp. 1557 ◽

Cited By ~ 3

Author(s):

Jinqi Gong ◽

Xiangyun Hu ◽

Shiyan Pang ◽

Kun Li

Keyword(s):

Composite Structures ◽

Feature Detection ◽

Conditional Random Field ◽

Semantic Segmentation ◽

Aerial Images ◽

Feature Descriptor ◽

Phase Congruency ◽

Edge Information ◽

Urban Scenes ◽

Remotely Sensed Imagery

The identification and monitoring of buildings from remotely sensed imagery are of considerable value for tracking urbanization. Two outstanding issues in detecting changes in buildings with composite structures and relief displacements are heterogeneous appearances and positional inconsistencies. In this paper, a novel patch-based matching approach is developed using densely connected conditional random field (CRF) optimization to detect building changes from bi-temporal aerial images. First, the bi-temporal aerial images are combined to obtain change information using an object-oriented technique, and semantic segmentation based on a deep convolutional neural network is then used to extract building areas. With the change information and extracted buildings, a graph-cuts-based segmentation algorithm is applied to generate the bi-temporal changed building proposals. Next, within these proposals, corner and edge information are integrated for feature detection through a phase congruency (PC) model, and a structural feature descriptor, called the histogram of orientated PC, is used to perform patch-based roof matching. The final building changes are determined by combining the matched roofs and the bi-temporal changed building proposals through CRF-based co-refinement, and are further classified as “newly built”, “demolished”, or “changed”. Experiments were conducted with two typical datasets covering complex urban scenes with diverse building types. The results confirm the effectiveness and generality of the proposed algorithm, with more than 85% overall accuracy and more than 90% completeness.
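
The abstract does not state the rule that assigns the three change labels. As a minimal sketch only, the snippet below assumes one plausible reading: each proposal carries a flag for building presence at each date plus a flag saying whether patch-based roof matching succeeded. The function name `label_change`, its parameters, and the extra "unchanged" outcome are hypothetical and are not taken from the paper.

```python
def label_change(in_t1: bool, in_t2: bool, roof_matched: bool) -> str:
    """Assign a change label to one building proposal (assumed rule, not the paper's)."""
    # Building appears only in the later image.
    if in_t2 and not in_t1:
        return "newly built"
    # Building appears only in the earlier image.
    if in_t1 and not in_t2:
        return "demolished"
    # Building present at both dates but the roofs could not be matched.
    if in_t1 and in_t2 and not roof_matched:
        return "changed"
    # Matched roofs (or no building at either date) are treated as no change.
    return "unchanged"


if __name__ == "__main__":
    print(label_change(False, True, False))
    print(label_change(True, True, False))
```

Under this assumption, a successful roof match is what separates a real change from an apparent one caused by relief displacement.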

Multi-Object Segmentation in Complex Urban Scenes from High-Resolution Remote Sensing Data

Remote Sensing ◽

10.3390/rs13183710 ◽

2021 ◽

Vol 13 (18) ◽

pp. 3710

Author(s):

Abolfazl Abdollahi ◽

Biswajeet Pradhan ◽

Nagesh Shukla ◽

Subrata Chakraborty ◽

Abdullah Alamri

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Urban Areas ◽

Object Segmentation ◽

Remote Sensing Data ◽

Semantic Segmentation ◽

Aerial Images ◽

Urban Scenes ◽

Sensing Data ◽

Boundary Information

Automatic extraction of terrestrial features, such as roads and buildings, from aerial images has many uses in a wide range of fields, including disaster management, change detection, land cover assessment, and urban planning. The task is difficult in complex scenes, such as urban scenes, where buildings and roads are surrounded by shadows, vehicles, trees, etc., and appear in heterogeneous forms with lower inter-class and higher intra-class contrasts. Moreover, manual extraction by human specialists is time-consuming and expensive. Deep convolutional models have shown considerable performance for feature segmentation from remote sensing data in recent years. However, most of these techniques still cannot detect roads and buildings well where there are large, continuous areas of obstruction. Hence, the principal goal of this work is to introduce two novel deep convolutional models based on the UNet family for multi-object segmentation of roads and buildings from aerial imagery. We focused on buildings and road networks because these objects constitute a large part of urban areas. The presented models are called multi-level context gating UNet (MCG-UNet) and bi-directional ConvLSTM UNet (BCL-UNet). The proposed methods combine the advantages of the UNet model with densely connected convolutions, bi-directional ConvLSTM, and a squeeze-and-excitation module to produce high-resolution segmentation maps and preserve boundary information even under complicated backgrounds. Additionally, we implemented a simple and efficient loss function called boundary-aware loss (BAL) that allows a network to concentrate on hard semantic segmentation regions, such as overlapping areas, small objects, sophisticated objects, and object boundaries, and to produce high-quality segmentation maps. The presented networks were tested on the Massachusetts building and road datasets. The MCG-UNet improved the average F1 accuracy by 1.85% and 1.19%, and by 6.67% and 5.11%, compared with UNet and BCL-UNet for road and building extraction, respectively. Additionally, the presented MCG-UNet and BCL-UNet networks were compared with other state-of-the-art deep learning-based networks, and the results demonstrated the superiority of the proposed networks in multi-object segmentation tasks.
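
The squeeze-and-excitation module mentioned above can be illustrated in a few lines. This is a generic sketch of the standard squeeze, excite, and scale steps in plain Python, not the configuration used in MCG-UNet or BCL-UNet. The function name `squeeze_excite`, the weight layout, and the omission of biases are assumptions made for illustration.

```python
import math


def squeeze_excite(feature_map, w_reduce, w_expand):
    """Generic squeeze-and-excitation over a C x H x W nested list.

    w_reduce has shape R x C and w_expand has shape C x R (biases omitted).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer with ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w_reduce
    ]
    # Excitation, step 2: expand back to C channels and apply a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(weights, hidden))))
        for weights in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    return [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]


if __name__ == "__main__":
    fm = [[[1.0, 3.0]], [[2.0, 2.0]]]  # two channels, each 1 x 2
    print(squeeze_excite(fm, [[0.5, 0.5]], [[0.0], [1.0]]))
```

The gates lie in (0, 1), so the module can only attenuate channels relative to one another; that is how it emphasises informative feature maps.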

A Robust Algorithm Based on Phase Congruency for Optical and SAR Image Registration in Suburban Areas

Remote Sensing ◽

10.3390/rs12203339 ◽

2020 ◽

Vol 12 (20) ◽

pp. 3339

Author(s):

Lina Wang ◽

Mingchao Sun ◽

Jinghong Liu ◽

Lihua Cao ◽

Guoqing Ma

Keyword(s):

Image Registration ◽

Feature Detection ◽

Speckle Noise ◽

Feature Descriptor ◽

Phase Congruency ◽

SAR Images ◽

Robust Algorithm ◽

Multi Scale ◽

Edge Points ◽

Index Maps

Automatic registration of optical and synthetic aperture radar (SAR) images is a challenging task due to the influence of SAR speckle noise and nonlinear radiometric differences. This study proposes a robust algorithm based on phase congruency to register optical and SAR images (ROS-PC). It consists of a uniform Harris feature detection method based on multi-moment of the phase congruency map (UMPC-Harris) and a local feature descriptor based on the histogram of phase congruency orientation on multi-scale max amplitude index maps (HOSMI). The UMPC-Harris detects corners and edge points based on a voting strategy, the multi-moment of phase congruency maps, and an overlapping block strategy, which is used to detect stable and uniformly distributed keypoints. Subsequently, HOSMI is derived for a keypoint by utilizing the histogram of phase congruency orientation on multi-scale max amplitude index maps, which effectively increases the discriminability and robustness of the final descriptor. Finally, experimental results obtained using simulated images show that the UMPC-Harris detector has a superior repeatability rate. The image registration results obtained on test images show that the ROS-PC is robust against SAR speckle noise and nonlinear radiometric differences. The ROS-PC can tolerate some rotational and scale changes.

Cross-Domain Semantic Segmentation of Urban Scenes via Multi-Level Feature Alignment

2020 25th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr48806.2021.9411915 ◽

2021 ◽

Author(s):

Bin Zhang ◽

Shengjie Zhao ◽

Rongqing Zhang

Keyword(s):

Semantic Segmentation ◽

Urban Scenes ◽

Cross Domain ◽

Multi Level ◽

Feature Alignment

A Novel Focal Phi Loss for Power Line Segmentation with Auxiliary Classifier U-Net

Sensors ◽

10.3390/s21082803 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2803

Author(s):

Rabeea Jaffari ◽

Manzoor Ahmed Hashmani ◽

Constantino Carlos Reyes-Aldasoro

Keyword(s):

Loss Function ◽

Class Imbalance ◽

Power Line ◽

Aerial Images ◽

Class Imbalance Problem ◽

Trade Off ◽

Urban Scenes ◽

Imbalance Problem ◽

A Minor ◽

Evaluation Parameters

The segmentation of power lines (PLs) from aerial images is a crucial task for the safe navigation of unmanned aerial vehicles (UAVs) operating at low altitudes. Despite the advances in deep learning-based approaches for PL segmentation, these models are still vulnerable to the class imbalance present in the data: PLs occupy only a minimal portion (1–5%) of an aerial image compared with the background region (95–99%). Generally, this class imbalance problem is addressed by using PL-specific detectors in conjunction with the popular class balanced cross entropy (BBCE) loss function. However, these PL-specific detectors do not work outside their application areas, and the BBCE loss requires hyperparameter tuning of class-wise weights, which is not trivial. Moreover, the BBCE loss results in low dice scores and precision values and thus fails to achieve an optimal trade-off between dice scores, model accuracy, and precision–recall values. In this work, we propose a generalized focal loss function based on the Matthews correlation coefficient (MCC), also known as the Phi coefficient, to address the class imbalance problem in PL segmentation while using a generic deep segmentation architecture. We evaluate our loss function by improving the vanilla U-Net model with an additional convolutional auxiliary classifier head (ACU-Net) for better learning and faster model convergence. The evaluation on two PL datasets, namely the Mendeley Power Line Dataset and the Power Line Dataset of Urban Scenes (PLDU), where PLs occupy around 1% and 2% of the aerial image area, respectively, reveals that our proposed loss function outperforms the popular BBCE loss by 16% in PL dice scores on both datasets, by 19% in precision and false detection rate (FDR) values on the Mendeley PL dataset, and by 15% in precision and FDR values on the PLDU, with a minor degradation in accuracy and recall values. Moreover, our proposed ACU-Net outperforms the baseline vanilla U-Net on the characteristic evaluation parameters by 1–10% for both PL datasets. Thus, our proposed loss function with ACU-Net achieves an optimal trade-off for the characteristic evaluation parameters without any bells and whistles. Our code is available on GitHub.
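
To make the idea of an MCC-based loss concrete, here is a minimal sketch. It computes a soft Matthews correlation coefficient from predicted foreground probabilities and binary labels, then raises (1 - MCC) to a focusing exponent. The exact form of the published loss is not given in the abstract, so the exponent form, the default `gamma`, the `eps` stabiliser, and both function names are assumptions.

```python
import math


def soft_mcc(probs, labels, eps=1e-7):
    """Soft MCC from foreground probabilities and 0/1 labels."""
    # Soft confusion-matrix entries: probabilities stand in for hard decisions.
    tp = sum(p * y for p, y in zip(probs, labels))
    tn = sum((1 - p) * (1 - y) for p, y in zip(probs, labels))
    fp = sum(p * (1 - y) for p, y in zip(probs, labels))
    fn = sum((1 - p) * y for p, y in zip(probs, labels))
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # eps avoids division by zero when a row or column of the matrix is empty.
    return numerator / (denominator + eps)


def focal_mcc_loss(probs, labels, gamma=1.5):
    """Assumed focal form: (1 - MCC) ** gamma, which is 0 for a perfect prediction."""
    return (1.0 - soft_mcc(probs, labels)) ** gamma


if __name__ == "__main__":
    labels = [1, 0, 0, 0]  # one power-line pixel among background pixels
    print(focal_mcc_loss([0.9, 0.1, 0.2, 0.1], labels))
```

Because MCC uses all four confusion-matrix entries, predicting background everywhere scores close to zero correlation rather than high accuracy, which is why it suits heavily imbalanced masks.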

Orchard Mapping with Deep Learning Semantic Segmentation

Sensors ◽

10.3390/s21113813 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3813

Author(s):

Athanasios Anagnostis ◽

Aristotelis C. Tagarakis ◽

Dimitrios Kateris ◽

Vasileios Moysiadis ◽

Claus Grøn Sørensen ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Semantic Segmentation ◽

Automated Detection ◽

Aerial Images ◽

Training Dataset ◽

Field Boundary ◽

Different Seasons ◽

Detection And Localization ◽

Different Levels

This study proposes an approach for orchard tree segmentation from aerial images based on a deep learning convolutional neural network variant, namely the U-net network. The purpose was the automated detection and localization of the canopy of orchard trees under various conditions (i.e., different seasons, different tree ages, different levels of weed coverage). The dataset was composed of images from three different walnut orchards, and its variability yielded images falling under seven different use cases. The best-trained model achieved 91%, 90%, and 87% accuracy for training, validation, and testing, respectively. The trained model was also tested on never-before-seen orthomosaic images or orchards using two methods (oversampling and undersampling) in order to tackle issues with transparent pixels lying outside the field boundary of the image. Even though the training dataset did not contain orthomosaic images, the model achieved performance levels of up to 99%, demonstrating the robustness of the proposed approach.

Understanding Rooftop PV Panel Semantic Segmentation of Satellite and Aerial Images for Better Using Machine Learning

Advances in Applied Energy ◽

10.1016/j.adapen.2021.100057 ◽

2021 ◽

pp. 100057

Author(s):

Peiran Li ◽

Haoran Zhang ◽

Zhiling Guo ◽

Suxing Lyu ◽

Jinyu Chen ◽

...

Keyword(s):

Machine Learning ◽

Semantic Segmentation ◽

Aerial Images ◽

PV Panel

Hybrid Multiple Attention Network for Semantic Segmentation in Aerial Images

IEEE Transactions on Geoscience and Remote Sensing ◽

10.1109/tgrs.2021.3065112 ◽

2021 ◽

pp. 1-18

Author(s):

Ruigang Niu ◽

Xian Sun ◽

Yu Tian ◽

Wenhui Diao ◽

Kaiqiang Chen ◽

...

Keyword(s):

Semantic Segmentation ◽

Aerial Images ◽

Attention Network

Towards Scalable Economic Photovoltaic Potential Analysis Using Aerial Images and Deep Learning

Energies ◽

10.3390/en14133800 ◽

2021 ◽

Vol 14 (13) ◽

pp. 3800

Author(s):

Sebastian Krapf ◽

Nils Kemmerzell ◽

Syed Khawaja Haseeb Uddin ◽

Manuel Hack Vázquez ◽

Fabian Netzler ◽

...

Keyword(s):

Deep Learning ◽

System Analysis ◽

State Of The Art ◽

Critical Role ◽

Semantic Segmentation ◽

Energy System ◽

Aerial Images ◽

Potential Analysis ◽

3D Data ◽

Challenges And Opportunities

Roof-mounted photovoltaic systems play a critical role in the global transition to renewable energy generation. An analysis of roof photovoltaic potential is an important tool for supporting decision-making and for accelerating new installations. State-of-the-art approaches use 3D data to conduct potential analyses with high spatial resolution, which limits the study area to places where 3D data are available. Recent advances in deep learning allow the required roof information to be extracted from aerial images. Furthermore, most publications consider the technical photovoltaic potential, and only a few determine the economic photovoltaic potential. Therefore, this paper extends the state of the art by proposing and applying a methodology for scalable economic photovoltaic potential analysis using aerial images and deep learning. Two convolutional neural networks are trained for semantic segmentation of roof segments and superstructures and achieve Intersection over Union values of 0.84 and 0.64, respectively. We calculated the internal rate of return of each roof segment for 71 buildings in a small study area. This paper’s methodology is compared with a 3D-based analysis, and its benefits and disadvantages are discussed. The proposed methodology uses only publicly available data and is potentially scalable to the global level. However, this poses a variety of research challenges and opportunities, which are summarized with a focus on the application of deep learning, economic photovoltaic potential analysis, and energy system analysis.
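
The internal rate of return used as the economic indicator above is the discount rate at which the net present value of a cash-flow series is zero. The sketch below finds it by bisection. It is a generic illustration, not the paper's economic model; the function names, the search interval, and the cash-flow figures in the usage example are all hypothetical.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs now, cash_flows[t] after t years."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, lo=-0.99, hi=1.0, tol=1e-9):
    """Rate in [lo, hi] where NPV crosses zero, found by bisection."""
    f_lo, f_hi = npv(lo, cash_flows), npv(hi, cash_flows)
    if f_lo * f_hi > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = npv(mid, cash_flows)
        if f_lo * f_mid <= 0:
            hi = mid  # the root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # the root lies in the upper half
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical roof segment: 1000 invested now, 150 returned yearly for 10 years.
    flows = [-1000.0] + [150.0] * 10
    print(f"IRR: {irr(flows):.4f}")
```

Bisection is used here because it needs no derivative and always converges once a sign change is bracketed, at the cost of assuming a single root inside the interval.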

RSNet: Rail semantic segmentation network for extracting aerial railroad images

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210349 ◽

2021 ◽

pp. 1-18

Author(s):

R.S. Rampriya ◽

Sabarinathan ◽

R. Suganya

Keyword(s):

Real Time ◽

Visual Processing ◽

Feature Fusion ◽

Semantic Segmentation ◽

Vital Role ◽

Obstacle Detection ◽

Aerial Images ◽

Computationally Efficient ◽

Fusion Algorithm ◽

UAV Images

In the near future, the combination of unmanned aerial vehicles (UAVs) and computer vision will play a vital role in periodically monitoring the condition of railroads to ensure passenger safety. The most significant module in railroad visual processing is obstacle detection, where the concern is an obstacle that has fallen near the track, inside or outside the gage. This makes it important to detect and segment the railroad into three key regions: gage inside, rails, and background. Traditional railroad segmentation methods depend on either manual feature selection or expensive dedicated devices such as Lidar, which are typically less reliable in railroad semantic segmentation. Also, although cameras mounted on moving vehicles such as drones can produce high-resolution images, segmenting precise pixel information from those aerial images has been challenging because of the cluttered railroad surroundings. RSNet is a multi-level feature fusion algorithm for segmenting railroad aerial images captured by UAV. It proposes an attention-based efficient convolutional encoder for feature extraction, which is robust and computationally efficient, and a modified residual decoder for segmentation, which considers only essential features and produces less overhead with higher performance, even on real-time railroad drone imagery. The network is trained and tested on a railroad scenic view segmentation dataset (RSSD), which we built from real-time UAV images, and achieves a dice coefficient of 0.973 and a jaccard index of 0.94 on test data, better than existing approaches such as a residual unit and residual squeeze net.
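
For readers unfamiliar with the two reported metrics, the sketch below computes the Dice coefficient and the Jaccard index for a pair of flattened binary masks. It is a generic illustration: the function name and the convention of returning 1.0 for two empty masks are assumptions, and the abstract does not say how the scores were averaged over classes or images.

```python
def dice_and_jaccard(pred, target):
    """Dice coefficient and Jaccard index for two flat binary masks of equal length."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty: treated here as perfect agreement (assumed convention).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_area + target_area)
    jaccard = intersection / union
    return dice, jaccard


if __name__ == "__main__":
    dice, jaccard = dice_and_jaccard([1, 1, 0, 0], [1, 0, 1, 0])
    print(dice, jaccard)
```

For a single pair of masks the two scores are tied by J = D / (2 - D), so Dice is never lower than Jaccard.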

MTI-YOLO: A Light-Weight and Real-Time Deep Neural Network for Insulator Detection in Complex Aerial Images

Energies ◽

10.3390/en14051426 ◽

2021 ◽

Vol 14 (5) ◽

pp. 1426

Author(s):

Chuanyang Liu ◽

Yiquan Wu ◽

Jingjing Liu ◽

Jiaming Han

Keyword(s):

Feature Detection ◽

Feature Fusion ◽

Memory Storage ◽

Aerial Images ◽

Detection Accuracy ◽

Composite Insulator ◽

Running Time ◽

Scale Feature ◽

Multi Scale ◽

Good Trade

Insulator detection is an essential task for the safe and reliable operation of intelligent grids. Because insulator images contain various background interferences, most traditional image-processing methods cannot achieve good performance. Some You Only Look Once (YOLO) networks are employed to meet the requirements of actual applications for insulator detection. To achieve a good trade-off among accuracy, running time, and memory storage, this work proposes the modified YOLO-tiny for insulator (MTI-YOLO) network for insulator detection in complex aerial images. First, composite insulator images are collected in common scenes and the “CCIN_detection” (Chinese Composite INsulator) dataset is constructed. Second, to improve the detection accuracy for insulators of different sizes, multi-scale feature detection headers, a multi-scale feature fusion structure, and the spatial pyramid pooling (SPP) model are adopted in the MTI-YOLO network. Finally, the proposed MTI-YOLO network and the compared networks are trained and tested on the “CCIN_detection” dataset. The average precision (AP) of the proposed network is 17% and 9% higher than that of YOLO-tiny and YOLO-v2, respectively. Compared with YOLO-tiny and YOLO-v2, the running time of the proposed network is slightly higher. Furthermore, the memory usage of the proposed network is 25.6% and 38.9% lower than that of YOLO-v2 and YOLO-v3, respectively. Experimental results and analysis validate that the proposed network achieves good performance in both complex backgrounds and bright illumination conditions.
