Improved Anchor-Free Instance Segmentation for Building Extraction from High-Resolution Remote Sensing Images

Tong Wu; Yuan Hu; Ling Peng; Ruonan Chen

doi:10.3390/rs12182910

Remote Sensing ◽

10.3390/rs12182910 ◽

2020 ◽

Vol 12 (18) ◽

pp. 2910

Author(s):

Tong Wu ◽

Yuan Hu ◽

Ling Peng ◽

Ruonan Chen

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Semantic Segmentation ◽

Aerial Images ◽

Building Extraction ◽

Remote Sensing Images ◽

Highly Sensitive ◽

Segmentation Methods ◽

Speed And Accuracy ◽

Instance Segmentation

Building extraction from high-resolution remote sensing images plays a vital part in urban planning, safety supervision, geographic database updates, and other applications. Several studies have used convolutional neural networks (CNNs) to extract buildings from high-resolution satellite and aerial images. There are two major approaches. The first is CNN-based semantic segmentation, which cannot distinguish different objects of the same category and may merge the edges of adjacent buildings. The second is CNN-based instance segmentation, which relies heavily on pre-defined anchors and therefore suffers from high sensitivity to the anchor settings, high computation and storage costs, and an imbalance between positive and negative samples. In this paper, we therefore propose an improved anchor-free instance segmentation method based on CenterMask, with spatial and channel attention-guided mechanisms and an improved, effective backbone network, for accurate extraction of buildings in high-resolution remote sensing images. We then analyze the influence of different parameters and network structures on model performance, and compare the building extraction performance of Mask R-CNN, Mask Scoring R-CNN, CenterMask, and the improved CenterMask proposed here. Experimental results show that our improved CenterMask method achieves a well-balanced trade-off between speed and accuracy, reaching state-of-the-art performance at real-time speed.
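The "edge connection" problem mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's method: it assumes a tiny hand-made binary mask and uses plain 4-connected component labelling (the function name `label_components` and the grids are hypothetical) to show that two adjacent buildings collapse into one region in a semantic mask, while per-instance masks keep them apart.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground regions in a binary grid (list of lists of 0/1)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 1 and labels[r][c] == 0:
                # Start a new region and flood-fill it breadth-first.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] == 1 and labels[ny][nx] == 0:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Toy scene (assumption): two buildings that share a wall, columns 0-1 and 2-3.
semantic_mask = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
# The same scene as an instance segmentation output: one mask per building.
instance_masks = [
    [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]],
]

_, buildings_from_semantic = label_components(semantic_mask)
buildings_from_instances = len(instance_masks)
print(buildings_from_semantic, buildings_from_instances)
```

Here the semantic mask yields a single connected region, so counting buildings from it undercounts, whereas the instance output preserves both objects.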

Self-Attention in Reconstruction Bias U-Net for Semantic Segmentation of Building Rooftops in Optical Remote Sensing Images

Remote Sensing ◽

10.3390/rs13132524 ◽

2021 ◽

Vol 13 (13) ◽

pp. 2524

Author(s):

Ziyi Chen ◽

Dilong Li ◽

Wentao Fan ◽

Haiyan Guan ◽

Cheng Wang ◽

...

Keyword(s):

Remote Sensing ◽

Deep Learning ◽

Semantic Segmentation ◽

Extraction Methods ◽

The Self ◽

Optical Remote Sensing ◽

Building Extraction ◽

Learning Models ◽

Remote Sensing Images ◽

Segmentation Methods

Deep learning models have brought major breakthroughs in building extraction from high-resolution optical remote sensing images. In recent research, the self-attention module has attracted wide attention in many fields, including building extraction. However, most current deep learning models equipped with a self-attention module still overlook the effectiveness of reconstruction bias. By tipping the balance between encoding and decoding capacity, i.e., making the decoding network much more complex than the encoding network, the semantic segmentation ability is reinforced. To address the lack of research on combining self-attention and reconstruction-bias modules for building extraction, this paper presents a U-Net architecture that combines both. In the encoding part, a self-attention module is added to learn the attention weights of the inputs. Through the self-attention module, the network pays more attention to positions where salient regions may be located. In the decoding part, multiple large convolutional up-sampling operations are used to increase the reconstruction ability. We test our model on two openly available datasets: the WHU and Massachusetts Building datasets. We achieve IoU scores of 89.39% and 73.49% on the WHU and Massachusetts Building datasets, respectively. Compared with several recent well-known semantic segmentation methods and representative building extraction methods, our method's results are satisfactory.
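The IoU scores reported above measure the overlap between a predicted building mask and the ground truth. As a minimal sketch of the standard definition (intersection divided by union of foreground pixels), assuming binary masks stored as nested lists, the function below is illustrative only; the name `binary_iou` and the convention of returning 1.0 for two empty masks are assumptions, not taken from the paper.

```python
def binary_iou(pred, target):
    """Intersection over union of two binary masks given as lists of lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy masks (assumption): 2 pixels overlap, 4 pixels are foreground in either mask.
pred = [
    [1, 1, 0],
    [0, 1, 0],
]
target = [
    [1, 0, 0],
    [0, 1, 1],
]
print(binary_iou(pred, target))  # 2 / 4 = 0.5
```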

HQ-ISNet: High-Quality Instance Segmentation for Remote Sensing Imagery

Remote Sensing ◽

10.3390/rs12060989 ◽

2020 ◽

Vol 12 (6) ◽

pp. 989 ◽

Cited By ~ 1

Author(s):

Hao Su ◽

Shunjun Wei ◽

Shan Liu ◽

Jiadian Liang ◽

Chen Wang ◽

...

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Object Detection ◽

Prediction Accuracy ◽

Semantic Segmentation ◽

Remote Sensing Images ◽

Feature Maps ◽

High Quality ◽

Remote Sensing Imagery ◽

Instance Segmentation

Instance segmentation in high-resolution (HR) remote sensing imagery is one of the most challenging tasks and is more difficult than object detection and semantic segmentation. It aims to predict class labels and pixel-wise instance masks to locate instances in an image. However, few methods are currently suitable for instance segmentation in HR remote sensing images, and the complex backgrounds of such images make the task even harder. In this article, a novel instance segmentation approach for HR remote sensing imagery based on Cascade Mask R-CNN is proposed, called the high-quality instance segmentation network (HQ-ISNet). In this scheme, HQ-ISNet uses an HR feature pyramid network (HRFPN) to fully exploit multi-level feature maps and maintain HR feature maps for instance segmentation of remote sensing images. Next, to refine the mask information flow between mask branches, the instance segmentation network version 2 (ISNetV2) is proposed to further improve mask prediction accuracy. Then, we construct a new, more challenging dataset based on the synthetic aperture radar (SAR) ship detection dataset (SSDD) and the Northwestern Polytechnical University very-high-resolution 10-class geospatial object detection dataset (NWPU VHR-10), which can be used as a benchmark for evaluating instance segmentation algorithms on high-resolution remote sensing images. Finally, extensive experimental analyses and comparisons on the SSDD and NWPU VHR-10 datasets show that (1) the HRFPN makes the predicted instance masks more accurate, which effectively enhances instance segmentation performance on high-resolution remote sensing imagery; (2) the ISNetV2 is effective and further improves mask prediction accuracy; and (3) our proposed framework, HQ-ISNet, is effective and more accurate for instance segmentation in remote sensing imagery than existing algorithms.

Building Extraction and Number Statistics in WUI Areas Based on UNet Structure and Ensemble Learning

Remote Sensing ◽

10.3390/rs13061172 ◽

2021 ◽

Vol 13 (6) ◽

pp. 1172

Author(s):

De-Yue Chen ◽

Ling Peng ◽

Wei-Chao Li ◽

Yin-Da Wang

Keyword(s):

Remote Sensing ◽

Deep Learning ◽

High Resolution ◽

Ensemble Learning ◽

Semantic Segmentation ◽

Regional Governance ◽

Building Extraction ◽

Contour Extraction ◽

Remote Sensing Images ◽

Building Information

With the advance of urbanization, management problems in the wildland–urban interface (WUI) have become increasingly serious. WUI regional governance involves many factors, including climate and the humanities, and has attracted attention and research from all walks of life. Building research plays a vital part in the WUI area. Building location is closely related to the planning and management of the WUI area, and the number of buildings is related to rescue arrangements. There are two major ways to obtain this building information: one is to obtain it from relevant agencies, which is slow and lacks timeliness; the other is to extract it from high-resolution remote sensing images, which is relatively inexpensive and more timely. Inspired by the recent success of deep learning, in this paper we propose a deep learning method for extracting building information from high-resolution remote sensing images, combined with ensemble learning to extract building locations. Further, we use the idea of image anomaly detection to estimate the number of buildings. After verification on two datasets, we obtain superior semantic segmentation results and achieve better building contour extraction and number estimation.

Mapping Plastic Mulched Farmland for High Resolution Images of Unmanned Aerial Vehicle Using Deep Semantic Segmentation

Remote Sensing ◽

10.3390/rs11172008 ◽

2019 ◽

Vol 11 (17) ◽

pp. 2008 ◽

Cited By ~ 4

Author(s):

Qinchen Yang ◽

Man Liu ◽

Zhitao Zhang ◽

Shuqin Yang ◽

Jifeng Ning ◽

...

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Unmanned Aerial Vehicle ◽

Semantic Segmentation ◽

Classification Method ◽

Remote Sensing Images ◽

Segmentation Methods ◽

Traditional Classification ◽

Aerial Vehicle ◽

Segmentation Models

With increasing consumption, plastic mulch benefits agriculture by promoting crop quality and yield, but the resulting environmental and soil pollution is becoming increasingly serious. Therefore, research on the monitoring of plastic mulched farmland (PMF) has received increasing attention. Because of their high resolution, unmanned aerial vehicle (UAV) remote sensing images show PMF with a prominent spatial pattern, which brings difficulties to the task of monitoring PMF. In this paper, through a comparison between two deep semantic segmentation methods, SegNet and fully convolutional networks (FCN), and a traditional classification method, Support Vector Machine (SVM), we propose an end-to-end deep learning method aimed at accurately recognizing PMF in UAV remote sensing images from the Hetao Irrigation District, Inner Mongolia, China. After experiments with single-band, three-band and six-band image data, we found that deep semantic segmentation models built on single-band data, which use only the texture pattern of PMF, can identify it well; for example, SegNet reaches its highest accuracy of 88.68% in the 900 nm band. Furthermore, with three-band data (three visible bands) and six-band data (three visible bands and three near-infrared bands), deep semantic segmentation models that combine texture and spectral features further improve the accuracy of PMF identification, with six-band data giving the best performance for both FCN and SegNet. In addition, the deep semantic segmentation methods, FCN and SegNet, clearly outperform the traditional SVM method in precision and speed, owing to their strong feature extraction capability and direct pixel classification. Among the three classification methods, the SegNet model built on three-band and six-band data obtains the best average accuracy of 89.62% and 90.6%, respectively. Therefore, the proposed deep semantic segmentation model, when tested against the traditional classification method, provides a promising path for mapping PMF in UAV remote sensing images.

Conditional Generative Adversarial Network-Based Training Sample Set Improvement Model for the Semantic Segmentation of High-Resolution Remote Sensing Images

IEEE Transactions on Geoscience and Remote Sensing ◽

10.1109/tgrs.2020.3033816 ◽

2020 ◽

pp. 1-17

Author(s):

Xin Pan ◽

Jian Zhao ◽

Jun Xu

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Semantic Segmentation ◽

Training Sample ◽

Remote Sensing Images ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Sample Set

Semantic Segmentation of High Resolution Remote Sensing Images with Extra Context Attention Mechanism

2020 IEEE 20th International Conference on Communication Technology (ICCT) ◽

10.1109/icct50939.2020.9295814 ◽

2020 ◽

Author(s):

Weifu Fu ◽

Qing Peng ◽

Yanxiang Gong ◽

Mei Xie ◽

Shicheng Wang ◽

...

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Semantic Segmentation ◽

Attention Mechanism ◽

Remote Sensing Images

HRCNet: High-Resolution Context Extraction Network for Semantic Segmentation of Remote Sensing Images

Remote Sensing ◽

10.3390/rs13010071 ◽

2020 ◽

Vol 13 (1) ◽

pp. 71

Author(s):

Zhiyong Xu ◽

Weicun Zhang ◽

Tianxiang Zhang ◽

Jiangyun Li

Keyword(s):

Remote Sensing ◽

Feature Extraction ◽

High Resolution ◽

Spatial Information ◽

Semantic Segmentation ◽

Context Information ◽

Remote Sensing Images ◽

Global Context ◽

Boundary Information ◽

Extraction Stage

Semantic segmentation is an important method in remote sensing image (RSI) processing and has been widely used in various applications. Conventional convolutional neural network (CNN)-based semantic segmentation methods are likely to lose spatial information in the feature extraction stage and usually pay little attention to global context information. Moreover, RSIs also exhibit imbalanced category scales and uncertain boundary information, which makes the semantic segmentation task more challenging. To overcome these problems, a high-resolution context extraction network (HRCNet) based on a high-resolution network (HRNet) is proposed in this paper. In this approach, the HRNet structure is adopted to preserve spatial information. Moreover, a light-weight dual attention (LDA) module is designed to obtain global context information in the feature extraction stage, and a feature enhancement feature pyramid (FEFP) structure is put forward and employed to fuse contextual information at different scales. In addition, to capture boundary information, we design a boundary aware (BA) module combined with a boundary aware loss (BAloss) function. Experimental results on the Potsdam and Vaihingen datasets show that the proposed approach significantly improves boundary and segmentation performance, reaching overall accuracy scores of 92.0% and 92.3%, respectively. Consequently, it is envisaged that the proposed HRCNet model will be advantageous for remote sensing image segmentation.
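Overall accuracy, the metric quoted for the Potsdam and Vaihingen results, is the fraction of pixels whose predicted class matches the reference class. The following sketch computes it from a confusion matrix; it assumes flattened integer class labels, and the helper names `confusion_matrix` and `overall_accuracy` as well as the toy labels are hypothetical, not taken from the paper or from any benchmark toolkit.

```python
def confusion_matrix(true_labels, pred_labels, num_classes):
    """Count pixels per (true class, predicted class) pair."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, pred_labels):
        matrix[t][p] += 1
    return matrix


def overall_accuracy(matrix):
    """Correctly classified pixels (the diagonal) divided by all pixels."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Toy flattened label maps (assumption): 8 pixels, 3 classes, 6 predicted correctly.
true_labels = [0, 0, 1, 1, 2, 2, 2, 1]
pred_labels = [0, 1, 1, 1, 2, 2, 0, 1]

matrix = confusion_matrix(true_labels, pred_labels, num_classes=3)
print(overall_accuracy(matrix))  # 6 / 8 = 0.75
```

Because overall accuracy pools all pixels, large classes dominate it, which is one reason the category-scale imbalance noted in the abstract matters when interpreting the score.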

Efficient Patch-Wise Semantic Segmentation for Large-Scale Remote Sensing Images

Sensors ◽

10.3390/s18103232 ◽

2018 ◽

Vol 18 (10) ◽

pp. 3232 ◽

Cited By ~ 17

Author(s):

Yan Liu ◽

Qirui Ren ◽

Jiahui Geng ◽

Meng Ding ◽

Jiangyun Li

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Large Scale ◽

Semantic Segmentation ◽

Remote Sensing Image ◽

Training Data ◽

Land Resources ◽

Remote Sensing Images ◽

Training Strategy ◽

The Impact

Efficient and accurate semantic segmentation is a key technique for automatic remote sensing image analysis. While there have been many segmentation methods based on traditional hand-crafted feature extractors, it is still challenging to process high-resolution and large-scale remote sensing images. In this work, a novel patch-wise semantic segmentation method with a new training strategy based on fully convolutional networks is presented to segment common land resources. First, to handle high-resolution images, the images are split into local patches and a patch-wise network is built. Second, the training data are preprocessed in several ways to address the specific characteristics of remote sensing images, i.e., color imbalance, object rotation variations and lens distortion. Third, a multi-scale training strategy is developed to solve the severe scale variation problem. In addition, the impact of a conditional random field (CRF) is studied to improve the precision. The proposed method was evaluated on a dataset collected from a capital city in West China with the Gaofen-2 satellite. The dataset contains ten common land resources (Grassland, Road, etc.). The experimental results show that the proposed algorithm achieves 54.96% in terms of mean intersection over union (MIoU) and outperforms other state-of-the-art methods in remote sensing image segmentation.
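The first step described above, splitting a large image into local patches, can be sketched as a sliding window. The abstract does not state the patch size, stride, or how image borders are handled, so everything in this example is an assumption: the function name `patch_windows` is hypothetical, and the last window in each direction is shifted back so that it ends exactly at the image edge.

```python
def patch_windows(height, width, patch, stride):
    """Return (top, left, bottom, right) windows that cover a height x width image."""

    def starts(size):
        # Images smaller than one patch get a single window at the origin.
        if size <= patch:
            return [0]
        positions = list(range(0, size - patch + 1, stride))
        # Assumed border policy: add a final window flush with the image edge.
        if positions[-1] != size - patch:
            positions.append(size - patch)
        return positions

    return [
        (top, left, min(top + patch, height), min(left + patch, width))
        for top in starts(height)
        for left in starts(width)
    ]


# Toy example (assumption): a 5 x 7 image, 4 x 4 patches, stride 3.
windows = patch_windows(height=5, width=7, patch=4, stride=3)
for window in windows:
    print(window)
```

With these toy settings the function yields four overlapping windows. Overlap at the borders means some pixels are predicted more than once, so a full pipeline would need a rule for merging overlapping predictions, which is outside this sketch.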

A Semantic Segmentation Approach Based on DeepLab Network in High-Resolution Remote Sensing Images

Lecture Notes in Computer Science - Image and Graphics ◽

10.1007/978-3-030-34113-8_25 ◽

2019 ◽

pp. 292-304

Author(s):

Hangtao Hu ◽

Shuo Cai ◽

Wei Wang ◽

Peng Zhang ◽

Zhiyong Li

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Semantic Segmentation ◽

Remote Sensing Images ◽

Segmentation Approach

Mapping the Unseen: Exploiting Super-Resolution for Semantic Segmentation in Low-Resolution Images

10.5753/sibgrapi.est.2020.12987 ◽

2020 ◽

Author(s):

Matheus B. Pereira ◽

Jefersson Alex Dos Santos

Keyword(s):

Remote Sensing ◽

Pattern Recognition ◽

Super Resolution ◽

Remote Sensing Data ◽

Semantic Segmentation ◽

The Other ◽

Aerial Imagery ◽

Aerial Images ◽

Remote Sensing Images ◽

Low Resolution

High-resolution aerial images are usually not accessible or affordable. On the other hand, low-resolution remote sensing data is easily found in public open repositories. The problem is that the low-resolution representation can compromise pattern recognition algorithms, especially semantic segmentation. In this M.Sc. dissertation, we design two frameworks to evaluate the effectiveness of super-resolution in the semantic segmentation of low-resolution remote sensing images. We carried out an extensive set of experiments on different remote sensing datasets. The results show that super-resolution is effective at improving semantic segmentation performance on low-resolution aerial imagery, outperforming unsupervised interpolation and achieving semantic segmentation results comparable to those on high-resolution data.
