scholarly journals MFCSNet: Multi-Scale Deep Features Fusion and Cost-Sensitive Loss Function Based Segmentation Network for Remote Sensing Images

2019 ◽  
Vol 9 (19) ◽  
pp. 4043
Author(s):  
Ende Wang ◽  
Yanmei Jiang ◽  
Yong Li ◽  
Jingchao Yang ◽  
Mengcheng Ren ◽  
...  

Semantic segmentation of remote sensing images is an important technique for spatial analysis and geocomputation. It has important applications in the fields of military reconnaissance, urban planning, resource utilization and environmental monitoring. In order to accurately perform semantic segmentation of remote sensing images, we proposed a novel multi-scale deep features fusion and cost-sensitive loss function based segmentation network, named MFCSNet. To acquire the information of different levels in remote sensing images, we design a multi-scale feature encoding and decoding structure, which can fuse the low-level and high-level semantic information. Then a max-pooling indices up-sampling structure is designed to improve the recognition rate of the object edge and location information in the remote sensing image. In addition, the cost-sensitive loss function is designed to improve the classification accuracy of objects with fewer samples. The penalty coefficient of misclassification is designed to improve the robustness of the network model, and the batch normalization layer is also added to make the network converge faster. The experimental results show that the classification performance of MFCSNet outperforms U-Net and SegNet in classification accuracy, object details and prediction consistency.

2021 ◽  
Vol 11 (1) ◽  
pp. 9
Author(s):  
Shengfu Li ◽  
Cheng Liao ◽  
Yulin Ding ◽  
Han Hu ◽  
Yang Jia ◽  
...  

Efficient and accurate road extraction from remote sensing imagery is important for applications related to navigation and Geographic Information System updating. Existing data-driven methods based on semantic segmentation recognize roads from images pixel by pixel, which generally uses only local spatial information and causes issues of discontinuous extraction and jagged boundary recognition. To address these problems, we propose a cascaded attention-enhanced architecture to extract boundary-refined roads from remote sensing images. Our proposed architecture uses spatial attention residual blocks on multi-scale features to capture long-distance relations and introduce channel attention layers to optimize the multi-scale features fusion. Furthermore, a lightweight encoder-decoder network is connected to adaptively optimize the boundaries of the extracted roads. Our experiments showed that the proposed method outperformed existing methods and achieved state-of-the-art results on the Massachusetts dataset. In addition, our method achieved competitive results on more recent benchmark datasets, e.g., the DeepGlobe and the Huawei Cloud road extraction challenge.


2021 ◽  
Vol 10 (7) ◽  
pp. 488
Author(s):  
Peng Li ◽  
Dezheng Zhang ◽  
Aziguli Wulamu ◽  
Xin Liu ◽  
Peng Chen

A deep understanding of our visual world is more than an isolated perception on a series of objects, and the relationships between them also contain rich semantic information. Especially for those satellite remote sensing images, the span is so large that the various objects are always of different sizes and complex spatial compositions. Therefore, the recognition of semantic relations is conducive to strengthen the understanding of remote sensing scenes. In this paper, we propose a novel multi-scale semantic fusion network (MSFN). In this framework, dilated convolution is introduced into a graph convolutional network (GCN) based on an attentional mechanism to fuse and refine multi-scale semantic context, which is crucial to strengthen the cognitive ability of our model Besides, based on the mapping between visual features and semantic embeddings, we design a sparse relationship extraction module to remove meaningless connections among entities and improve the efficiency of scene graph generation. Meanwhile, to further promote the research of scene understanding in remote sensing field, this paper also proposes a remote sensing scene graph dataset (RSSGD). We carry out extensive experiments and the results show that our model significantly outperforms previous methods on scene graph generation. In addition, RSSGD effectively bridges the huge semantic gap between low-level perception and high-level cognition of remote sensing images.


2020 ◽  
Vol 9 (4) ◽  
pp. 256 ◽  
Author(s):  
Liguo Weng ◽  
Yiming Xu ◽  
Min Xia ◽  
Yonghong Zhang ◽  
Jia Liu ◽  
...  

Changes on lakes and rivers are of great significance for the study of global climate change. Accurate segmentation of lakes and rivers is critical to the study of their changes. However, traditional water area segmentation methods almost all share the following deficiencies: high computational requirements, poor generalization performance, and low extraction accuracy. In recent years, semantic segmentation algorithms based on deep learning have been emerging. Addressing problems associated to a very large number of parameters, low accuracy, and network degradation during training process, this paper proposes a separable residual SegNet (SR-SegNet) to perform the water area segmentation using remote sensing images. On the one hand, without compromising the ability of feature extraction, the problem of network degradation is alleviated by adding modified residual blocks into the encoder, the number of parameters is limited by introducing depthwise separable convolutions, and the ability of feature extraction is improved by using dilated convolutions to expand the receptive field. On the other hand, SR-SegNet removes the convolution layers with relatively more convolution kernels in the encoding stage, and uses the cascading method to fuse the low-level and high-level features of the image. As a result, the whole network can obtain more spatial information. Experimental results show that the proposed method exhibits significant improvements over several traditional methods, including FCN, DeconvNet, and SegNet.


2019 ◽  
Vol 11 (18) ◽  
pp. 2153
Author(s):  
Zhiyong Lv ◽  
Guangfei Li ◽  
Yixiang Chen ◽  
Jón Atli Benediktsson

Filter is a well-known tool for noise reduction of very high spatial resolution (VHR) remote sensing images. However, a single-scale filter usually demonstrates limitations in covering various targets with different sizes and shapes in a given image scene. A novel method called multi-scale filter profile (MFP)-based framework (MFPF) is introduced in this study to improve the classification performance of a remote sensing image of VHR and address the aforementioned problem. First, an adaptive filter is extended with a series of parameters for MFP construction. Then, a layer-stacking technique is used to concatenate the MPFs and all the features into a stacked vector. Afterward, principal component analysis, a classical descending dimension algorithm, is performed on the fused profiles to reduce the redundancy of the stacked vector. Finally, the spatial adaptive region of each filter in the MFPs is used for post-processing of the obtained initial classification map through a supervised classifier. This process aims to revise the initial classification map and generate a final classification map. Experimental results performed on the three real VHR remote sensing images demonstrate the effectiveness of the proposed MFPF in comparison with the state-of-the-art methods. Hard-tuning parameters are unnecessary in the application of the proposed approach. Thus, such a method can be conveniently applied in real applications.


2020 ◽  
Vol 12 (5) ◽  
pp. 872 ◽  
Author(s):  
Ronghua Shang ◽  
Jiyu Zhang ◽  
Licheng Jiao ◽  
Yangyang Li ◽  
Naresh Marturi ◽  
...  

Semantic segmentation of high-resolution remote sensing images is highly challenging due to the presence of a complicated background, irregular target shapes, and similarities in the appearance of multiple target categories. Most of the existing segmentation methods that rely only on simple fusion of the extracted multi-scale features often fail to provide satisfactory results when there is a large difference in the target sizes. Handling this problem through multi-scale context extraction and efficient fusion of multi-scale features, in this paper we present an end-to-end multi-scale adaptive feature fusion network (MANet) for semantic segmentation in remote sensing images. It is a coding and decoding structure that includes a multi-scale context extraction module (MCM) and an adaptive fusion module (AFM). The MCM employs two layers of atrous convolutions with different dilatation rates and global average pooling to extract context information at multiple scales in parallel. MANet embeds the channel attention mechanism to fuse semantic features. The high- and low-level semantic information are concatenated to generate global features via global average pooling. These global features are used as channel weights to acquire adaptive weight information of each channel by the fully connected layer. To accomplish an efficient fusion, these tuned weights are applied to the fused features. Performance of the proposed method has been evaluated by comparing it with six other state-of-the-art networks: fully convolutional networks (FCN), U-net, UZ1, Light-weight RefineNet, DeepLabv3+, and APPD. Experiments performed using the publicly available Potsdam and Vaihingen datasets show that the proposed MANet significantly outperforms the other existing networks, with overall accuracy reaching 89.4% and 88.2%, respectively and with average of F1 reaching 90.4% and 86.7% respectively.


2020 ◽  
Vol 9 (4) ◽  
pp. 189 ◽  
Author(s):  
Hongxiang Guo ◽  
Guojin He ◽  
Wei Jiang ◽  
Ranyu Yin ◽  
Lei Yan ◽  
...  

Automatic water body extraction method is important for monitoring floods, droughts, and water resources. In this study, a new semantic segmentation convolutional neural network named the multi-scale water extraction convolutional neural network (MWEN) is proposed to automatically extract water bodies from GaoFen-1 (GF-1) remote sensing images. Three convolutional neural networks for semantic segmentation (fully convolutional network (FCN), Unet, and Deeplab V3+) are employed to compare with the water bodies extraction performance of MWEN. Visual comparison and five evaluation metrics are used to evaluate the performance of these convolutional neural networks (CNNs). The results show the following. (1) The results of water body extraction in multiple scenes using the MWEN are better than those of the other comparison methods based on the indicators. (2) The MWEN method has the capability to accurately extract various types of water bodies, such as urban water bodies, open ponds, and plateau lakes. (3) By fusing features extracted at different scales, the MWEN has the capability to extract water bodies with different sizes and suppress noise, such as building shadows and highways. Therefore, MWEN is a robust water extraction algorithm for GaoFen-1 satellite images and has the potential to conduct water body mapping with multisource high-resolution satellite remote sensing data.


2019 ◽  
Vol 11 (9) ◽  
pp. 1044 ◽  
Author(s):  
Wei Cui ◽  
Fei Wang ◽  
Xin He ◽  
Dongyou Zhang ◽  
Xuxiang Xu ◽  
...  

A comprehensive interpretation of remote sensing images involves not only remote sensing object recognition but also the recognition of spatial relations between objects. Especially in the case of different objects with the same spectrum, the spatial relationship can help interpret remote sensing objects more accurately. Compared with traditional remote sensing object recognition methods, deep learning has the advantages of high accuracy and strong generalizability regarding scene classification and semantic segmentation. However, it is difficult to simultaneously recognize remote sensing objects and their spatial relationship from end-to-end only relying on present deep learning networks. To address this problem, we propose a multi-scale remote sensing image interpretation network, called the MSRIN. The architecture of the MSRIN is a parallel deep neural network based on a fully convolutional network (FCN), a U-Net, and a long short-term memory network (LSTM). The MSRIN recognizes remote sensing objects and their spatial relationship through three processes. First, the MSRIN defines a multi-scale remote sensing image caption strategy and simultaneously segments the same image using the FCN and U-Net on different spatial scales so that a two-scale hierarchy is formed. The output of the FCN and U-Net are masked to obtain the location and boundaries of remote sensing objects. Second, using an attention-based LSTM, the remote sensing image captions include the remote sensing objects (nouns) and their spatial relationships described with natural language. Finally, we designed a remote sensing object recognition and correction mechanism to build the relationship between nouns in captions and object mask graphs using an attention weight matrix to transfer the spatial relationship from captions to objects mask graphs. In other words, the MSRIN simultaneously realizes the semantic segmentation of the remote sensing objects and their spatial relationship identification end-to-end. Experimental results demonstrated that the matching rate between samples and the mask graph increased by 67.37 percentage points, and the matching rate between nouns and the mask graph increased by 41.78 percentage points compared to before correction. The proposed MSRIN has achieved remarkable results.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Binglin Niu

High-resolution remote sensing images usually contain complex semantic information and confusing targets, so their semantic segmentation is an important and challenging task. To resolve the problem of inadequate utilization of multilayer features by existing methods, a semantic segmentation method for remote sensing images based on convolutional neural network and mask generation is proposed. In this method, the boundary box is used as the initial foreground segmentation profile, and the edge information of the foreground object is obtained by using the multilayer feature of the convolutional neural network. In order to obtain the rough object segmentation mask, the general shape and position of the foreground object are estimated by using the high-level features in the process of layer-by-layer iteration. Then, based on the obtained rough mask, the mask is updated layer by layer using the neural network characteristics to obtain a more accurate mask. In order to solve the difficulty of deep neural network training and the problem of degeneration after convergence, a framework based on residual learning was adopted, which can simplify the training of those very deep networks and improve the accuracy of the network. For comparison with other advanced algorithms, the proposed algorithm was tested on the Potsdam and Vaihingen datasets. Experimental results show that, compared with other algorithms, the algorithm in this article can effectively improve the overall precision of semantic segmentation of high-resolution remote sensing images and shorten the overall training time and segmentation time.


Sign in / Sign up

Export Citation Format

Share Document