TransSounder: A Hybrid TransUNet-TransFuse Architectural Framework for Semantic Segmentation of Radar Sounder Data

Deep learning architectures have received much attention in recent years, demonstrating state-of-the-art performance in segmentation, classification, and other computer vision tasks. Most of these deep networks are based on either convolutional or fully convolutional architectures. In this paper, we propose a novel object-based deep learning framework for semantic segmentation of very high-resolution satellite data. In particular, we integrate object-based priors into a fully convolutional neural network by adding an anisotropic diffusion preprocessing step and an additional loss term during training. Under this constrained framework, the goal is to have pixels that belong to the same object classified into the same semantic category. We thoroughly compared the novel object-based framework with the currently dominant convolutional and fully convolutional deep networks. Numerous experiments were conducted on the publicly available ISPRS WGII/4 benchmark datasets, Vaihingen and Potsdam, for validation and inter-comparison based on a variety of metrics. Quantitatively, the experimental results indicate that, overall, the proposed object-based framework slightly outperformed the current state-of-the-art fully convolutional networks, by more than 1% in overall accuracy, while intersection over union improved for all semantic categories. Qualitatively, man-made classes with stricter geometry, such as buildings, benefited most from our method, especially along object boundaries, highlighting the potential of the developed approach.
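
The abstract above mentions an anisotropic diffusion preprocessing step without naming a specific scheme. As a minimal sketch, assuming the classical Perona-Malik formulation with an exponential edge-stopping function (an assumption, not confirmed by the abstract), the idea can be illustrated as follows. The function name `perona_malik` and the parameters `n_iter`, `kappa`, and `step` are hypothetical and chosen only for illustration.

```python
import math


def perona_malik(image, n_iter=10, kappa=0.1, step=0.2):
    """Edge-preserving smoothing of a 2-D grayscale image (list of rows).

    Assumed scheme: Perona-Malik diffusion on a 4-neighbourhood.
    `step` should stay at or below 0.25 for numerical stability.
    """
    rows, cols = len(image), len(image[0])
    current = [[float(v) for v in row] for row in image]

    def conductance(gradient):
        # Close to 1 in flat regions, close to 0 across strong edges.
        return math.exp(-((gradient / kappa) ** 2))

    for _ in range(n_iter):
        updated = [row[:] for row in current]
        for r in range(rows):
            for c in range(cols):
                flux = 0.0
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    # No flux across the image border.
                    if 0 <= nr < rows and 0 <= nc < cols:
                        grad = current[nr][nc] - current[r][c]
                        flux += conductance(grad) * grad
                updated[r][c] = current[r][c] + step * flux
        current = updated
    return current


# A noisy dark region next to a bright region: the noise is smoothed,
# while the strong edge between columns 2 and 3 is left almost untouched.
noisy = [[0.0, 0.02, 0.0, 1.0, 1.0, 1.0] for _ in range(4)]
smoothed = perona_malik(noisy)
```

Smoothing within objects while keeping their boundaries sharp is what makes this kind of filter a plausible way to inject object-level priors before training; the actual preprocessing used in the paper may differ.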

BUILDING OUTLINE EXTRACTION FROM AERIAL IMAGERY AND DIGITAL SURFACE MODEL WITH A FRAME FIELD LEARNING FRAMEWORK

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽ 10.5194/isprs-archives-xliii-b2-2021-487-2021 ◽ 2021 ◽ Vol XLIII-B2-2021 ◽ pp. 487-493

Author(s): X. Sun ◽ W. Zhao ◽ R. V. Maretto ◽ C. Persello

Keyword(s): Deep Learning ◽ Semantic Segmentation ◽ Aerial Images ◽ Surface Model ◽ Digital Surface Model ◽ Frame Field ◽ Learning Framework ◽ Elevation Data ◽ 3D Information ◽ Direction Information

Abstract. Deep learning-based semantic segmentation models for building delineation face the challenge of producing precise and regular building outlines. Recently, Girard et al. (2020) proposed a building delineation method based on frame field learning that extracts regular building footprints as vector polygons directly from aerial RGB images. A fully convolutional network (FCN) is trained to learn the building mask, contours, and frame field simultaneously, and is followed by a polygonization method. With the direction information of the building contours stored in the frame field, the polygonization algorithm produces regular outlines and accurately detects edges and corners. This paper investigates the contribution of elevation data from the normalized digital surface model (nDSM) to the extraction of accurate and regular building polygons. The 3D information provided by the nDSM overcomes the limitations of the aerial images and helps distinguish buildings from the background more accurately. Experiments conducted in Enschede, the Netherlands, demonstrate that the nDSM improves the accuracy of building outlines, resulting in better-aligned building polygons and preventing false positives. The investigated deep learning approach (fusing RGB + nDSM) achieves a mean intersection over union (IoU) of 0.70 in the urban area, whereas the baseline method (using RGB only) achieves an IoU of 0.58 in the same area. A qualitative analysis of the results shows that the investigated model predicts more precise and regular polygons for large and complex structures.
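
The comparison above rests on intersection over union. As a minimal sketch of how that metric is computed for binary building masks, the following assumes masks stored as nested lists of 0/1 and a mean taken over (prediction, reference) pairs; the abstract does not say how the mean is aggregated, so `mean_iou` and its pairing are assumptions made for illustration.

```python
def iou(pred, ref):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                intersection += 1
            if p or r:
                union += 1
    # Assumption: two empty masks count as perfect agreement.
    return intersection / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over an iterable of (prediction, reference) mask pairs."""
    pairs = list(pairs)
    return sum(iou(p, r) for p, r in pairs) / len(pairs)


pred = [[1, 1, 0],
        [0, 1, 0]]
ref = [[1, 0, 0],
       [0, 1, 1]]
score = iou(pred, ref)  # 2 shared pixels out of 4 covered pixels
```

Because the union grows with every false positive and every missed pixel, the metric penalizes both misaligned outlines and spurious detections.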

A Deep Learning Architecture for Semantic Segmentation of Radar Sounder Data

IEEE Transactions on Geoscience and Remote Sensing ◽ 10.1109/tgrs.2021.3125773 ◽ 2021 ◽ pp. 1-1

Author(s): Elena Donini ◽ Francesca Bovolo ◽ Lorenzo Bruzzone

Keyword(s): Deep Learning ◽ Semantic Segmentation ◽ Radar Sounder

ResUNet-a: A deep learning framework for semantic segmentation of remotely sensed data

ISPRS Journal of Photogrammetry and Remote Sensing ◽ 10.1016/j.isprsjprs.2020.01.013 ◽ 2020 ◽ Vol 162 ◽ pp. 94-114 ◽ Cited By ~ 22

Author(s): Foivos I. Diakogiannis ◽ François Waldner ◽ Peter Caccetta ◽ Chen Wu

Keyword(s): Deep Learning ◽ Semantic Segmentation ◽ Remotely Sensed ◽ Remotely Sensed Data ◽ Learning Framework

Three-Dimensional Semantic Segmentation of Pituitary Adenomas Based on the Deep Learning Framework-nnU-Net: A Clinical Perspective

Micromachines ◽ 10.3390/mi12121473 ◽ 2021 ◽ Vol 12 (12) ◽ pp. 1473

Author(s): Xujun Shu ◽ Yijie Zhou ◽ Fangye Li ◽ Tao Zhou ◽ Xianghui Meng ◽ ...

Keyword(s): Deep Learning ◽ Clinical Practice ◽ Pituitary Adenomas ◽ Three Dimensional ◽ Cost Effective ◽ Semantic Segmentation ◽ Dice Similarity Coefficient ◽ Learning Framework ◽ Two Phases ◽ Practice Methods

This study developed and evaluated nnU-Net models for three-dimensional semantic segmentation of pituitary adenomas (PAs) from contrast-enhanced T1 (T1ce) images, with the aim of training a deep learning-based model cost-effectively and applying it in clinical practice. Methods: This study was conducted in two phases. In phase one, two models were trained with nnU-Net using distinct PA datasets. Model 1 was trained with 208 PAs in total, and model 2 was trained with 109 primary nonfunctional pituitary adenomas (NFPAs). In phase two, the performance of the two models was evaluated with the Dice similarity coefficient (DSC) on the leave-out test dataset. Results: Both models performed well (DSC > 0.8) for PAs with volumes > 1000 mm³, but unsatisfactorily (DSC < 0.5) for PAs < 1000 mm³. Conclusions: Both nnU-Net models showed good segmentation performance for PAs > 1000 mm³ (75% of the dataset) and limited performance for PAs < 1000 mm³ (25% of the dataset). Model 2, trained with fewer samples, was more cost-effective. We propose combining model-based segmentation for PAs > 1000 mm³ with manual segmentation for PAs < 1000 mm³ in clinical practice at the current stage.
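
The evaluation metric and the proposed volume-based split can be sketched in a few lines. This is a minimal illustration, assuming flattened binary masks and a known voxel volume; the helper names (`dice_coefficient`, `lesion_volume_mm3`, `choose_workflow`) are hypothetical, and sending a lesion of exactly 1000 mm³ to manual segmentation is an assumption, since the abstract leaves that case open.

```python
def dice_coefficient(pred, ref):
    """Dice similarity coefficient of two flat binary masks (0/1 sequences)."""
    overlap = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumption: two empty masks count as perfect agreement.
    return 2.0 * overlap / total if total else 1.0


def lesion_volume_mm3(mask, voxel_volume_mm3):
    """Volume of the segmented lesion: foreground voxel count times voxel size."""
    return sum(1 for v in mask if v) * voxel_volume_mm3


def choose_workflow(volume_mm3, threshold_mm3=1000.0):
    """Route large lesions to the model and the rest to manual segmentation."""
    return "model" if volume_mm3 > threshold_mm3 else "manual"


dsc = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])  # one shared voxel
volume = lesion_volume_mm3([1, 1, 1, 0], voxel_volume_mm3=0.5)
route_large = choose_workflow(1500.0)
route_small = choose_workflow(800.0)
```

In practice the volume would have to come from an estimate available before the final segmentation, such as a rough measurement or a first-pass model output; the abstract does not specify this step.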

A Deep-Learning Framework for the Detection of Oil Spills from SAR Data

Sensors ◽ 10.3390/s21072351 ◽ 2021 ◽ Vol 21 (7) ◽ pp. 2351

Author(s): Mohamed Shaban ◽ Reem Salim ◽ Hadil Abu Khalifeh ◽ Adel Khelifi ◽ Ahmed Shalaby ◽ ...

Keyword(s): Deep Learning ◽ Oil Spill ◽ Oil Spills ◽ Semantic Segmentation ◽ Approximate Representation ◽ Sar Images ◽ Learning Framework ◽ Second Stage ◽ Unbalanced Dataset ◽ Land Surfaces

Oil leaks onto water surfaces from large tankers, ships, and pipeline cracks cause considerable damage to the marine environment. Synthetic Aperture Radar (SAR) images provide an approximate representation of target scenes, including sea and land surfaces, ships, oil spills, and look-alikes. Detecting and segmenting oil spills in SAR images is crucial for supporting leak cleanups and protecting the environment. This paper introduces a two-stage deep learning framework for identifying oil spill occurrences based on a highly unbalanced dataset. The first stage classifies patches based on the percentage of oil spill pixels using a novel 23-layer convolutional neural network. The second stage then performs semantic segmentation using a five-stage U-Net structure. The generalized Dice loss is minimized to account for the reduced oil spill representation in the patches. The results of this study are very promising, with precision and Dice score that are comparable to or improved over those of related work.
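
To show why the generalized Dice loss suits a highly unbalanced dataset, here is a minimal sketch assuming the common formulation in which each class is weighted by the inverse square of its reference pixel count. The abstract does not state which weighting was used, so that choice, the function name, and the `eps` smoothing term are assumptions.

```python
def generalized_dice_loss(probs, onehot, eps=1e-7):
    """Generalized Dice loss over flattened pixels.

    probs[c][n]  : predicted probability of class c at pixel n
    onehot[c][n] : 1 if pixel n belongs to class c in the reference, else 0
    Assumed class weight: 1 / (reference pixel count of the class) ** 2.
    """
    numerator = 0.0
    denominator = 0.0
    for class_probs, class_ref in zip(probs, onehot):
        ref_count = sum(class_ref)
        weight = 1.0 / (ref_count ** 2 + eps)
        numerator += weight * sum(p * r for p, r in zip(class_probs, class_ref))
        denominator += weight * sum(p + r for p, r in zip(class_probs, class_ref))
    return 1.0 - 2.0 * numerator / (denominator + eps)


# Ten pixels, only one of which is oil (class 0); the rest are background.
reference = [[1] + [0] * 9, [0] + [1] * 9]
perfect = [[1.0] + [0.0] * 9, [0.0] + [1.0] * 9]
all_background = [[0.0] * 10, [1.0] * 10]

loss_perfect = generalized_dice_loss(perfect, reference)
loss_missed = generalized_dice_loss(all_background, reference)
```

Predicting background everywhere is correct for 9 of 10 pixels, yet the loss stays high because the rare oil class carries a large weight. That is the behavior wanted when oil spill pixels are scarce.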
