Clouds Classification from Sentinel-2 Imagery with Deep Residual Learning and Semantic Image Segmentation

Cheng-Chien Liu; Yu-Cheng Zhang; Pei-Yin Chen; Chien-Chih Lai; Yi-Hsin Chen; Ji-Hong Cheng; Ming-Hsun Ko

doi:10.3390/rs11020119

Clouds Classification from Sentinel-2 Imagery with Deep Residual Learning and Semantic Image Segmentation

Remote Sensing ◽

10.3390/rs11020119 ◽

2019 ◽

Vol 11 (2) ◽

pp. 119 ◽

Cited By ~ 15

Author(s):

Cheng-Chien Liu ◽

Yu-Cheng Zhang ◽

Pei-Yin Chen ◽

Chien-Chih Lai ◽

Yi-Hsin Chen ◽

...

Keyword(s):

Image Segmentation ◽

Change Detection ◽

Data Augmentation ◽

Satellite Image ◽

Training Data ◽

Training Dataset ◽

Standard Size ◽

Semantic Image Segmentation ◽

Residual Learning ◽

Sentinel 2

Detecting changes in land use and land cover (LULC) from space has long been the main goal of satellite remote sensing (RS), yet the existing and available algorithms for cloud classification are not reliable enough to attain this goal in an automated fashion. Clouds are very strong optical signals that dominate the results of change detection if they are not removed completely from imagery. As various architectures of deep learning (DL) have been proposed and advanced quickly, their potential in perceptual tasks has been widely accepted and successfully applied to many fields. A comprehensive survey of DL in RS has been reviewed, and the RS community has been suggested to be leading researchers in DL. Based on deep residual learning, semantic image segmentation, and the concept of atrous convolution, we propose a new DL architecture, named CloudNet, with an enhanced capability of feature extraction for classifying cloud and haze from Sentinel-2 imagery, with the intention of supporting automatic change detection in LULC. To ensure the quality of the training dataset, scene classification maps of Taiwan processed by Sen2cor were visually examined and edited, resulting in a total of 12,769 sub-images with a standard size of 224 × 224 pixels, cut from the Sen2cor-corrected images and compiled in a trainset. The data augmentation technique enabled CloudNet to have stable cirrus identification capability without extensive training data. Compared to the traditional method and other DL methods, CloudNet had higher accuracy in cloud and haze classification, as well as better performance in cirrus cloud recognition. CloudNet will be incorporated into the Open Access Satellite Image Service to facilitate change detection by using Sentinel-2 imagery on a regular and automatic basis.

Download Full-text

Deep learning based F-Mask alternative for Sentinel-2 images in polar regions

10.5194/egusphere-egu21-15914 ◽

2021 ◽

Author(s):

Thorsten Seehaus ◽

Kamal Nambiar Gopikrishnan ◽

Veniamin Morgenshtern ◽

Philipp Hochreuther ◽

Matthias Braun

Keyword(s):

Deep Learning ◽

Data Augmentation ◽

Satellite Image ◽

Training Data ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Validation Dataset ◽

Polar Regions ◽

Unseen Data ◽

Sentinel 2

<div>Screening clouds, cloud shadows, and snow is a critical pre-processing step that needs to be performed before any meaningful analysis can be done on satellite image data. The state of the art 'F-Mask' algorithm, which is based on multiple pixel-level threshold tests, segments the image into clear land, cloud, cloud shadow, snow, and water classes. However, we observe that the results of this algorithm are not very accurate in polar and tundra regions. The unavailability of labeled Sentinel-2 training datasets with these classes makes the traditional supervised machine learning techniques difficult to implement. Experiments with large, noisy training data on standard deep learning classification tasks like CIFAR-10 and ImageNet have shown neural networks learn clean labels faster than noisy labels.&#160;</div><div>We present a multi-level self-learning approach that trains a model to perform semantic segmentation on Sentinel-2 L1C images. We use a large dataset with labels annotated using the F-mask algorithm for the training, and a small human-labeled dataset for validation. The validation dataset contains numerous examples where the F-mask classification would have given incorrect labels. At the first step, a deep neural network with a modified U-Net architecture is trained using a dataset automatically labeled with the F-mask algorithm. The performance on the validation dataset is used to select the best model from the step, which would then be used to generate more training labels from previously unseen data. In each of the subsequent steps, a new model is trained using the labels generated using the model from the previous step. The amount of data used for training increases with each step and the application of techniques like data augmentation and dropout improves the generalization of the trained model. We show that the final model from our approach can outperform its teacher, i.e. F-Mask algorithm.&#160;</div>

Download Full-text

Improving Image-Based Plant Disease Classification With Generative Adversarial Network Under Limited Training Set

Frontiers in Plant Science ◽

10.3389/fpls.2020.583438 ◽

2020 ◽

Vol 11 ◽

Author(s):

Luning Bi ◽

Guiping Hu

Keyword(s):

Classification Accuracy ◽

Plant Disease ◽

Data Augmentation ◽

Plant Diseases ◽

Disease Classification ◽

Training Data ◽

Training Dataset ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Overfitting Problem

Traditionally, plant disease recognition has mainly been done visually by human. It is often biased, time-consuming, and laborious. Machine learning methods based on plant leave images have been proposed to improve the disease recognition process. Convolutional neural networks (CNNs) have been adopted and proven to be very effective. Despite the good classification accuracy achieved by CNNs, the issue of limited training data remains. In most cases, the training dataset is often small due to significant effort in data collection and annotation. In this case, CNN methods tend to have the overfitting problem. In this paper, Wasserstein generative adversarial network with gradient penalty (WGAN-GP) is combined with label smoothing regularization (LSR) to improve the prediction accuracy and address the overfitting problem under limited training data. Experiments show that the proposed WGAN-GP enhanced classification method can improve the overall classification accuracy of plant diseases by 24.4% as compared to 20.2% using classic data augmentation and 22% using synthetic samples without LSR.

Download Full-text

ObjectAug: Object-level Data Augmentation for Semantic Image Segmentation

10.1109/ijcnn52387.2021.9534020 ◽

2021 ◽

Author(s):

Jiawei Zhang ◽

Yanchun Zhang ◽

Xiaowei Xu

Keyword(s):

Image Segmentation ◽

Data Augmentation ◽

Semantic Image Segmentation ◽

Level Data ◽

Object Level

Download Full-text

Generative Adversarial Networks for Pre-training of Medical Image Segmentation Networks

10.21203/rs.2.22269/v1 ◽

2020 ◽

Author(s):

Kun Chen ◽

Manning Wang ◽

Zhijian Song

Keyword(s):

Neural Networks ◽

Image Segmentation ◽

Medical Image ◽

Deep Neural Networks ◽

Data Augmentation ◽

Medical Image Segmentation ◽

Training Data ◽

Generative Adversarial Networks ◽

Adversarial Networks ◽

Synthetic Images

Abstract Background: Deep neural networks have been widely used in medical image segmentation and have achieved state-of-the-art performance in many tasks. However, different from the segmentation of natural images or video frames, the manual segmentation of anatomical structures in medical images needs high expertise so the scale of labeled training data is very small, which is a major obstacle for the improvement of deep neural networks performance in medical image segmentation. Methods: In this paper, we proposed a new end-to-end generation-segmentation framework by integrating Generative Adversarial Networks (GAN) and a segmentation network and train them simultaneously. The novelty is that during the training of the GAN, the intermediate synthetic images generated by the generator of the GAN are used to pre-train the segmentation network. As the advances of the training of the GAN, the synthetic images evolve gradually from being very coarse to containing more realistic textures, and these images help train the segmentation network gradually. After the training of GAN, the segmentation network is then fine-tuned by training with the real labeled images. Results: We evaluated the proposed framework on four different datasets, including 2D cardiac dataset and lung dataset, 3D prostate dataset and liver dataset. Compared with original U-net and CE-Net, our framework can achieve better segmentation performance. Our framework also can get better segmentation results than U-net on small datasets. In addition, our framework is more effective than the usual data augmentation methods. Conclusions: The proposed framework can be used as a pre-train method of segmentation network, which helps to get a better segmentation result. Our method can solve the shortcomings of current data augmentation methods to some extent.

Download Full-text

Electrocardiogram Quality Assessment with a Generalized Deep Learning Model Assisted by Conditional Generative Adversarial Networks

Life ◽

10.3390/life11101013 ◽

2021 ◽

Vol 11 (10) ◽

pp. 1013

Author(s):

Xue Zhou ◽

Xin Zhu ◽

Keijiro Nakamura ◽

Mahito Noro

Keyword(s):

Neural Network ◽

Quality Assessment ◽

Data Augmentation ◽

Disease Diagnosis ◽

Assessment System ◽

Assessment Model ◽

Training Data ◽

Training Dataset ◽

Generative Adversarial Networks ◽

Adversarial Networks

The electrocardiogram (ECG) is widely used for cardiovascular disease diagnosis and daily health monitoring. Before ECG analysis, ECG quality screening is an essential but time-consuming and experience-dependent work for technicians. An automatic ECG quality assessment method can reduce unnecessary time loss to help cardiologists perform diagnosis. This study aims to develop an automatic quality assessment system to search qualified ECGs for interpretation. The proposed system consists of data augmentation and quality assessment parts. For data augmentation, we train a conditional generative adversarial networks model to get an ECG segment generator, and thus to increase the number of training data. Then, we pre-train a deep quality assessment model based on a training dataset composed of real and generated ECG. Finally, we fine-tune the proposed model using real ECG and validate it on two different datasets composed of real ECG. The proposed system has a generalized performance on the two validation datasets. The model’s accuracy is 97.1% and 96.4%, respectively for the two datasets. The proposed method outperforms a shallow neural network model, and also a deep neural network models without being pre-trained by generated ECG. The proposed system demonstrates improved performance in the ECG quality assessment, and it has the potential to be an initial ECG quality screening tool in clinical practice.

Download Full-text

Identification of North American Softwoods via Machine-Learning

Canadian Journal of Forest Research ◽

10.1139/cjfr-2020-0416 ◽

2021 ◽

Author(s):

Dercilio Junior Verly Lopes ◽

Gabrielly dos Santos Bobadilha ◽

Greg W. Burgreen ◽

Edward D. Entsminger

Keyword(s):

Machine Learning ◽

North American ◽

Ponderosa Pine ◽

Data Augmentation ◽

Input Image ◽

Training Data ◽

Training Dataset ◽

Image Size ◽

Accurate Identification ◽

Convolutional Network

This manuscript reports the feasibility of a sequential convolutional neural network (CNN) machine-learning model that correctly identifies eleven (11) North American softwood species from 14x magnified macroscopic end-grain images. The convolutional network contained a large kernel size, max pooling layers, and leaky rectified linear units to accelerate training. To reduce overfitting of training data, we employed L2 regularization, custom initialization, and stratified 5-fold cross-validation techniques. The database consisted of 1,789 wood end-grain images. The training dataset consisted of 1,431 images, whereas the validation set had approximately 358 images. In both sets, the input image size was 227 pixels x 227 pixels. Data augmentation was performed on-the-fly by flipping, rotating, and zooming the images. We tested the performance of the CNN against precision, sensitivity, specificity, F1-score, and adjusted accuracy. The adjusted accuracy for the entire model was 94.0%. Confusion matrices indicated the lowest performance was in correctly classifying Ponderosa pine and Eastern spruce group with an average sensitivity of 89.0% for each. Even though high validation accuracy (>94.0%) was achieved, we concluded that a much larger dataset is needed for wood identification to obtain industrially accurate identification of softwoods, mainly due to their visual and macroscopic similarities.

Download Full-text

Pixel Level Data Augmentation for Semantic Image Segmentation Using Generative Adversarial Networks

ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2019.8683590 ◽

2019 ◽

Cited By ~ 8

Author(s):

Shuangting Liu ◽

Jiaqi Zhang ◽

Yuxin Chen ◽

Yifan Liu ◽

Zengchang Qin ◽

...

Keyword(s):

Image Segmentation ◽

Data Augmentation ◽

Generative Adversarial Networks ◽

Semantic Image Segmentation ◽

Adversarial Networks ◽

Level Data

Download Full-text

A Contrast Augmentation Approach to Improve Multi-Scanner Generalization in MRI

Frontiers in Neuroscience ◽

10.3389/fnins.2021.708196 ◽

2021 ◽

Vol 15 ◽

Author(s):

Maria Ines Meyer ◽

Ezequiel de la Rosa ◽

Nuno Pedrosa de Barros ◽

Roberto Paolella ◽

Koen Van Leemput ◽

...

Keyword(s):

Data Augmentation ◽

Brain Mri ◽

Gaussian Mixture Models ◽

Brain Magnetic Resonance Imaging ◽

Bias Field ◽

Brain Structures ◽

Gaussian Mixture ◽

Training Data ◽

Training Dataset ◽

Clinical Scenarios

Most data-driven methods are very susceptible to data variability. This problem is particularly apparent when applying Deep Learning (DL) to brain Magnetic Resonance Imaging (MRI), where intensities and contrasts vary due to acquisition protocol, scanner- and center-specific factors. Most publicly available brain MRI datasets originate from the same center and are homogeneous in terms of scanner and used protocol. As such, devising robust methods that generalize to multi-scanner and multi-center data is crucial for transferring these techniques into clinical practice. We propose a novel data augmentation approach based on Gaussian Mixture Models (GMM-DA) with the goal of increasing the variability of a given dataset in terms of intensities and contrasts. The approach allows to augment the training dataset such that the variability in the training set compares to what is seen in real world clinical data, while preserving anatomical information. We compare the performance of a state-of-the-art U-Net model trained for segmenting brain structures with and without the addition of GMM-DA. The models are trained and evaluated on single- and multi-scanner datasets. Additionally, we verify the consistency of test-retest results on same-patient images (same and different scanners). Finally, we investigate how the presence of bias field influences the performance of a model trained with GMM-DA. We found that the addition of the GMM-DA improves the generalization capability of the DL model to other scanners not present in the training data, even when the train set is already multi-scanner. Besides, the consistency between same-patient segmentation predictions is improved, both for same-scanner and different-scanner repetitions. We conclude that GMM-DA could increase the transferability of DL models into clinical scenarios.

Download Full-text

Virtual-to-Real: Learning to Control in Visual Semantic Segmentation

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/682 ◽

2018 ◽

Cited By ~ 13

Author(s):

Zhang-Wei Hong ◽

Yu-Ming Chen ◽

Hsuan-Kung Yang ◽

Shih-Yang Su ◽

Tzu-Yun Shann ◽

...

Keyword(s):

Image Segmentation ◽

Virtual Worlds ◽

Control Policy ◽

Semantic Segmentation ◽

Physical World ◽

Training Data ◽

Avoidance Task ◽

Real Problem ◽

Modular Architecture ◽

Semantic Image Segmentation

Collecting training data from the physical world is usually time-consuming and even dangerous for fragile robots, and thus, recent advances in robot learning advocate the use of simulators as the training platform. Unfortunately, the reality gap between synthetic and real visual data prohibits direct migration of the models trained in virtual worlds to the real world. This paper proposes a modular architecture for tackling the virtual-to-real problem. The proposed architecture separates the learning model into a perception module and a control policy module, and uses semantic image segmentation as the meta representation for relating these two modules. The perception module translates the perceived RGB image to semantic image segmentation. The control policy module is implemented as a deep reinforcement learning agent, which performs actions based on the translated image segmentation. Our architecture is evaluated in an obstacle avoidance task and a target following task. Experimental results show that our architecture significantly outperforms all of the baseline methods in both virtual and real environments, and demonstrates a faster learning curve than them. We also present a detailed analysis for a variety of variant configurations, and validate the transferability of our modular architecture.

Download Full-text

Segmentation of Seismic Images

10.20948/graphicon-2021-3027-564-570 ◽

2021 ◽

Author(s):

Ekaterina Tolstaya ◽

Anton Egorov

Keyword(s):

Cross Sections ◽

Network Architecture ◽

Network Performance ◽

Data Augmentation ◽

Conditional Random Field ◽

Three Dimensional ◽

Data Cube ◽

Training Data ◽

Seismic Facies ◽

Training Dataset

In this paper we propose a method of seismic facies labeling. Given the three-dimensional image cube of seismic sounding data, labeled by a geologist, we first train on the part of the cube, then we propagate labels to the rest of the cube. We use open-source fully annotated 3D geological model of the Netherlands F3 Block. We apply state-of-the-art deep network architecture, adding on top a 3D fully connected conditional random field (CRF) layer. This allows to get smoother labels on data cube cross-sections. Pseudo labeling technique is used to overcome training data scarcity and predict more reliable labels for geological units. Additional data augmentation allows also to enlarge training dataset. The results show superior network performance over existing baseline mode.

Download Full-text