Improving Imbalanced Land Cover Classification with K-Means SMOTE: Detecting and Oversampling Distinctive Minority Spectral Signatures

Joao Fonseca; Georgios Douzas; Fernando Bacao

doi:10.3390/info12070266

Information ◽

2021 ◽

Vol 12 (7) ◽

pp. 266

Keyword(s):

Remote Sensing ◽

Land Cover ◽

Policy Development ◽

Class Imbalance ◽

Development Planning ◽

Remotely Sensed Data ◽

K Nearest Neighbors ◽

Automatic Production ◽

Spectral Signatures ◽

Land Cover Maps

Land cover maps are a critical tool to support informed policy development, planning, and resource management decisions. Because of its significant benefits, the automatic production of Land Use/Land Cover (LULC) maps has been a topic of interest for the remote sensing community for several years, but it is still fraught with technical challenges. One such challenge is the imbalanced nature of most remotely sensed data. The asymmetric class distribution negatively impacts the performance of classifiers and adds a new source of error to the production of these maps. In this paper, we address the imbalanced learning problem by combining K-means with the Synthetic Minority Oversampling Technique (SMOTE) into an improved oversampling algorithm. K-means SMOTE improves the quality of newly created artificial data by addressing not only the between-class imbalance, as traditional oversamplers do, but also the within-class imbalance, avoiding the generation of noisy data while effectively overcoming data imbalance. The performance of K-means SMOTE is compared to three popular oversampling methods (Random Oversampling, SMOTE and Borderline-SMOTE) using seven remote sensing benchmark datasets, three classifiers (Logistic Regression, K-Nearest Neighbors and Random Forest Classifier) and three evaluation metrics, under five-fold cross-validation with three different initialization seeds. The statistical analysis of the results shows that the proposed method consistently outperforms the remaining oversamplers, producing higher quality land cover classifications. These results suggest that LULC data can benefit significantly from the use of more sophisticated oversamplers, as spectral signatures for the same class can vary according to geographical distribution.
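The idea of oversampling inside clusters can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: cluster assignments are taken as a precomputed input (for example from any K-means run), a cluster is kept only when minority samples outnumber the others, new samples are split evenly across the kept clusters, and each synthetic point interpolates between a random pair of minority samples rather than between nearest neighbors. The names `interpolate` and `cluster_smote` are hypothetical.

```python
import random
from collections import defaultdict


def interpolate(a, b, rng):
    """SMOTE-style step: a random point on the segment between samples a and b."""
    t = rng.random()
    return [x + t * (y - x) for x, y in zip(a, b)]


def cluster_smote(X, y, cluster_ids, minority, n_new, seed=0):
    """Create n_new synthetic minority samples inside minority-dominated clusters.

    Simplifying assumptions: cluster_ids is precomputed, allocation across
    clusters is equal, and interpolation pairs are drawn at random.
    """
    rng = random.Random(seed)

    # Group samples by their cluster assignment.
    groups = defaultdict(list)
    for xi, yi, ci in zip(X, y, cluster_ids):
        groups[ci].append((xi, yi))

    # Filter step: keep clusters where minority samples outnumber the rest
    # and where at least two minority samples exist to interpolate between.
    safe = []
    for members in groups.values():
        mins = [xi for xi, yi in members if yi == minority]
        if len(mins) >= 2 and len(mins) > len(members) - len(mins):
            safe.append(mins)
    if not safe:
        return []

    # Generation step: round-robin over the kept clusters.
    synthetic = []
    for i in range(n_new):
        mins = safe[i % len(safe)]
        a, b = rng.sample(mins, 2)
        synthetic.append(interpolate(a, b, rng))
    return synthetic


# Toy data: cluster 0 is minority-dominated, cluster 1 is not.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
y = [1, 1, 0, 1, 0, 0]
cluster_ids = [0, 0, 0, 1, 1, 1]
new_points = cluster_smote(X, y, cluster_ids, minority=1, n_new=4)
```

In this toy data, every synthetic point falls between the two minority samples of cluster 0, and the isolated minority sample in cluster 1 is never used. This is the sense in which restricting generation to selected clusters avoids placing synthetic data in regions dominated by other classes.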

Imbalanced Learning in Land Cover Classification: Improving Minority Classes’ Prediction Accuracy Using the Geometric SMOTE Algorithm

Remote Sensing ◽

10.3390/rs11243040 ◽

2019 ◽

Vol 11 (24) ◽

pp. 3040 ◽

Cited By ~ 7

Author(s):

Georgios Douzas ◽

Fernando Bacao ◽

Joao Fonseca ◽

Manvel Khudinyan

Keyword(s):

Remote Sensing ◽

Land Use ◽

Land Cover ◽

Imbalanced Learning ◽

Learning Problem ◽

Automatic Production ◽

Good Resource ◽

New Generation ◽

Land Cover Maps

The automatic production of land use/land cover maps continues to be a challenging problem, with important impacts on the ability to promote sustainability and good resource management. The ability to build robust automatic classifiers and produce accurate maps can have a significant impact on the way we manage and optimize natural resources. The difficulty in achieving these results comes from many different factors, such as data quality and uncertainty. In this paper, we address the imbalanced learning problem, a common and difficult conundrum in remote sensing that affects the quality of classification results, by proposing Geometric-SMOTE, a novel oversampling method. Geometric-SMOTE is a sophisticated oversampling algorithm that increases the quality of the generated instances relative to previous methods, such as the synthetic minority oversampling technique. The performance of Geometric-SMOTE on the LUCAS (Land Use/Cover Area Frame Survey) dataset is compared to other oversamplers using a variety of classifiers. The results show that Geometric-SMOTE significantly outperforms all the other oversamplers and improves the robustness of the classifiers. These results indicate that, when using imbalanced datasets, remote sensing researchers should consider the use of these new generation oversamplers to increase the quality of the classification results.

Improving 3-m Resolution Land Cover Mapping through Efficient Learning from an Imperfect 10-m Resolution Map

Remote Sensing ◽

10.3390/rs12091418 ◽

2020 ◽

Vol 12 (9) ◽

pp. 1418

Author(s):

Runmin Dong ◽

Cong Li ◽

Haohuan Fu ◽

Jie Wang ◽

Weijia Li ◽

...

Keyword(s):

Land Cover ◽

Training Data ◽

Training Dataset ◽

Land Cover Mapping ◽

Remotely Sensed Data ◽

Large Area ◽

National Scale ◽

Substantial Progress ◽

Efficient Learning ◽

Land Cover Maps

Substantial progress has been made in the field of large-area land cover mapping as the spatial resolution of remotely sensed data increases. However, a significant amount of human effort is still required to label images for training and testing purposes, especially in high-resolution (e.g., 3-m) land cover mapping. In this research, we propose a solution that can produce 3-m resolution land cover maps on a national scale without human effort. First, using the public 10-m resolution land cover maps as an imperfect training dataset, we propose a deep learning based approach that can effectively transfer the existing knowledge. Then, we improve the efficiency of our method through a network pruning process for national-scale land cover mapping. Our proposed method can take the state-of-the-art 10-m resolution land cover maps (with an accuracy of 81.24% for China) as the training data, enable a transferred learning process that can produce 3-m resolution land cover maps, and further improve the overall accuracy (OA) to 86.34% for China. We present detailed results obtained over three megacities in China to demonstrate the effectiveness of our proposed approach for 3-m resolution large-area land cover mapping.

Temporal integration of remote‐sensing land cover maps to identify crop rotation patterns in a semiarid region of Argentina

Agronomy Journal ◽

10.1002/agj2.20758 ◽

2021 ◽

Author(s):

Antonio M. Aoki ◽

José I. Robledo ◽

Roberto C. Izaurralde ◽

Mónica Balzarini

Keyword(s):

Remote Sensing ◽

Land Cover ◽

Crop Rotation ◽

Temporal Integration ◽

Semiarid Region ◽

Land Cover Maps

Variations in Subpixel Fire Properties with Season and Land Cover in Southern Africa

Earth Interactions ◽

10.1175/2010ei328.1 ◽

2010 ◽

Vol 14 (6) ◽

pp. 1-29 ◽

Cited By ~ 12

Author(s):

Ted C. Eckmann ◽

Christopher J. Still ◽

Dar A. Roberts ◽

Joel C. Michaelsen

Keyword(s):

Remote Sensing ◽

Land Cover ◽

Southern Africa ◽

Ecological Impacts ◽

Fire Season ◽

Remotely Sensed Data ◽

Radiative Power ◽

Moderate Resolution Imaging Spectroradiometer ◽

Fire Properties ◽

Aerosol Emissions

Some of the most widely used datasets for monitoring the world’s fires come from the Moderate Resolution Imaging Spectroradiometer (MODIS) sensors aboard NASA’s Terra and Aqua satellites. For virtually all remote sensing systems, including MODIS, pixels that contain fires comprise a mix of burning and nonburning components, each with sizes and temperatures that vary between pixels. Current remote sensing products provide little information about these subpixel components, severely limiting estimates of the gas and aerosol emissions and ecological impacts from the world’s fires. This study shows how multiple endmember spectral mixture analysis (MESMA) can estimate subpixel fire sizes and temperatures from MODIS and can overcome many limitations of existing methods for characterizing fire intensities from remotely sensed data, such as the fire radiative power (FRP) approach. This study used MESMA to estimate subpixel fire sizes and temperatures for MODIS scenes in southern Africa, analyzed how these sizes and temperatures varied with season and land cover, and compared these to analyses made with FRP. This study could be the first to analyze fire sizes and temperatures on a spatial scale as large as a MODIS scene and a temporal scale as large as a full fire season. The variations in MESMA estimates of fire temperature with season and land cover were more consistent than the FRP estimates. Based on these findings, MESMA appears to be more effective than FRP at capturing some variations in fire temperatures, which strongly influence the gas and aerosol emissions from fires, along with their effects on ecosystems.

Low-cost updating of land-cover maps by classifying multitemporal remote sensing images

2012 20th Signal Processing and Communications Applications Conference (SIU) ◽

10.1109/siu.2012.6204814 ◽

2012 ◽

Author(s):

Begum Demir ◽

Francesca Bovolo ◽

Lorenzo Bruzzone

Keyword(s):

Remote Sensing ◽

Land Cover ◽

Low Cost ◽

Remote Sensing Images ◽

Multitemporal Remote Sensing ◽

Land Cover Maps

Classification of various land features using RISAT-1 dual polarimetric data

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsarchives-xl-8-833-2014 ◽

2014 ◽

Vol XL-8 ◽

pp. 833-837 ◽

Cited By ~ 10

Author(s):

V. N. Mishra ◽

P. Kumar ◽

D. K. Gupta ◽

R. Prasad

Keyword(s):

Land Use ◽

Land Cover ◽

Agricultural Land ◽

Uttar Pradesh ◽

Land Cover Classification ◽

Support Vector ◽

Land Use Land Cover ◽

Remotely Sensed Data ◽

Sar Data ◽

Land Cover Maps

Land use land cover classification is one of the most widely used applications in the field of remote sensing. Accurate land use land cover maps derived from remotely sensed data are a requirement for analyzing many socio-ecological concerns. The present study investigates the capabilities of dual polarimetric C-band SAR data for land use land cover classification. The MRS mode level 1 product of RISAT-1 with dual polarization (HH & HV) covering a part of Varanasi district, Uttar Pradesh, India is analyzed for classifying various land features. In order to increase the amount of information in dual-polarized SAR data, a band HH + HV is introduced to make use of the original two polarizations. A Transformed Divergence (TD) procedure for class separability analysis is performed to evaluate the quality of the statistics prior to image classification. For most of the class pairs the TD values are greater than 1.9, which indicates that the classes have good separability. A non-parametric classifier, the Support Vector Machine (SVM), is used to classify RISAT-1 data with the optimized polarization combination into five land use land cover classes: urban land, agricultural land, fallow land, vegetation and water bodies. The overall classification accuracy achieved by SVM is 95.23% with a Kappa coefficient of 0.9350.
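For readers unfamiliar with the two reported metrics, the following sketch shows how overall accuracy and Cohen's Kappa coefficient are commonly computed from a confusion matrix. The function name and the two-class matrix are hypothetical and purely illustrative; they are not the study's data.

```python
def overall_accuracy_and_kappa(cm):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    cm[i][j] counts samples of reference class i assigned to class j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)

    # Observed agreement: share of samples on the diagonal.
    p_observed = sum(cm[i][i] for i in range(k)) / n

    # Chance agreement: product of row and column marginals, summed over classes.
    p_expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(k)
    ) / (n * n)

    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa


# Hypothetical two-class confusion matrix (not taken from the study).
example = [[45, 5],
           [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(example)
```

For this example matrix the overall accuracy is about 0.85 and Kappa about 0.70. Kappa is lower because it discounts the agreement expected by chance from the class marginals.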

STATISTICS FOR PATCH OBSERVATIONS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsarchives-xli-b6-235-2016 ◽

2016 ◽

Vol XLI-B6 ◽

pp. 235-242

Author(s):

K. L. Hingee

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Land Cover ◽

Tree Canopy ◽

Closed Set ◽

Closed Sets ◽

Random Closed Sets ◽

Contact Distribution ◽

Land Cover Maps ◽

Underlying Processes

In the application of remote sensing it is common to investigate processes that generate patches of material. This is especially true when using categorical land cover or land use maps. Here we view some existing tools, landscape pattern indices (LPIs), as non-parametric estimators of random closed sets (RACS). This RACS framework enables LPIs to be studied rigorously. A RACS is any random process that generates a closed set, which encompasses any process that results in binary (two-class) land cover maps. RACS theory, and methods in the underlying field of stochastic geometry, are particularly well suited to high-resolution remote sensing where objects extend across tens of pixels, and the shapes and orientations of patches are symptomatic of underlying processes. For some LPIs this field already contains variance information and border correction techniques. After introducing RACS theory, we discuss the core area LPI in detail. It is closely related to the spherical contact distribution, leading to conditional variants, a new version of contagion, variance information and multiple border-corrected estimators. We demonstrate some of these findings on high-resolution tree canopy data.

Assess citizen science based land cover maps with remote sensing products: the Ground Truth 2.0 data quality tool

Eighth International Conference on Remote Sensing and Geoinformation of the Environment (RSCy2020) ◽

10.1117/12.2570814 ◽

2020 ◽

Cited By ~ 1

Author(s):

Joan Maso ◽

Nuria Julia ◽

Alaitz Zabala ◽

Ester Prat ◽

Johannes v. Kwast ◽

...

Keyword(s):

Remote Sensing ◽

Land Cover ◽

Data Quality ◽

Citizen Science ◽

Ground Truth ◽

Land Cover Maps ◽

Quality Tool

Towards an Improved Inventory of N2O Emissions Using Land Cover Maps Derived from Optical Remote Sensing Images

IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium ◽

10.1109/igarss.2018.8519272 ◽

2018 ◽

Author(s):

Tiphaine Tallec ◽

Claire Marais Siere ◽

Remy Fieuzal

Keyword(s):

Remote Sensing ◽

Land Cover ◽

Optical Remote Sensing ◽

Remote Sensing Images ◽

Land Cover Maps

Spatio-Temporal Sub-Pixel Land Cover Mapping of Remote Sensing Imagery Using Spatial Distribution Information From Same-Class Pixels

Remote Sensing ◽

10.3390/rs12030503 ◽

2020 ◽

Vol 12 (3) ◽

pp. 503

Author(s):

Li ◽

Chen ◽

Foody ◽

Wang ◽

Yang ◽

...

Keyword(s):

Remote Sensing ◽

Land Cover ◽

Spatial Resolution ◽

Temporal Frequency ◽

Land Cover Mapping ◽

Remote Sensing Images ◽

Ancillary Data ◽

Data Set ◽

Spatio Temporal ◽

Land Cover Maps

The generation of land cover maps with both fine spatial and temporal resolution would aid the monitoring of change on the Earth’s surface. Spatio-temporal sub-pixel land cover mapping (STSPM) uses a few fine spatial resolution (FR) maps and a time series of coarse spatial resolution (CR) remote sensing images as input to generate FR land cover maps with a temporal frequency of the CR data set. Traditional STSPM selects spatially adjacent FR pixels within a local window as neighborhoods to model the land cover spatial dependence, which can be a source of error and uncertainty in the maps generated by the analysis. This paper proposes a new STSPM using FR remote sensing images that pre- and/or post-date the CR image as ancillary data to enhance the quality of the FR map outputs. Spectrally similar pixels within the locality of a target FR pixel in the ancillary data are likely to represent the same land cover class and hence such same-class pixels can provide spatial information to aid the analysis. Experimental results showed that the proposed STSPM predicted land cover maps more accurately than two comparative state-of-the-art STSPM algorithms.
