Automatic Extraction and Filtering of OpenStreetMap Data to Generate Training Datasets for Land Use Land Cover Classification

Cidália C. Fonte; Joaquim Patriarca; Ismael Jesus; Diogo Duarte

doi:10.3390/rs12203428

Remote Sensing ◽

10.3390/rs12203428 ◽

2020 ◽

Vol 12 (20) ◽

pp. 3428

Author(s):

Cidália C. Fonte ◽

Joaquim Patriarca ◽

Ismael Jesus ◽

Diogo Duarte

Keyword(s):

Land Use ◽

Land Cover ◽

Software Package ◽

Random Forest Classifier ◽

Training Data ◽

Automatic Extraction ◽

Land Use Land Cover ◽

Mixed Pixels ◽

Different Characteristics ◽

Sentinel 2

This paper tests an automated methodology for generating training data from OpenStreetMap (OSM) to classify Sentinel-2 imagery into Land Use/Land Cover (LULC) classes. Different sets of training data were generated and used as inputs for the image classification. First, OSM data were converted into LULC maps using the OSM2LULC_4T software package. A Random Forest classifier was then trained to classify a time series of Sentinel-2 imagery into 8 LULC classes with samples extracted from: (1) the LULC maps produced by OSM2LULC_4T (TD0); (2) the TD1 dataset, obtained by removing mixed pixels from TD0; and (3) the TD2 dataset, obtained by filtering TD1 using radiometric indices. The classification results were generalized using a majority filter, and hybrid maps were created by merging the classification results with the OSM2LULC outputs. The accuracy of all generated maps was assessed against the 2018 official “Carta de Ocupação do Solo” (COS). The methodology was applied to two study areas with different characteristics. The results show that in some cases the filtering procedures improve the training data and the classification results. This automated methodology allowed the production of maps with overall accuracy between 55% and 78% greater than that of COS, even though the nomenclature used includes classes that are easily confused by the classifiers.
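The filtering and generalization steps named in this abstract can be illustrated with a minimal sketch. The abstract gives no thresholds, index choices, or window sizes, so the use of NDVI, the threshold of 0.3, the class labels, and the 3×3 window below are all assumptions made for illustration; this is not the OSM2LULC_4T implementation.

```python
from collections import Counter


def ndvi(red, nir):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def filter_samples(samples, vegetation_classes, min_ndvi=0.3):
    """Drop training samples whose NDVI contradicts their OSM-derived label.

    samples: list of dicts with keys 'label', 'red', 'nir'.
    Hypothetical rule: a vegetation label is kept only if NDVI >= min_ndvi,
    and a non-vegetation label only if NDVI < min_ndvi.
    """
    kept = []
    for sample in samples:
        is_green = ndvi(sample["red"], sample["nir"]) >= min_ndvi
        if is_green == (sample["label"] in vegetation_classes):
            kept.append(sample)
    return kept


def majority_filter(grid):
    """Replace each cell by the most frequent label in its 3x3 neighbourhood.

    Windows are clipped at the grid border; ties go to the label seen first.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[i][j]
                for i in range(max(0, r - 1), min(rows, r + 2))
                for j in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = Counter(window).most_common(1)[0][0]
    return out
```

In this sketch, a sample labelled as forest but with an NDVI near zero would be discarded before training, and an isolated pixel surrounded by a different class would be absorbed by the majority filter.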

Machine Learning-Based Processing Proof-of-Concept Pipeline for Semi-Automatic Sentinel-2 Imagery Download, Cloudiness Filtering, Classifications, and Updates of Open Land Use/Land Cover Datasets

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10020102 ◽

2021 ◽

Vol 10 (2) ◽

pp. 102

Author(s):

Tomáš Řezník ◽

Jan Chytrý ◽

Kateřina Trojanová

Keyword(s):

Machine Learning ◽

Land Use ◽

Land Cover ◽

Training Data ◽

External Information ◽

Land Use Land Cover ◽

Proof Of Concept ◽

Area Of Interest ◽

Open Land ◽

Sentinel 2

Land use and land cover are continuously changing in today’s world. Both domains, therefore, have to rely on updates of external information sources from which the relevant land use/land cover (classification) is extracted. Satellite images are frequent candidates due to their temporal and spatial resolution. However, the extraction of relevant land use/land cover information is demanding in terms of knowledge base and time. The presented approach offers a proof-of-concept machine-learning pipeline that handles the entire process as follows. First, the relevant Sentinel-2 images are obtained through the pipeline. Next, cloud masking is performed, including the linear interpolation of merged-feature time frames. Subsequently, four-dimensional arrays are created with all potential training data to become a basis for estimators from the scikit-learn library; the LightGBM estimator is then used. Finally, the classified content is applied to the open land use and open land cover databases. The experiment was verified against detailed cadastral data, to which Shannon’s entropy was applied since the number of cadaster information classes was naturally consistent. The experiment showed a good overall accuracy (OA) of 85.9%. It yielded a classified land use/land cover map of the study area, which covers 7188 km² in the southern part of the South Moravian Region in the Czech Republic. The developed proof-of-concept machine-learning pipeline is replicable to any other area of interest as long as the requirements for input data are met.
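The cloud-masking step with linear interpolation can be pictured with a small sketch for a single pixel and band. It assumes equally spaced acquisition dates and marks cloud-masked observations as None; the function name and the edge handling (nearest valid value) are illustrative choices, not details taken from the pipeline itself.

```python
def interpolate_gaps(values):
    """Fill None entries (cloud-masked dates) by linear interpolation.

    Gaps at the start or end of the series are filled with the nearest
    valid value. Acquisition dates are assumed to be equally spaced.
    """
    valid = [i for i, v in enumerate(values) if v is not None]
    if not valid:
        raise ValueError("no cloud-free observation in the series")
    filled = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        before = [j for j in valid if j < i]
        after = [j for j in valid if j > i]
        if not before:
            filled[i] = values[after[0]]
        elif not after:
            filled[i] = values[before[-1]]
        else:
            lo, hi = before[-1], after[0]
            weight = (i - lo) / (hi - lo)
            filled[i] = values[lo] + weight * (values[hi] - values[lo])
    return filled
```

For example, a series with two consecutive cloudy dates between reflectances of 0.2 and 0.8 would be filled with values one third and two thirds of the way between them.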

Tracking the Land Use/Land Cover Change in an Area with Underground Mining and Reforestation via Continuous Landsat Classification

Remote Sensing ◽

10.3390/rs11141719 ◽

2019 ◽

Vol 11 (14) ◽

pp. 1719 ◽

Cited By ~ 7

Author(s):

Jiaxin Mi ◽

Yongjun Yang ◽

Shaoliang Zhang ◽

Shi An ◽

Huping Hou ◽

...

Keyword(s):

Land Use ◽

Random Forest ◽

Land Cover ◽

Large Scale ◽

Underground Mining ◽

Mining Area ◽

Random Forest Classifier ◽

Land Use Land Cover ◽

Lulc Change ◽

Mining Areas

Understanding changes in land use/land cover (LULC) is important for environmental assessment and land management. However, tracking the dynamics of LULC has proved difficult, especially in large-scale underground mining areas with extensive LULC heterogeneity and a history of multiple disturbances. Additional research on methods in this field is still needed. In this study, we tracked the LULC change between 1987 and 2017 in the Nanjiao mining area, Shanxi Province, China, where years of underground mining and reforestation projects have occurred, using a random forest classifier and continuous Landsat imagery. We applied a Savitzky–Golay filter and a normalized difference vegetation index (NDVI)-based approach to detect the temporal and spatial change, respectively. The accuracy assessment shows that the random forest classifier performs well in this heterogeneous area, with an accuracy ranging from 81.92% to 86.6%, which is also higher than that obtained with support vector machine (SVM), neural network (NN), and maximum likelihood (ML) algorithms. LULC classification results reveal that cultivated forest in the mining area increased significantly after 2004, while the spatial extent of natural forest, buildings, and farmland decreased significantly after 2007. The areas where vegetation was significantly reduced resulted mainly from the transformation of natural forest and shrubs into grasslands and bare lands, respectively, whereas the areas with an obvious increase in NDVI resulted mainly from the conversion of grasslands and buildings into cultivated forest, especially where villages were abandoned after mining subsidence. A partial correlation analysis demonstrated that the extent of LULC change was significantly related to coal production and reforestation, which indicates the effects of underground mining and reforestation projects on LULC changes. This study suggests that continuous Landsat classification via a random forest classifier could be effective in monitoring the long-term dynamics of LULC changes, and could provide crucial information and data for understanding the driving forces of LULC change, environmental impact assessment, and ecological protection planning in large-scale mining areas.
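As a rough illustration of the two tools named in this abstract, the sketch below computes NDVI and smooths an annual NDVI series with a Savitzky–Golay filter. The abstract does not state the window length or polynomial order, so the 5-point quadratic window (standard coefficients −3, 12, 17, 12, −3 over 35) is an assumption, as are the function names.

```python
# Standard 5-point quadratic/cubic Savitzky-Golay smoothing coefficients.
SG_COEFFS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def ndvi(red, nir):
    """Normalized difference vegetation index from red and NIR reflectance."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def savgol5(series):
    """Smooth a series with a 5-point Savitzky-Golay window.

    The first and last two samples are returned unchanged because the
    full window does not fit there (a simplification for this sketch).
    """
    out = list(series)
    for i in range(2, len(series) - 2):
        out[i] = sum(c * series[i + k - 2] for k, c in enumerate(SG_COEFFS))
    return out
```

A single-year dip caused by noise is damped by the filter, while a smooth quadratic trend passes through unchanged, which is why this kind of filter is often applied before looking for persistent change.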

Spatial and semantic effects of LUCAS samples on fully automated land use/land cover classification in high-resolution Sentinel-2 data

International Journal of Applied Earth Observation and Geoinformation ◽

10.1016/j.jag.2020.102065 ◽

2020 ◽

Vol 88 ◽

pp. 102065 ◽

Cited By ~ 10

Author(s):

Matthias Weigand ◽

Jeroen Staab ◽

Michael Wurm ◽

Hannes Taubenböck

Keyword(s):

Land Use ◽

High Resolution ◽

Land Cover ◽

Land Cover Classification ◽

Land Use Land Cover ◽

Sentinel 2

Quantitative Analysis of Different Environmental Factor Impacts on Land Cover in Nisos Elafonisos, Crete, Greece

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph17186437 ◽

2020 ◽

Vol 17 (18) ◽

pp. 6437

Author(s):

Mohamed Elhag ◽

Silvena Boteva

Keyword(s):

Land Use ◽

Environmental Factors ◽

Land Cover ◽

Remote Sensing Data ◽

Mediterranean Ecosystems ◽

Land Use Land Cover ◽

Routine Work ◽

Digital Elevation ◽

Elevation Model ◽

Sentinel 2

Land cover monitoring is essential for a better understanding of ecosystem dynamics and complexity. The availability of remote sensing data has improved Land Use Land Cover (LULC) mapping, which is routine work in ecosystem management. The complexity of Mediterranean ecosystems is tied to the complexity of the surrounding environmental factors. An attempt to quantitatively investigate the interdependencies between land covers and the associated environmental factors was conducted in Nisos Elafonisos, chosen to represent diverse and fragile coastal Mediterranean ecosystems. Sentinel-2 Multispectral Instrument (MSI) and ASTER Digital Elevation Model (DEM) data were used to classify the LULC and to map different vegetation conditions over the study area. DEM derivatives were computed and incorporated. The developed methodology is intended to assess the land use land cover for different practices under the present environmental conditions of Nisos Elafonisos. Supervised classification resulted in six land cover clusters, which were tested against three environmental clusters. The findings indicate that the environmental variables are independent and that vegetation is distributed vertically according to altitude.

Global land use / land cover with Sentinel 2 and deep learning

10.1109/igarss47720.2021.9553499 ◽

2021 ◽

Author(s):

Krishna Karra ◽

Caitlin Kontgis ◽

Zoe Statman-Weil ◽

Joseph C. Mazzariello ◽

Mark Mathis ◽

...

Keyword(s):

Land Use ◽

Deep Learning ◽

Land Cover ◽

Land Use Land Cover ◽

Global Land ◽

Sentinel 2

Use of Sentinel-2 and LUCAS Database for the Inventory of Land Use, Land Use Change, and Forestry in Wallonia, Belgium

Land ◽

10.3390/land7040154 ◽

2018 ◽

Vol 7 (4) ◽

pp. 154 ◽

Cited By ~ 6

Author(s):

Odile Close ◽

Beaumont Benjamin ◽

Sophie Petit ◽

Xavier Fripiat ◽

Eric Hallot

Keyword(s):

Land Use ◽

Land Use Change ◽

Land Cover ◽

Greenhouse Gas ◽

Nearest Neighbor ◽

Training Sample ◽

Training Data ◽

K Nearest Neighbor ◽

Statistical Survey ◽

Sentinel 2

Due to its cost-effectiveness and repeatability of observations, high-resolution optical satellite remote sensing has become a major technology for land use and land cover mapping. However, inventory compilers for the Land Use, Land Use Change, and Forestry (LULUCF) sector still rely mostly on annual censuses and periodic surveys for such inventories. This study proposes a new approach based on per-pixel supervised classification using Sentinel-2 imagery from 2016 for mapping greenhouse gas emissions and removals associated with the LULUCF sector in Wallonia, Belgium. The Land Use/Cover Area frame statistical Survey (LUCAS) of 2015 was used both as training data and as reference data to validate the map produced. We then investigated the performance of four widely used classifiers (maximum likelihood, random forest, k-nearest neighbor, and minimum distance) on different training sample sizes. We also studied the use of the rich spectral information of Sentinel-2 data as well as single-date and multitemporal classification. Our study illustrates how open source data can be effectively used for land use and land cover classification. This classification, based on Sentinel-2 and LUCAS, offers new opportunities for LULUCF inventory of greenhouse gas on a European scale.
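Of the four classifiers compared, minimum distance is the simplest to show in a few lines. The sketch below assigns each pixel to the class whose mean spectral vector is nearest in Euclidean distance; the two-band feature vectors and class names are invented for illustration and do not come from the study.

```python
import math


def class_means(samples):
    """Compute the mean feature vector per class.

    samples: list of (feature_vector, label) pairs.
    Returns a dict mapping each label to its mean vector.
    """
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(pixel, means):
    """Assign the label whose class mean is nearest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(pixel, means[label]))
```

Because only class means are stored, this classifier needs few training samples per class, which is one reason it is a common baseline when training sample size is varied.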

Performance evaluation of MLE, RF and SVM classification algorithms for watershed scale land use/land cover mapping using sentinel 2 bands

Remote Sensing Applications Society and Environment ◽

10.1016/j.rsase.2020.100351 ◽

2020 ◽

Vol 19 ◽

pp. 100351 ◽

Cited By ~ 1

Author(s):

Vikas Kumar Rana ◽

Tallavajhala Maruthi Venkata Suryanarayana

Keyword(s):

Land Use ◽

Performance Evaluation ◽

Land Cover ◽

Land Cover Mapping ◽

Classification Algorithms ◽

Land Use Land Cover ◽

Watershed Scale ◽

Svm Classification ◽

Sentinel 2

PIXEL-BASED CLASSIFICATION ANALYSIS OF LAND USE LAND COVER USING SENTINEL-2 AND LANDSAT-8 DATA

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-4-w6-91-2017 ◽

2017 ◽

Vol XLII-4/W6 ◽

pp. 91-93 ◽

Cited By ~ 11

Author(s):

A. Sekertekin ◽

A. M. Marangoz ◽

H. Akcin

Keyword(s):

Land Use ◽

Land Cover ◽

Supervised Classification ◽

Accuracy Assessment ◽

Kappa Statistics ◽

Landsat 8 ◽

Land Use Land Cover ◽

Landsat 8 Oli ◽

Sentinel 2

The aim of this study is to conduct accuracy analyses of Land Use Land Cover (LULC) classifications derived from Sentinel-2 and Landsat-8 data, and to reveal which dataset presents better accuracy results. Zonguldak city and its near surroundings were selected as the study area. Sentinel-2 Multispectral Instrument (MSI) and Landsat-8 Operational Land Imager (OLI) data, acquired on 6 April 2016 and 3 April 2016 respectively, were used as satellite imagery. The RGB and NIR bands of Sentinel-2 and Landsat-8 were used for classification and comparison. Pan-sharpening was carried out on the Landsat-8 data before classification because the spatial resolution of Landsat-8 (30 m) is much coarser than that of the Sentinel-2 RGB and NIR bands (10 m). LULC images were generated using the pixel-based Maximum Likelihood Classification (MLC) supervised method. In the accuracy assessment, the kappa statistics for Sentinel-2 and Landsat-8 data were 0.85 and 0.78, respectively. The results showed that Sentinel-2 MSI produces more satisfactory LULC images than Landsat-8 OLI. However, in some areas of the Sea class, Landsat-8 gave better results than Sentinel-2.
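The kappa statistic reported here is computed from a confusion matrix by comparing observed agreement with the agreement expected by chance. A minimal sketch follows; the matrix layout (rows as reference classes, columns as mapped classes) is an assumption, and kappa is symmetric with respect to that choice.

```python
def cohen_kappa(matrix):
    """Cohen's kappa from a square confusion matrix of pixel counts."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of pixels on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / (total * total)
    return (observed - expected) / (1 - expected)
```

With a two-class matrix of [[45, 5], [10, 40]], observed agreement is 0.85 and chance agreement is 0.50, giving a kappa of 0.70.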

ASSESSING THE ACCURACY OF LAND USE LAND COVER (LULC) MAPS USING CLASS PROPORTIONS IN THE REFERENCE DATA

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-3-2020-669-2020 ◽

2020 ◽

Vol V-3-2020 ◽

pp. 669-674 ◽

Cited By ~ 1

Author(s):

C. C. Fonte ◽

L. See ◽

J. C. Laso-Bayas ◽

M. Lesiv ◽

S. Fritz

Keyword(s):

Land Use ◽

Land Cover ◽

Reference Data ◽

Accuracy Assessment ◽

Reference Database ◽

Data Sets ◽

Land Use Land Cover ◽

Data Set ◽

Mixed Pixels ◽

Land Cover Map

Traditionally, the accuracy assessment of a hard raster-based land use land cover (LULC) map uses a reference data set that contains one LULC class per pixel, which is the class that has the largest area in each pixel. However, when mixed pixels exist in the reference data, this is a simplification of reality that has implications for both the accuracy assessment and subsequent applications of LULC maps, such as area estimation. This paper demonstrates how class proportions in the reference data set can be used easily within regular accuracy assessment procedures and how the use of class proportions can affect the final accuracy assessment. Using the CORINE land cover map (CLC) and the more detailed Urban Atlas (UA), two accuracy assessments of the raster version of CLC were undertaken using UA as the reference and considering for each pixel: (i) the class proportions retained from the UA; and (ii) the class with the majority area. The results show that, for the study area and the classes considered here, all accuracy indices decrease when the class proportions are considered in the reference database, achieving a maximum difference of 16% between the two approaches. This demonstrates that if the UA is considered as representing reality, then the true accuracy of CLC is lower than the value obtained when using the reference data set that assigns only one class to each pixel. Arguments for and against using class proportions in reference data sets are then provided and discussed.
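The difference between the two assessments can be sketched for overall accuracy. In the majority approach, a pixel counts as correct only if the map class equals the reference class with the largest share; in the proportion approach assumed here, each pixel contributes the share of its area that the reference assigns to the map class. The abstract does not spell out the exact formulation, so this weighting and the toy data are assumptions.

```python
def accuracy_majority(map_labels, ref_proportions):
    """Overall accuracy against the majority class of each reference pixel."""
    hits = sum(
        1
        for label, props in zip(map_labels, ref_proportions)
        if label == max(props, key=props.get)
    )
    return hits / len(map_labels)


def accuracy_proportions(map_labels, ref_proportions):
    """Overall accuracy weighting each pixel by the reference share of its map class."""
    agreement = sum(
        props.get(label, 0.0) for label, props in zip(map_labels, ref_proportions)
    )
    return agreement / len(map_labels)
```

For three pixels mapped as urban, forest, urban, with reference proportions {urban 0.6, forest 0.4}, {forest 1.0}, and {forest 0.7, urban 0.3}, the majority approach gives 2/3 while the proportion approach gives 1.9/3, a lower value because the partially urban first pixel no longer counts as fully correct.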

EVALUATION OF SENTINEL-2 MSI DATA FOR LAND USE / LAND COVER CLASSIFICATION USING DIFFERENT VEGETATION INDICES

Selcuk University Journal of Engineering Science and Technology ◽

10.15317/scitech.2018.174 ◽

2018 ◽

Vol 6 (Özel (Special)) ◽

pp. 839-846 ◽

Cited By ~ 1

Author(s):

Filiz BEKTAŞ BALÇIK

Keyword(s):

Land Use ◽

Land Cover ◽

Vegetation Indices ◽

Land Cover Classification ◽

Land Use Land Cover ◽

Sentinel 2
