The Effect of Synergistic Approaches of Features and Ensemble Learning Algorithms on Aboveground Biomass Estimation of Natural Secondary Forests Based on ALS and Landsat 8

Chunyu Du; Wenyi Fan; Ye Ma; Hung-Il Jin; Zhen Zhen

doi:10.3390/s21175974

Sensors ◽

10.3390/s21175974 ◽

2021 ◽

Vol 21 (17) ◽

pp. 5974

Author(s):

Chunyu Du ◽

Wenyi Fan ◽

Ye Ma ◽

Hung-Il Jin ◽

Zhen Zhen

Keyword(s):

Machine Learning ◽

Ensemble Learning ◽

Aboveground Biomass ◽

Laser Scanning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Biomass Estimation ◽

Landsat 8 ◽

Secondary Forests ◽

Two Factors

Although combining airborne laser scanning (ALS) data, optical imagery, and machine learning algorithms has been shown to improve the estimation of aboveground biomass (AGB), the synergistic approaches of different data and ensemble learning algorithms have not been fully investigated, especially for natural secondary forests (NSFs) with complex structures. This study aimed to explore the effects of these two factors on AGB estimation of NSFs based on ALS data and Landsat 8 imagery. The synergistic method of extracting novel features (i.e., COLI1 and COLI2) using optimal Landsat 8 features and the best-performing ALS feature (i.e., elevation mean) yielded higher accuracy of AGB estimation than either optical-only or ALS-only features. However, both novel features failed to improve the accuracy compared to the simple combination of the untransformed features that generated them. The convolutional neural network (CNN) model was far superior to the other classic machine learning algorithms regardless of the features used. The stacked generalization (SG) algorithms, a kind of ensemble learning algorithm, greatly improved the accuracies compared to the corresponding base models, and the SG with the CNN meta-model performed best. This study provides technical support for wall-to-wall AGB mapping of NSFs in northeastern China using efficient features and algorithms.
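
The stacked generalization idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper stacks machine learning models with a CNN meta-model, whereas the sketch below swaps in two univariate linear regressions as base models (one per feature, e.g. one ALS feature and one optical feature) and a two-weight linear meta-model. All function names and the fold scheme are hypothetical. The key step it shows is that the meta-model is trained on out-of-fold base predictions rather than on predictions for samples the base models have already seen.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def predict_line(model, x):
    a, b = model
    return [a + b * xi for xi in x]


def fit_meta(p1, p2, y):
    """Least squares for y = w1*p1 + w2*p2 (no intercept), via 2x2 normal equations."""
    s11 = sum(a * a for a in p1)
    s22 = sum(b * b for b in p2)
    s12 = sum(a * b for a, b in zip(p1, p2))
    t1 = sum(a * yi for a, yi in zip(p1, y))
    t2 = sum(b * yi for b, yi in zip(p2, y))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det


def stack_fit(f1, f2, y, k=4):
    """Fit two base models and a meta-model on out-of-fold base predictions."""
    n = len(y)
    oof1, oof2 = [0.0] * n, [0.0] * n
    for fold in range(k):
        # Interleaved folds (assumption): sample i belongs to fold i % k.
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        m1 = fit_line([f1[i] for i in train], [y[i] for i in train])
        m2 = fit_line([f2[i] for i in train], [y[i] for i in train])
        for i in test:
            oof1[i] = m1[0] + m1[1] * f1[i]
            oof2[i] = m2[0] + m2[1] * f2[i]
    weights = fit_meta(oof1, oof2, y)
    # Refit the base models on all samples for use at prediction time.
    return fit_line(f1, y), fit_line(f2, y), weights


def stack_predict(model, f1, f2):
    base1, base2, (w1, w2) = model
    p1, p2 = predict_line(base1, f1), predict_line(base2, f2)
    return [w1 * a + w2 * b for a, b in zip(p1, p2)]
```

In practice a library implementation with stronger base and meta learners would normally be preferred; the sketch only makes the data flow between the two levels explicit.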

Influence of Variable Selection and Forest Type on Forest Aboveground Biomass Estimation Using Machine Learning Algorithms

Forests ◽

10.3390/f10121073 ◽

2019 ◽

Vol 10 (12) ◽

pp. 1073 ◽

Cited By ~ 10

Author(s):

Li ◽

Liu

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Variable Selection ◽

Aboveground Biomass ◽

Forest Type ◽

Learning Algorithms ◽

Forest Biomass ◽

Machine Learning Algorithms ◽

Biomass Estimation ◽

Landsat 8

Forest biomass is a major store of carbon and plays a crucial role in the regional and global carbon cycle. Accurate forest biomass assessment is important for monitoring and mapping the status of and changes in forests. However, while remote sensing-based forest biomass estimation in general is well developed and extensively used, improving the accuracy of biomass estimation remains challenging. In this paper, we used China’s National Forest Continuous Inventory data and Landsat 8 Operational Land Imager data in combination with three algorithms, namely linear regression (LR), random forest (RF), and extreme gradient boosting (XGBoost), to establish biomass estimation models based on forest type. In the modeling process, two methods of variable selection, i.e., stepwise regression and a variable importance-based method, were used to select optimal variable subsets for LR and the machine learning algorithms (i.e., RF and XGBoost), respectively. Encouragingly, the accuracy of the models was significantly improved, and thus the following conclusions were drawn: (1) Variable selection is very important for improving the performance of models, especially for machine learning algorithms, and the influence of variable selection on XGBoost is significantly greater than that on RF. (2) Machine learning algorithms have advantages in aboveground biomass (AGB) estimation, and the XGBoost and RF models significantly improved the estimation accuracy compared with the LR models. Although the problems of overestimation and underestimation were not fully eliminated, the XGBoost algorithm worked well and reduced these problems to a certain extent. (3) AGB modeling based on forest type is a very advantageous method for improving performance at the lower and higher values of AGB. Some conclusions in this paper may differ for other study areas. The methods used in this paper provide an optional and useful approach for improving the accuracy of AGB estimation based on remote sensing data, and the estimated AGB provides a reference basis for monitoring the forest ecosystem of the study area.

Estimation of Individual Tree Biomass in Natural Secondary Forests Based on ALS Data and WorldView-3 Imagery

Remote Sensing ◽

10.3390/rs14020271 ◽

2022 ◽

Vol 14 (2) ◽

pp. 271

Author(s):

Yinghui Zhao ◽

Ye Ma ◽

Lindi Quackenbush ◽

Zhen Zhen

Keyword(s):

Machine Learning ◽

Ensemble Learning ◽

Tree Species ◽

Laser Scanning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Secondary Forests ◽

Species Classification ◽

Individual Tree ◽

Tree Species Classification

Individual-tree aboveground biomass (AGB) estimation can highlight the spatial distribution of AGB and is vital for precision forestry. Accurately estimating individual tree AGB is a requisite for accurate forest carbon stock assessment of natural secondary forests (NSFs). In this study, we investigated the performance of three machine learning and three ensemble learning algorithms in tree species classification based on airborne laser scanning (ALS) and WorldView-3 imagery, inverted the diameter at breast height (DBH) using an optimal tree height curve model, and mapped individual tree AGB for a site in northeast China using additive biomass equations, tree species, and the inverted DBH. The results showed that the combination of ALS and WorldView-3 performed better than either single data source in tree species classification, and ensemble learning algorithms outperformed machine learning algorithms (except CNN). Seven tree species had satisfactory accuracy of individual tree AGB estimation, with R² values ranging from 0.68 to 0.85 and RMSE ranging from 7.47 kg to 36.83 kg. The average individual tree AGB was 125.32 kg and the forest AGB was 113.58 Mg/ha in the Maoershan study site in Heilongjiang Province, China. This study provides a way to classify tree species and estimate individual tree AGB of NSFs based on ALS data and WorldView-3 imagery.

Forest aboveground biomass estimation using Landsat 8 and Sentinel-1A data with machine learning algorithms

Scientific Reports ◽

10.1038/s41598-020-67024-3 ◽

2020 ◽

Vol 10 (1) ◽

Cited By ~ 3

Author(s):

Yingchang Li ◽

Mingyang Li ◽

Chao Li ◽

Zhenzhen Liu

Keyword(s):

Machine Learning ◽

Aboveground Biomass ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Biomass Estimation ◽

Landsat 8 ◽

Forest Aboveground Biomass

Aboveground biomass estimation using multi-sensor data synergy and machine learning algorithms in a dense tropical forest

Applied Geography ◽

10.1016/j.apgeog.2018.05.011 ◽

2018 ◽

Vol 96 ◽

pp. 29-40 ◽

Cited By ~ 29

Author(s):

Sujit Madhab Ghosh ◽

Mukunda Dev Behera

Keyword(s):

Machine Learning ◽

Tropical Forest ◽

Aboveground Biomass ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Sensor Data ◽

Biomass Estimation ◽

Data Synergy

Modeling wetland aboveground biomass in the Poyang Lake National Nature Reserve using machine learning algorithms and Landsat-8 imagery

Journal of Applied Remote Sensing ◽

10.1117/1.jrs.12.046029 ◽

2018 ◽

Vol 12 (04) ◽

pp. 1 ◽

Cited By ~ 2

Author(s):

Rongrong Wan ◽

Peng Wang ◽

Xiaolong Wang

Keyword(s):

Machine Learning ◽

Aboveground Biomass ◽

Nature Reserve ◽

Poyang Lake ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Landsat 8 ◽

National Nature Reserve

Estimating Forest Aboveground Biomass Using Gaofen-1 Images, Sentinel-1 Images, and Machine Learning Algorithms: A Case Study of the Dabie Mountain Region, China

Remote Sensing ◽

10.3390/rs14010176 ◽

2021 ◽

Vol 14 (1) ◽

pp. 176

Author(s):

Haoshuang Han ◽

Rongrong Wan ◽

Bing Li

Keyword(s):

Machine Learning ◽

Aboveground Biomass ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Mountain Region ◽

Stepwise Multiple Regression ◽

Support Vector ◽

Biomass Estimation ◽

Dabie Mountain ◽

Forest Aboveground Biomass

Quantitatively mapping forest aboveground biomass (AGB) is of great significance for the study of terrestrial carbon storage and global carbon cycles, and remote sensing-based data are a valuable source for estimating forest AGB. In this study, we evaluated the potential of machine learning algorithms (MLAs) by integrating Gaofen-1 (GF1) images, Sentinel-1 (S1) images, and topographic data for AGB estimation in the Dabie Mountain region, China. Variables extracted from GF1 and S1 images and digital elevation model data from sample plots were used to explain the field AGB value variations. The prediction capabilities of stepwise multiple regression and three MLAs, i.e., support vector machine (SVM), random forest (RF), and backpropagation neural network, were compared. The results showed that the RF model achieved the highest prediction accuracy (R² = 0.70, RMSE = 16.26 t/ha), followed by the SVM model (R² = 0.66, RMSE = 18.03 t/ha) for the testing datasets. Some variables extracted from the GF1 images (e.g., normalized difference vegetation index, band 1-blue, the mean texture feature of band 3-red with windows of 3 × 3), S1 images (e.g., vertical transmit-horizontal receive and vertical transmit-vertical receive backscatter coefficients), and altitude had strong correlations with field AGB values (p < 0.01). Among the explanatory variables in the MLAs, variables extracted from GF1 made a greater contribution to estimating forest AGB than those derived from S1 images. These results indicate the potential of the RF model for evaluating forest AGB by combining GF1 and S1, and that it could provide a reference for biomass estimation using multi-source images.
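
The models above are ranked by R² and RMSE on the testing data. As a minimal sketch of how these two metrics are commonly computed, assuming R² is the coefficient of determination (1 minus the ratio of residual to total sum of squares; the abstract does not state which definition was used), the hypothetical helpers below take paired lists of observed and predicted AGB values:

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the inputs (e.g. t/ha)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

A lower RMSE and a higher R² on held-out plots both indicate a better fit, which is the sense in which the RF model is reported as best here.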

Mapping Allochemical Limestone Formations in Hazara, Pakistan Using Google Cloud Architecture: Application of Machine-Learning Algorithms on Multispectral Data

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10020058 ◽

2021 ◽

Vol 10 (2) ◽

pp. 58

Author(s):

Muhammad Fawad Akbar Khan ◽

Khan Muhammad ◽

Shahid Bashir ◽

Shahab Ud Din ◽

Muhammad Hanif

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Learning Algorithms ◽

Remote Sensing Data ◽

Kappa Coefficient ◽

Machine Learning Algorithms ◽

Landsat 8 ◽

Sensing Data ◽

Fossiliferous Limestone

Low-resolution Geological Survey of Pakistan (GSP) maps surrounding the region of interest show oolitic and fossiliferous limestone occurrences in the Samanasuk, Lockhart, and Margalla hill formations, respectively, in the Hazara division, Pakistan. Machine-learning algorithms (MLAs) have rarely been applied to multispectral remote sensing data for differentiating between limestone formations formed in different depositional environments, such as oolitic or fossiliferous. Unlike previous studies, which mostly report lithological classification of rock types having different chemical compositions by MLAs, this paper aimed to investigate the potential of MLAs for mapping subclasses within the same lithology, i.e., limestone. Additionally, the selection of appropriate data labels, training algorithms, hyperparameters, and remote sensing data sources was investigated while applying these MLAs. In this paper, first, oolitic (Samanasuk) and fossiliferous (Lockhart and Margalla) limestone-bearing formations, along with the adjoining Hazara formation, were mapped using random forest (RF), support vector machine (SVM), classification and regression tree (CART), and naïve Bayes (NB) MLAs. The RF algorithm reported the best accuracy of 83.28% and a Kappa coefficient of 0.78. To further improve the targeted allochemical limestone formation map, annotation labels were generated by the fusion of maps obtained from principal component analysis (PCA), decorrelation stretching (DS), and X-means clustering applied to ASTER-L1T, Landsat-8, and Sentinel-2 datasets. These labels were used to train and validate SVM, CART, NB, and RF MLAs to obtain a binary classification map of limestone occurrences in the Hazara division, Pakistan, using the Google Earth Engine (GEE) platform. The classification of Landsat-8 data by CART reported 99.63% accuracy, with a Kappa coefficient of 0.99, and was in good agreement with the field validation. This binary limestone map was further classified into oolitic (Samanasuk) and fossiliferous (Lockhart and Margalla) formations by all four MLAs; in this case, RF surpassed all the other algorithms with an improved accuracy of 96.36%. This improvement can be attributed to better annotation, resulting in a binary limestone classification map, which formed a mask for improved classification of oolitic and fossiliferous limestone in the area.
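
The accuracy and Kappa figures quoted above are both derived from a confusion matrix. As a minimal sketch, assuming the standard definition of Cohen's kappa (observed agreement corrected for the agreement expected by chance), the hypothetical helper below takes a square matrix whose rows are reference classes and whose columns are predicted classes:

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    p_observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: sum over classes of (row total * column total) / n^2.
    p_expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    return p_observed, kappa
```

For an illustrative two-class matrix `[[45, 5], [10, 40]]` (values invented for this example), the overall accuracy is 0.85 while kappa is 0.70, which shows why kappa is usually reported alongside accuracy.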

A COMPARISON OF MACHINE-LEARNING REGRESSION ALGORITHMS FOR THE ESTIMATION OF LAI USING LANDSAT - 8 SATELLITE DATA

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-4-w16-679-2019 ◽

2019 ◽

Vol XLII-4/W16 ◽

pp. 679-683

Author(s):

V. P. Yadav ◽

R. Prasad ◽

R. Bala ◽

A. K. Vishwakarma ◽

S. A. Yadav ◽

...

Keyword(s):

Machine Learning ◽

Satellite Data ◽

Vegetation Index ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Accurate Estimation ◽

Support Vector ◽

Landsat 8 ◽

Area Index ◽

Global Circulation Models

The leaf area index (LAI) is one of the key variables of crops and plays an important role in agriculture, ecology, and climate change, where it is used by global circulation models to compute energy and water fluxes. In recent research, machine-learning algorithms have provided accurate computational approaches for estimating crop biophysical parameters from remotely sensed data. Three machine-learning algorithms, random forest regression (RFR), support vector regression (SVR), and artificial neural network regression (ANNR), were used to estimate the LAI of crops in the present study. Landsat-8 satellite images from three different dates between January 2017 and March 2017 were used, covering different crop growth conditions in Varanasi district, India. The sampling regions were fully covered by major Rabi season crops such as wheat, barley, and mustard. Of the total pooled data, 60% of the samples were used for training the algorithms and the remaining 40% for testing and validation of the machine-learning regression algorithms. The highest sensitivity of the normalized difference vegetation index (NDVI) to LAI was found using the RFR algorithm (R² = 0.884, RMSE = 0.404) as compared to SVR (R² = 0.847, RMSE = 0.478) and ANNR (R² = 0.829, RMSE = 0.404). Therefore, the RFR algorithm can be used for accurate estimation of crop LAI using satellite data.
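
The NDVI used as the predictor above is a normalized band ratio. A minimal sketch, assuming reflectance values for the near-infrared and red bands (bands 5 and 4 of the Landsat 8 OLI sensor); the function name and the zero-denominator convention are illustrative choices, not taken from the paper:

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        # Convention chosen for this sketch: treat empty pixels as NDVI 0.
        return 0.0
    return (nir - red) / total
```

Dense green vegetation reflects strongly in the near infrared and absorbs red light, so its NDVI approaches 1, while bare soil and water give values near or below 0.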

FEASIBILITY OF MACHINE LEARNING METHODS FOR SEPARATING WOOD AND LEAF POINTS FROM TERRESTRIAL LASER SCANNING DATA

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-iv-2-w4-157-2017 ◽

2017 ◽

Vol IV-2/W4 ◽

pp. 157-164 ◽

Cited By ~ 6

Author(s):

D. Wang ◽

M. Hollaus ◽

N. Pfeifer

Keyword(s):

Machine Learning ◽

Laser Scanning ◽

Learning Algorithms ◽

Terrestrial Laser Scanning ◽

Gaussian Mixture ◽

Machine Learning Algorithms ◽

Support Vector ◽

Area Index ◽

Training Samples

Classification of wood and leaf components of trees is an essential prerequisite for deriving vital tree attributes, such as wood mass, leaf area index (LAI), and woody-to-total area. Laser scanning has emerged as a promising solution for this task. Intensity-based approaches are widely proposed, as different components of a tree can feature discriminatory optical properties at the operating wavelengths of a sensor system. For geometry-based methods, machine learning algorithms are often used to separate wood and leaf points, given proper training samples. However, it remains unclear how the chosen machine learning classifier and the features used influence classification results. To this end, we compare four popular machine learning classifiers, namely Support Vector Machine (SVM), Naïve Bayes (NB), Random Forest (RF), and Gaussian Mixture Model (GMM), for separating wood and leaf points from terrestrial laser scanning (TLS) data. Two trees, an Erythrophleum fordii and a Betula pendula (silver birch), are used to test the impacts of classifier, feature set, and training samples. Our results showed that RF is the best model in terms of accuracy, and that local density-related features are important. Experimental results confirmed the feasibility of machine learning algorithms for the reliable classification of wood and leaf points. It is also noted that our studies are based on isolated trees. Further tests should be performed on more tree species and on data from more complex environments.

Comparative Analysis of Machine Learning Algorithms in Automatic Identification and Extraction of Water Boundaries

Applied Sciences ◽

10.3390/app112110062 ◽

2021 ◽

Vol 11 (21) ◽

pp. 10062

Author(s):

Aimin Li ◽

Meng Fan ◽

Guangduo Qin ◽

Youcheng Xu ◽

Hailong Wang

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Decision Tree ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Water Bodies ◽

Support Vector ◽

Landsat 8 ◽

Transfer Performance ◽

Remote Sensing Images

Monitoring open water bodies accurately is important for assessing the role of ecosystem services in the context of human survival and climate change. There are many methods available for water body extraction based on remote sensing images, such as the normalized difference water index (NDWI), modified NDWI (MNDWI), and machine learning algorithms. Based on Landsat-8 remote sensing images, this study focuses on the effects of six machine learning algorithms and three threshold methods used to extract water bodies, evaluates the transfer performance of models applied to remote sensing images from different periods, and compares the differences among these models. The results are as follows. (1) Different algorithms require different numbers of samples to reach their optimal performance. The logistic regression algorithm requires a minimum of 110 samples. As the number of samples increases, the order of the optimal model is support vector machine, neural network, random forest, decision tree, and XGBoost. (2) The accuracy of each machine learning algorithm on the test set cannot represent its performance in a local area. (3) When these models are directly applied to remote sensing images from different periods, the area under curve (AUC) indicators of each machine learning algorithm for the three regions all show a significant decline, with decreases ranging from 0.33% to 66.52%, and the differences among the algorithms' performances in the three areas are obvious. Generally, the decision tree algorithm has good transfer performance among the machine learning algorithms, with AUC values of 0.790, 0.518, and 0.697 in the three areas, respectively, and an average of 0.668. The Otsu threshold algorithm is the best among the threshold methods, with AUC values of 0.970, 0.617, and 0.908 in the three regions, respectively, and an average AUC of 0.832.
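
The threshold route described above can be sketched in two steps: compute a water index per pixel, then let Otsu's method pick the cut-off that maximizes the between-class variance of the index histogram. The sketch below is an illustration under stated assumptions, not the study's code: it uses the green/near-infrared form of NDWI, a fixed 64-bin histogram, and hypothetical function names.

```python
def ndwi(green, nir):
    """Normalized difference water index: (Green - NIR) / (Green + NIR)."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def otsu_threshold(values, bins=64):
    """Return the histogram bin edge that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(h * c for h, c in zip(hist, centers))
    best_var, best_t = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        # Class 0 holds bins 0..i, class 1 holds the remaining bins.
        w0 += hist[i]
        sum0 += hist[i] * centers[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, lo + (i + 1) * width
    return best_t
```

Pixels whose index exceeds the returned threshold would be labelled as water. Because the threshold is recomputed from each image's own histogram, it needs no training samples, which may be one reason such a method transfers across acquisition dates more gracefully than a classifier trained once.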
