Bayesian model predicts the aboveground biomass of Caragana microphylla in sandy lands better than OLS regression models

2020 ◽  
Vol 13 (6) ◽  
pp. 732-737
Author(s):  
Yi Tang ◽  
Arshad Ali ◽  
Li-Huan Feng

Abstract Aims In forest ecosystems, different types of regression models have frequently been used to estimate aboveground biomass, with Ordinary Least Squares (OLS) regression models being the most common prediction models. Yet the relative performance of Bayesian and OLS models in predicting the aboveground biomass of shrubs, especially multi-stemmed shrubs, has received relatively little study. Methods In this study, we developed biomass prediction models for Caragana microphylla Lam., a widely distributed multi-stemmed shrub that helps to reduce wind erosion and fix sand dunes in the Horqin Sand Land, one of the largest sand lands in China. We developed six types of formulations within the regression framework and then selected the best model based on specific criteria. We then estimated the parameters of the best model with OLS and Bayesian methods, using training and test data under different sample sizes drawn with the bootstrap method. Lastly, we compared the performance of the OLS and Bayesian models in predicting the aboveground biomass of C. microphylla. Important Findings The allometric equation (power = 1) performed best among the six types of equations, even though all of the models were significant. The mean squared error on the test data was lower with both the non-informative-prior and the informative-prior Bayesian methods than with the OLS method. Among the tested predictors (i.e. plant height and basal diameter), we found that basal diameter was not a significant predictor in either the OLS or the Bayesian method, indicating that suitable predictors and well-fitted models should be seriously considered. This study highlights that Bayesian methods, the bootstrap method and the type of allometric equation could help to improve model accuracy in predicting shrub biomass in sandy lands.
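The OLS-plus-bootstrap half of the comparison described above can be sketched roughly as follows. This is a minimal illustration only, not the authors' procedure: the single-predictor linear form `biomass = a + b * height`, the function names `ols_fit`, `mse` and `bootstrap_test_mse`, and the number of resamples are all assumptions, and the Bayesian side of the comparison is not shown.

```python
import random


def ols_fit(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mse(params, xs, ys):
    """Mean squared error of the fitted line on the given points."""
    intercept, slope = params
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def bootstrap_test_mse(train, test, n_boot=200, seed=0):
    """Average test MSE over OLS fits to bootstrap resamples of the training set.

    `train` and `test` are lists of (predictor, biomass) pairs (hypothetical layout).
    """
    rng = random.Random(seed)
    test_x, test_y = zip(*test)
    errors = []
    for _ in range(n_boot):
        # Resample the training pairs with replacement, keeping the sample size.
        sample = [rng.choice(train) for _ in train]
        xs, ys = zip(*sample)
        if len(set(xs)) < 2:
            continue  # a resample with one distinct x cannot define a slope
        errors.append(mse(ols_fit(xs, ys), test_x, test_y))
    if not errors:
        raise ValueError("no usable bootstrap resamples")
    return sum(errors) / len(errors)
```

Calling `bootstrap_test_mse(train, test)` with different training-set sizes would give one way to see how test error changes with sample size, which is the kind of comparison the abstract describes.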

1980 ◽  
Vol 10 (3) ◽  
pp. 367-370 ◽  
Author(s):  
T. R. Crow ◽  
P. R. Laidly

A number of untransformed regression models were compared to the log-log form of the allometric function for estimating biomass. Models were evaluated using two tree species, Betula papyrifera Marsh. and Pinus resinosa Ait., and one tall shrub, Ilex verticillata (L.) Gray, with total aboveground biomass as the dependent variable. Using goodness of fit as the criterion, both weighted linear and weighted nonlinear models proved to be acceptable alternatives to the transformed allometric equation. Weighted models retain the advantage of the log-log form, i.e., compatibility with the homogeneity of variance assumption, but avoid the transformational bias.


Energies ◽  
2019 ◽  
Vol 12 (23) ◽  
pp. 4434
Author(s):  
Damià Palmer ◽  
Josep O. Pou ◽  
L. Gonzalez-Sabaté ◽  
Jordi Díaz-Ferrero ◽  
Juan A. Conesa ◽  
...  

To reduce the calculation effort when simulating the emission of polychlorinated dibenzo-p-dioxins and furans (PCDD/F) during municipal solid waste incineration, minimizing the number of simulated components is mandatory. For this purpose, two new multilinear regression models capable of determining the total amount and toxicity of dioxins in an atmospheric emission have been adjusted based on previously published ones. The new source of data used (almost 200 PCDD/F analyses) gives the models a wider range of application and also increases the diversity of the emission sources, which include industrial and laboratory-scale thermal processes. Only three of the 17 toxic congeners (1,2,3,6,7,8-HxCDD, 2,3,7,8-TCDF and OCDF), whose formation was found to be linearly independent, were necessary as inputs for the models. All model parameters have been statistically validated and their confidence intervals have been calculated using the bootstrap method. The resulting coefficients of determination (R²) for the models are 0.9711 ± 0.0056 and 0.9583 ± 0.0085; their root mean square errors (RMSE) are 0.2115 and 0.2424, and their mean absolute errors (MAE) are 0.1541 and 0.1733, respectively.
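For reference, the three fit statistics quoted above (coefficient of determination, RMSE and MAE) can be computed as in the following minimal sketch, which uses their standard definitions. The function name `regression_metrics` and the example values in the usage note are illustrative assumptions and have nothing to do with the PCDD/F data.

```python
import math


def regression_metrics(observed, predicted):
    """Return R^2, RMSE and MAE for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual and total sums of squares.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
    }
```

For example, `regression_metrics([1, 2, 3, 4], [1, 2, 3, 5])` has a residual sum of squares of 1 and a total sum of squares of 5, so R² is 0.8, RMSE is 0.5 and MAE is 0.25.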


2021 ◽  
Vol 13 (4) ◽  
pp. 581 ◽  
Author(s):  
Yuanyuan Fu ◽  
Guijun Yang ◽  
Xiaoyu Song ◽  
Zhenhong Li ◽  
Xingang Xu ◽  
...  

Rapid and accurate crop aboveground biomass estimation is beneficial for high-throughput phenotyping and site-specific field management. This study explored the utility of high-definition digital images acquired by a low-flying unmanned aerial vehicle (UAV) and ground-based hyperspectral data for improved estimates of winter wheat biomass. To extract fine textures for characterizing the variations in winter wheat canopy structure during growing seasons, we proposed a multiscale texture extraction method (Multiscale_Gabor_GLCM) that took advantage of multiscale Gabor transformation and gray-level co-occurrence matrix (GLCM) analysis. Narrowband normalized difference vegetation indices (NDVIs) involving all possible two-band combinations and continuum removal of red-edge spectra (SpeCR) were also extracted for biomass estimation. Subsequently, non-parametric linear (i.e., partial least squares regression, PLSR) and nonlinear regression (i.e., least squares support vector machine, LSSVM) analyses were conducted using the extracted spectral features, multiscale textural features and combinations thereof. For the first time, the visualization technique of LSSVM was used to select the multiscale textures that contributed most to the biomass estimation. Compared with the best-performing NDVI (1193, 1222 nm), the SpeCR yielded a higher coefficient of determination (R²), lower root mean square error (RMSE), and lower mean absolute error (MAE) for winter wheat biomass estimation and significantly alleviated the saturation problem after biomass exceeded 800 g/m². The predictive performance of the PLSR and LSSVM regression models based on SpeCR decreased with increasing bandwidths, especially at bandwidths larger than 11 nm. Both the PLSR and LSSVM regression models based on the multiscale textures produced higher accuracies than those based on the single-scale GLCM-based textures. According to the evaluation of variable importance, the “Mean” texture metrics from different scales were identified as the most influential for winter wheat biomass. Using just 10 multiscale textures largely improved predictive performance over using all textures and achieved an accuracy comparable with using SpeCR. The LSSVM regression model based on the combination of the selected multiscale textures and SpeCR with a bandwidth of 9 nm produced the highest estimation accuracy, with R²val = 0.87, RMSEval = 119.76 g/m², and MAEval = 91.61 g/m². However, the combination did not significantly improve the estimation accuracy compared to the use of SpeCR or multiscale textures only. The accuracy of the biomass predicted by the LSSVM regression models was higher than that of the PLSR models, demonstrating that LSSVM is a potential candidate for characterizing winter wheat biomass during multiple growth stages. The study suggests that multiscale textures derived from high-definition UAV-based digital images are competitive with hyperspectral features in predicting winter wheat biomass.
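The "all possible two-band combinations" step for narrowband NDVIs can be illustrated with a short sketch. It assumes the usual normalized-difference form, (R_long − R_short) / (R_long + R_short), with the longer-wavelength band in the leading position; that band ordering, the function name `narrowband_ndvis` and the dictionary layout are assumptions made here for illustration, not details taken from the study.

```python
from itertools import combinations


def narrowband_ndvis(reflectance):
    """Compute a normalized difference index for every pair of bands.

    `reflectance` maps wavelength in nm to reflectance (hypothetical layout).
    Returns a dict keyed by (shorter_nm, longer_nm).
    """
    indices = {}
    for short_nm, long_nm in combinations(sorted(reflectance), 2):
        r_short = reflectance[short_nm]
        r_long = reflectance[long_nm]
        total = r_long + r_short
        if total == 0:
            continue  # index is undefined when both reflectances are zero
        indices[(short_nm, long_nm)] = (r_long - r_short) / total
    return indices
```

With n bands this yields n(n − 1)/2 candidate indices, each of which could then be screened against measured biomass to find the best-performing band pair.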


Universe ◽  
2021 ◽  
Vol 7 (1) ◽  
pp. 8
Author(s):  
Alessandro Montoli ◽  
Marco Antonelli ◽  
Brynmor Haskell ◽  
Pierre Pizzochero

A common way to calculate the glitch activity of a pulsar is an ordinary linear regression of the observed cumulative glitch history. This method, however, is likely to underestimate the errors on the activity, as it implicitly assumes a (long-term) linear dependence between glitch sizes and waiting times, as well as equal variance, i.e., homoscedasticity, in the fit residuals; neither assumption is well justified by pulsar data. In this paper, we review the extrapolation of the glitch activity parameter and explore two alternatives: the relaxation of the homoscedasticity hypothesis in the linear fit and the use of the bootstrap technique. We find a larger uncertainty in the activity with respect to that obtained by ordinary linear regression, especially for those objects in which it can be significantly affected by a single glitch. We discuss how this affects the theoretical upper bound on the moment of inertia associated with the region of a neutron star containing the superfluid reservoir of angular momentum released in a stationary sequence of glitches. We find that this upper bound is less tight if one considers the uncertainty on the activity estimated with the bootstrap method and allows for models in which the superfluid reservoir is entirely in the crust.


1998 ◽  
Vol 217 (1) ◽  
Author(s):  
Hans Schneeberger

Summary With Efron’s law-school example, the bootstrap method is compared with an alternative method called doubling. It is shown that the mean deviation of the estimator is always smaller for the doubling method.


2016 ◽  
Vol 16 (2) ◽  
pp. 43-50 ◽  
Author(s):  
Samander Ali Malik ◽  
Assad Farooq ◽  
Thomas Gereke ◽  
Chokri Cherif

Abstract The present research work was carried out to develop prediction models for blended ring spun yarn evenness and tensile parameters using artificial neural networks (ANNs) and multiple linear regression (MLR). Polyester/cotton blend ratio, twist multiplier, back roller hardness and break draft ratio were used as input parameters to predict yarn evenness in terms of CVm% and yarn tensile properties in terms of tenacity and elongation. Feed forward neural networks with Bayesian regularisation support were successfully trained and tested using the available experimental data. The coefficients of determination of the ANN and regression models indicate a strong correlation between the measured and predicted yarn characteristics, with acceptable mean absolute error values. The comparative analysis of the two modelling techniques shows that the ANNs perform better than the MLR models. The relative importance of input variables was determined using rank analysis through an input saliency test on optimised ANN models and standardised coefficients of regression models. These models are suitable for yarn manufacturers and can be used within the investigated knowledge domain.


1992 ◽  
Vol 82 (1) ◽  
pp. 104-119
Author(s):  
Michéle Lamarre ◽  
Brent Townshend ◽  
Haresh C. Shah

Abstract This paper describes a methodology to assess the uncertainty in seismic hazard estimates at particular sites. A variant of the bootstrap statistical method is used to combine the uncertainty due to earthquake catalog incompleteness, earthquake magnitude, and recurrence and attenuation models used. The uncertainty measure is provided in the form of a confidence interval. Comparisons of this method applied to various sites in California with previous studies are used to confirm the validity of the method.


2021 ◽  
Vol 42 (Supplement_1) ◽  
pp. S33-S34
Author(s):  
Morgan A Taylor ◽  
Randy D Kearns ◽  
Jeffrey E Carter ◽  
Mark H Ebell ◽  
Curt A Harris

Abstract Introduction A nuclear disaster would generate an unprecedented volume of thermal burn patients from the explosion and subsequent mass fires (Figure 1). Prediction models characterizing outcomes for these patients may better equip healthcare providers and other responders to manage large scale nuclear events. Logistic regression models have traditionally been employed to develop prediction scores for mortality of all burn patients. However, other healthcare disciplines have increasingly transitioned to machine learning (ML) models, which are automatically generated and continually improved, potentially increasing predictive accuracy. Preliminary research suggests ML models can predict burn patient mortality more accurately than commonly used prediction scores. The purpose of this study is to examine the efficacy of various ML methods in assessing thermal burn patient mortality and length of stay in burn centers. Methods This retrospective study identified patients with fire/flame burn etiologies in the National Burn Repository between the years 2009 – 2018. Patients were randomly partitioned into a 67%/33% split for training and validation. A random forest model (RF) and an artificial neural network (ANN) were then constructed for each outcome, mortality and length of stay. These models were then compared to logistic regression models and previously developed prediction tools with similar outcomes using a combination of classification and regression metrics. Results During the study period, 82,404 burn patients with a thermal etiology were identified in the analysis. The ANN models will likely tend to overfit the data, which can be resolved by ending the model training early or adding additional regularization parameters. Further exploration of the advantages and limitations of these models is forthcoming as metric analyses become available. 
Conclusions In this proof-of-concept study, we anticipate that at least one ML model will predict the targeted outcomes of thermal burn patient mortality and length of stay as judged by the fidelity with which it matches the logistic regression analysis. These advancements can then help disaster preparedness programs consider resource limitations during catastrophic incidents resulting in burn injuries.
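The random 67%/33% partition mentioned in the Methods can be sketched as follows. This is a generic illustration: the function name `train_validation_split`, the fixed seed and the rounding rule for the cut point are assumptions, and the abstract does not say how the authors implemented their split.

```python
import random


def train_validation_split(records, train_fraction=0.67, seed=0):
    """Shuffle a copy of `records` and cut it into training and validation parts."""
    rng = random.Random(seed)
    shuffled = list(records)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Fixing the seed makes the partition reproducible, which matters when several model types are to be compared on the same validation set.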

