Making Steppingstones out of Stumbling Blocks: A Bayesian Model Evidence Estimator with Application to Groundwater Transport Model Selection

Water ◽  
2019 ◽  
Vol 11 (8) ◽  
pp. 1579 ◽  
Author(s):  
Elshall ◽  
Ye

Bayesian model evidence (BME) is a measure of the average fit of a model to observation data given all the parameter values that the model can assume. By accounting for the trade-off between goodness-of-fit and model complexity, BME is used for model selection and model averaging. For strict Bayesian computation, the theoretically unbiased Monte Carlo based numerical estimators are preferred over semi-analytical solutions. This study examines five numerical BME estimators and asks how important accurate BME estimation is for penalizing model complexity. The limiting cases among numerical BME estimators are the prior sampling arithmetic mean (AM) estimator and the posterior sampling harmonic mean (HM) estimator, which are straightforward to implement, yet result in underestimation and overestimation, respectively. We also consider the path sampling methods of thermodynamic integration (TI) and steppingstone sampling (SS), which sample multiple intermediate distributions that link the prior and the posterior. Although TI and SS are theoretically unbiased estimators, in practice they can acquire a bias from their numerical implementation; for example, sampling errors in some intermediate distributions can introduce bias. We propose a variant of SS, the multiple one-steppingstone sampling (MOSS) estimator, which is less sensitive to sampling errors. We evaluate these five estimators using a groundwater transport model selection problem. SS and MOSS give the least biased BME estimation at an efficient computational cost. If the estimated BME had a bias that covaries with the true BME, this would not be a problem, because we are interested in BME ratios rather than their absolute values. However, the results show that BME estimation bias can be a function of model complexity. Thus, biased BME estimation results in inaccurate penalization of more complex models, which changes the model ranking. This effect was less pronounced with SS and MOSS than with the other three methods.
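As a toy illustration of the two limiting estimators (not the groundwater model from the study), consider a conjugate Gaussian example with a single observation, where the exact evidence is known in closed form; all names and numbers below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy conjugate setup: prior N(0, 1) on theta, likelihood N(theta, 1) for one
# datum y. The exact evidence is N(y; 0, 2), so both estimators can be checked.
y = 1.0

def log_lik(theta):
    return -0.5 * np.log(2 * np.pi) - 0.5 * (y - theta) ** 2

# Arithmetic mean (AM): average likelihood over prior draws
# (tends to underestimate when prior and posterior overlap little).
theta_prior = rng.normal(0.0, 1.0, size=200_000)
bme_am = np.mean(np.exp(log_lik(theta_prior)))

# Harmonic mean (HM): harmonic average of the likelihood over posterior draws
# (unbiased in expectation but heavy-tailed, tending to overestimate in
# practice). The posterior here is N(y/2, 1/2), sampled directly.
theta_post = rng.normal(y / 2, np.sqrt(0.5), size=200_000)
bme_hm = 1.0 / np.mean(np.exp(-log_lik(theta_post)))

# Exact evidence N(y; 0, 2) for comparison
exact = np.exp(-0.5 * np.log(2 * np.pi * 2) - y ** 2 / 4)
```

Path sampling estimators such as TI and SS bridge these two extremes by introducing intermediate distributions between the prior and the posterior.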

2020 ◽  
Author(s):  
Farid Mohammadi ◽  
Stefania Scheurer ◽  
Aline Schäfer Rodrigues Silva ◽  
Sergey Oladyshkin ◽  
Johannes Hommel ◽  
...  

The microbially induced calcite precipitation (MICP) process is a reactive transport process comprising several important biogeochemical subprocesses: precipitation and dissolution of calcite, adhesion of biomass on surfaces, detachment of biomass from the biofilm, and growth and decay of biomass. Through the accumulation of the biofilm and, in particular, the precipitation of calcite, the flow conditions in the subsurface can be modified; the porosity and permeability are reduced so that existing leakages are sealed. This sealing property of MICP is of interest in different applications, such as sealing cracks in gas tanks or in a cap rock for CO2 underground storage.

The process of biofilm growth in porous media using MICP can be described by many models with different complexity and assumptions. Typically, complex models require more measurement data to constrain their parameters. Therefore, there is a need to seek a balance between model complexity and the effort of acquiring field data. To do so, modelers are interested in assessing the similarities among these models and their prediction accuracy by comparing them with field observation data.

In this study, we perform a Bayesian model legitimacy analysis to investigate the similarities among different MICP models and their prediction accuracy. Moreover, this analysis provides a model ranking based on computed model weights, achieved within the framework of Bayesian model selection (BMS). This framework requires many model evaluations, which makes the analysis intractable for computationally expensive MICP models. To overcome this issue, we use surrogate models constructed with arbitrary polynomial chaos expansion (aPCE). To account for the approximation error, we introduce a correction factor that compensates for the inaccuracies incurred by replacing the original models with the surrogates.
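For equal prior model probabilities, the BMS model weights referred to above reduce to a normalized exponential of the log-evidences. A minimal sketch (the function name and interface are illustrative, not from the study):

```python
import numpy as np

def model_weights(log_bme, log_prior=None):
    """Posterior model weights from log-BME values (equal model priors by default)."""
    log_bme = np.asarray(log_bme, dtype=float)
    if log_prior is None:
        log_prior = np.zeros_like(log_bme)
    log_post = log_bme + log_prior
    log_post -= log_post.max()  # subtract the max to stabilize the exponentials
    w = np.exp(log_post)
    return w / w.sum()
```

The max-subtraction matters in practice: log-evidences of expensive subsurface models can differ by hundreds of nats, and naive exponentiation would underflow.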


Author(s):  
Stefania Scheurer ◽  
Aline Schäfer Rodrigues Silva ◽  
Farid Mohammadi ◽  
Johannes Hommel ◽  
Sergey Oladyshkin ◽  
...  

Abstract. Geochemical processes in subsurface reservoirs affected by microbial activity change the material properties of porous media. This is a complex biogeochemical process that currently carries strong conceptual uncertainty: several modeling approaches describing it are plausible, and modelers face the uncertainty of choosing the most appropriate one. The considered models differ in their underlying hypotheses about the process structure. Once observation data become available, a rigorous Bayesian model selection accompanied by a Bayesian model justifiability analysis can be employed to choose the most appropriate model, i.e., the one that best describes the underlying physical processes in light of the available data. However, biogeochemical modeling is computationally very demanding because it conceptualizes different phases, biomass dynamics, geochemistry, and precipitation and dissolution in porous media. Therefore, the Bayesian framework cannot be based directly on the full computational models, as this would require too many expensive model evaluations. To circumvent this problem, we suggest performing both the Bayesian model selection and the justifiability analysis on surrogates constructed for the competing biogeochemical models; here, we use arbitrary polynomial chaos expansion. Because surrogate representations are only approximations of the original models, we account for the approximation error in the Bayesian analysis by introducing novel correction factors for the resulting model weights. Thereby, we extend the Bayesian model justifiability analysis and assess model similarities for computationally expensive models. We demonstrate the method on a representative scenario for microbially induced calcite precipitation in a porous medium. Our extension of the justifiability analysis provides a suitable approach for comparing computationally demanding models and gives insight into the amount of data necessary for reliable model performance.
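The core idea of a justifiability analysis, letting each model generate synthetic data and asking how strongly every model (including itself) is weighted against those data, can be sketched with two toy fixed-parameter models. Everything below is illustrative: a real analysis would marginalize over model parameters and run on surrogates, and the models, noise level, and replicate count are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two toy competing "models" (parameters fixed for brevity): a line and a
# quadratic, observed at fixed locations with known Gaussian noise.
x = np.linspace(0.0, 1.0, 20)
sigma = 0.1
models = [lambda t: 1.0 + 0.5 * t,
          lambda t: 1.0 + 0.5 * t + 0.8 * t ** 2]

def log_like(pred, data):
    return -0.5 * np.sum(((data - pred) / sigma) ** 2)

# Justifiability-style confusion matrix: row i holds the average posterior
# model weights (equal priors) when the data are generated by model i.
n_models = len(models)
confusion = np.zeros((n_models, n_models))
n_rep = 200
for i, gen in enumerate(models):
    for _ in range(n_rep):  # average over synthetic noise realizations
        data = gen(x) + rng.normal(0.0, sigma, x.size)
        ll = np.array([log_like(m(x), data) for m in models])
        w = np.exp(ll - ll.max())
        confusion[i] += w / w.sum()
    confusion[i] /= n_rep
```

A strongly diagonal matrix indicates that the data suffice to tell the models apart; off-diagonal mass signals model similarity or insufficient data.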


2018 ◽  
Author(s):  
Eduardo A. Aponte ◽  
Sudhir Raman ◽  
Stefan Frässle ◽  
Jakob Heinzle ◽  
Will D. Penny ◽  
...  

Abstract. In generative modeling of neuroimaging data, such as dynamic causal modeling (DCM), one typically considers several alternative models, either to determine the most plausible explanation for observed data (Bayesian model selection) or to account for model uncertainty (Bayesian model averaging). Both procedures rest on estimates of the model evidence, a principled trade-off between model accuracy and complexity. In DCM, the log evidence is usually approximated using variational Bayes (VB) under the Laplace approximation (VBL). Although this approach is highly efficient, it makes distributional assumptions and can be vulnerable to local extrema. An alternative to VBL is Markov chain Monte Carlo (MCMC) sampling, which is asymptotically exact but orders of magnitude slower than VB; this has so far prevented its routine use for DCM. This paper makes four contributions. First, we introduce a powerful MCMC scheme, thermodynamic integration (TI), to neuroimaging and present a derivation that establishes a theoretical link to VB. Second, this derivation is based on a tutorial-like introduction to concepts of free energy in physics and statistics. Third, we present an implementation of TI for DCM that rests on population MCMC. Fourth, using simulations and empirical functional magnetic resonance imaging (fMRI) data, we compare log evidence estimates obtained by TI, VBL, and other MCMC-based estimators (prior arithmetic mean and posterior harmonic mean). We find that model comparison based on VBL gives reliable results in most cases, justifying its use in standard DCM for fMRI. Furthermore, we demonstrate that for complex and/or nonlinear models, TI may provide more robust estimates of the log evidence. Importantly, accurate estimates of the model evidence can be obtained with TI in acceptable computation time. This paves the way for using DCM in scenarios where the robustness of single-subject inference and model selection becomes paramount, such as differential diagnosis in clinical applications.
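The identity behind TI is log Z = ∫₀¹ E_β[log L] dβ, where the expectation is taken under the power posterior p_β ∝ prior × likelihood^β. It can be checked on a conjugate Gaussian toy problem where each temperature can be sampled exactly; in DCM one would instead run population MCMC at each temperature. The setup below is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
y = 1.0  # one observation; prior N(0, 1), likelihood N(theta, 1)

def log_lik(theta):
    return -0.5 * np.log(2 * np.pi) - 0.5 * (y - theta) ** 2

# In this conjugate toy, the power posterior p_beta ∝ prior * L^beta is
# N(beta*y/(1+beta), 1/(1+beta)), so each temperature is sampled directly.
betas = np.linspace(0.0, 1.0, 32)
e_loglik = np.array([
    np.mean(log_lik(rng.normal(b * y / (1 + b), np.sqrt(1 / (1 + b)), 50_000)))
    for b in betas])

# Trapezoidal rule for log Z = integral over beta of E_beta[log L]
log_z_ti = np.sum(0.5 * (e_loglik[1:] + e_loglik[:-1]) * np.diff(betas))

# Exact log evidence: log N(y; 0, 2)
log_z_exact = -0.5 * np.log(2 * np.pi * 2) - y ** 2 / 4
```

The choice of the temperature schedule (here uniform in β) is one of the implementation details that determines the discretization bias of TI in practice.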


2018 ◽  
Author(s):  
Guoxiao Wei ◽  
Xiaoying Zhang ◽  
Ming Ye ◽  
Ning Yue ◽  
Fei Kan

Abstract. Evapotranspiration (ET) is a major component of the land surface processes involved in energy fluxes and balance, especially in the hydrological cycle of agricultural ecosystems. While many models have been developed to estimate ET, there is no agreement on which model performs best. In this study, we evaluate four widely used ET models (the Shuttleworth-Wallace (SW) model, Penman-Monteith (PM) model, Priestley-Taylor and Flint-Childs (PT-FC) model, and Advection-Aridity (AA) model) using half-hourly ET observations obtained at a spring maize field in an arid region. The model evaluation is based on Bayesian model comparison and ranking using the Bayesian model evidence (BME), which balances goodness-of-fit to data against model complexity. The BME-based model ranking (from best to worst) is SW, PM, PT-FC, and AA. The residuals between observations and the corresponding model simulations are also analyzed, and the same model ranking is obtained using residual-based statistics, i.e., the coefficient of determination (R2), index of agreement (IA), root mean square error (RMSE), and model efficiency (EF). The PM and SW models overestimate ET, whereas the PT-FC and AA models underestimate ET in the study period. All four models underestimate ET during periods of partial crop cover. Especially during the late maturity stage, the PT-FC and AA models consistently underestimate ET and provide the worst ET simulations. Overall, at the half-hourly time scale, the SW model is the best model and is recommended as the first choice for evaluating ET of spring maize in arid desert oasis areas.
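The residual-based statistics used for the ranking can be computed as in the following generic sketch, where the function name is illustrative; EF is taken to be the Nash-Sutcliffe efficiency and IA Willmott's index of agreement, which are the standard definitions of these scores:

```python
import numpy as np

def residual_stats(obs, sim):
    """Residual-based skill scores of a simulation against observations."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    res = obs - sim
    rmse = np.sqrt(np.mean(res ** 2))
    # Nash-Sutcliffe model efficiency: 1 is perfect, < 0 is worse than the mean
    ef = 1.0 - np.sum(res ** 2) / np.sum((obs - obs.mean()) ** 2)
    # Coefficient of determination from the obs-sim correlation
    r2 = np.corrcoef(obs, sim)[0, 1] ** 2
    # Willmott's index of agreement, bounded in [0, 1]
    ia = 1.0 - np.sum(res ** 2) / np.sum(
        (np.abs(sim - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    return {"RMSE": rmse, "R2": r2, "IA": ia, "EF": ef}
```

Unlike BME, these scores measure fit only; they carry no explicit complexity penalty, which is why agreement between the two rankings is a nontrivial result.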


2003 ◽  
Vol 15 (7) ◽  
pp. 1691-1714 ◽  
Author(s):  
Vladimir Cherkassky ◽  
Yunqian Ma

We discuss empirical comparison of analytical methods for model selection. Currently, there is no consensus on the best method for finite-sample estimation problems, even for the simple case of linear estimators. This article presents empirical comparisons between classical statistical methods, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), and the structural risk minimization (SRM) method based on Vapnik-Chervonenkis (VC) theory, for regression problems. Our study is motivated by the empirical comparisons in Hastie, Tibshirani, and Friedman (2001), which claim that the SRM method performs poorly for model selection and suggest that AIC yields superior predictive performance. Hence, we present empirical comparisons for various data sets and different types of estimators (linear, subset selection, and k-nearest-neighbor regression). Our results demonstrate the practical advantages of VC-based model selection; it consistently outperforms AIC for all data sets. In our study, the SRM and BIC methods show similar predictive performance. This discrepancy with Hastie et al. (2001), between empirical results obtained using the same data, is caused by methodological drawbacks in that study, especially the loose interpretation and application of the SRM method. Hence, we discuss methodological issues important for meaningful comparisons and for practical application of the SRM method. We also point out the importance of accurately estimating model complexity (VC dimension) for empirical comparisons and propose a new practical estimate of model complexity for k-nearest-neighbor regression.
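For reference, the AIC and BIC scores being compared can be sketched for an ordinary least-squares model with Gaussian errors, using the profile log-likelihood with the maximum-likelihood variance estimate RSS/n; the function name is illustrative:

```python
import numpy as np

def aic_bic_linear(y, X):
    """AIC and BIC for an OLS linear model with Gaussian errors."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    p = k + 1  # regression coefficients plus the noise variance
    # Profile log-likelihood at the ML estimates
    log_l = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    aic = 2 * p - 2 * log_l
    bic = p * np.log(n) - 2 * log_l
    return aic, bic
```

Since BIC's penalty grows with log n while AIC's is constant, BIC penalizes each extra parameter more heavily whenever n > e² ≈ 7.4, which is one reason the two criteria can rank candidate models differently.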


Entropy ◽  
2018 ◽  
Vol 20 (8) ◽  
pp. 575
Author(s):  
Trevor Herntier ◽  
Koffi Ihou ◽  
Anthony Smith ◽  
Anand Rangarajan ◽  
Adrian Peter

We consider the problem of model selection using the Minimum Description Length (MDL) criterion for distributions with parameters on the hypersphere. Model selection algorithms aim to find a compromise between goodness of fit and model complexity. Variables often considered in complexity penalties include the number of parameters, the sample size, and the shape of the parameter space, with the penalty term often referred to as stochastic complexity. Current model selection criteria either ignore the shape of the parameter space or incorrectly penalize the complexity of the model, largely because typical Laplace approximation techniques yield inaccurate results for curved spaces. We demonstrate how the use of a constrained Laplace approximation on the hypersphere yields a novel complexity measure that more accurately reflects the geometry of these spherical parameter spaces. We refer to this modified model selection criterion as spherical MDL. As a proof of concept, spherical MDL is used for bin selection in histogram density estimation, performing favorably against other model selection criteria.
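As a sketch of the proof-of-concept application, a generic two-part MDL score for histogram bin selection looks like the following. Note the (k/2) log n style penalty used here is the standard flat-space stochastic-complexity term, which the paper's spherical MDL replaces with a geometry-aware term; the function name and candidate range are illustrative:

```python
import numpy as np

def mdl_bins(data, candidates=range(2, 51)):
    """Pick a histogram bin count by a simple two-part MDL score
    (-log-likelihood plus a flat-space complexity penalty)."""
    data = np.asarray(data, float)
    n = data.size
    best_score, best_k = np.inf, None
    for k in candidates:
        counts, edges = np.histogram(data, bins=k)
        widths = np.diff(edges)
        nz = counts > 0  # empty bins contribute nothing to the likelihood
        # Maximized log-likelihood of the histogram density estimate
        log_l = np.sum(counts[nz] * np.log(counts[nz] / (n * widths[nz])))
        # k - 1 free bin probabilities, each costing (1/2) log n
        score = -log_l + 0.5 * (k - 1) * np.log(n)
        if score < best_score:
            best_score, best_k = score, k
    return best_k
```

Too few bins inflate the -log-likelihood term; too many bins inflate the penalty, so the minimizer sits at a data-driven compromise.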


Author(s):  
Martin Kerscher ◽  
Jochen Weller

We review some of the common methods for model selection: the goodness of fit, the likelihood ratio test, Bayesian model selection using Bayes factors, and the classical as well as the Bayesian information theoretic approaches. We illustrate these different approaches by comparing models for the expansion history of the Universe. In the discussion we highlight the premises and objectives entering these different approaches to model selection and finally recommend the information theoretic approach.


2014 ◽  
Vol 50 (12) ◽  
pp. 9484-9513 ◽  
Author(s):  
Anneli Schöniger ◽  
Thomas Wöhling ◽  
Luis Samaniego ◽  
Wolfgang Nowak

2008 ◽  
Vol 65 (11) ◽  
pp. 2389-2398 ◽  
Author(s):  
Yan Jiao ◽  
Richard Neves ◽  
Jess Jones

Appropriate inference of population status for endangered species is extremely important. Using a single model to estimate population growth rates is typically inadequate for assessing endangered species, because inferences based on only one "best" model ignore model uncertainty. In this study, the endangered dromedary pearlymussel (Dromus dromas) in the Clinch and Powell rivers of eastern Tennessee, USA, was used as an example to demonstrate the importance of multiple models, with consideration of environmental noise, for evaluating population growth. Our results showed that more than one model deserves consideration when making inferences about the population growth rate. A Bayesian model averaging approach was used to make inferences by weighting each model using the deviance information criterion. To test the uncertainty resulting from model selection and the efficiency of the Bayesian averaging approach, a simulation study was conducted on the dromedary pearlymussel populations; it showed that model selection uncertainty is very high. These results lead us to recommend Bayesian model averaging for assessing the population growth status of endangered species, balancing goodness-of-fit and selection uncertainty among alternative models.
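The DIC-based weighting used for the averaging can be sketched with the standard exp(-ΔDIC/2) form, analogous to Akaike weights; the function name is illustrative:

```python
import numpy as np

def dic_weights(dic_values):
    """Model-averaging weights from DIC values (smaller DIC = better model)."""
    d = np.asarray(dic_values, dtype=float)
    delta = d - d.min()          # DIC differences relative to the best model
    w = np.exp(-0.5 * delta)     # exp(-DIC/2), up to a common factor
    return w / w.sum()
```

A model-averaged growth-rate estimate is then the weight-sum of the per-model estimates, which propagates model selection uncertainty into the final inference instead of discarding it.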

