Computationally Efficient Variational Approximations for Bayesian Inverse Problems

Author(s):  
Panagiotis Tsilifis ◽  
Ilias Bilionis ◽  
Ioannis Katsounaros ◽  
Nicholas Zabaras

The major drawback of the Bayesian approach to model calibration is the computational burden of characterizing the posterior distribution of the unknown model parameters, which arises because typical Markov chain Monte Carlo (MCMC) samplers require thousands of forward model evaluations. In this work, we develop a variational Bayesian approach to model calibration that uses an information-theoretic criterion to recast the posterior problem as an optimization problem. Specifically, we parameterize the posterior using the family of Gaussian mixtures and seek to minimize the information loss incurred by replacing the true posterior with an approximate one. Our approach is of particular importance in underdetermined problems with expensive forward models, in which neither the classical approach of minimizing a (potentially regularized) misfit function nor MCMC is a viable option. We test our methodology on two surrogate-free examples and show that it dramatically outperforms MCMC methods.
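The core idea — minimizing the Kullback–Leibler divergence from a parameterized approximation to the posterior — can be illustrated with a deliberately tiny sketch, not the authors' implementation: a single Gaussian (a one-component mixture) is fitted to a toy unnormalized posterior by minimizing a Monte Carlo estimate of the KL objective with common random numbers. All names and numbers are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

# Toy unnormalized log-posterior: a Gaussian bump at 2 with scale 0.5.
def log_post(x):
    return -0.5 * ((x - 2.0) / 0.5) ** 2

rng = np.random.default_rng(0)
eps = rng.standard_normal(2000)     # common random numbers (reparameterization)

def neg_elbo(theta):
    mu, log_sig = theta
    x = mu + np.exp(log_sig) * eps              # samples from q = N(mu, sig^2)
    entropy = 0.5 * np.log(2.0 * np.pi * np.e) + log_sig
    return -(log_post(x).mean() + entropy)      # = KL(q || p) up to a constant

res = minimize(neg_elbo, x0=[0.0, 0.0])
mu_hat, sig_hat = res.x[0], float(np.exp(res.x[1]))
```

Because the toy posterior is itself Gaussian with mean 2 and standard deviation 0.5, the optimizer recovers those values; a mixture with several components would be handled the same way, with one (mean, scale, weight) triple per component.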

Author(s):  
Nurliyana JUHAN ◽  
Yong Zulina ZUBAIRI ◽  
Zarina Mohd KHALID ◽  
Ahmad Syadi MAHMOOD ZUHDI

Background: Identifying risk factors associated with mortality is important in providing better prognosis to patients. Consistent with this, the Bayesian approach offers a great advantage in that it treats all model parameters as random quantities and hence can incorporate prior knowledge. We therefore aimed to develop a reliable model to identify risk factors associated with mortality among ST-Elevation Myocardial Infarction (STEMI) male patients using a Bayesian approach. Methods: A total of 7180 STEMI male patients from the National Cardiovascular Disease Database-Acute Coronary Syndrome (NCVD-ACS) registry for the years 2006-2013 were enrolled. A Bayesian Markov Chain Monte Carlo (MCMC) simulation approach was applied in the development of univariate and multivariate logistic regression models for the STEMI patients. The performance of the model was assessed through convergence diagnostics, overall model fit, model calibration and discrimination. Results: A set of six risk factors for cardiovascular death among STEMI male patients was identified from the Bayesian multivariate logistic model, namely age, diabetes mellitus, family history of CVD, Killip class, chronic lung disease and renal disease. Overall model fit, model calibration and discrimination were considered good for the proposed model. Conclusion: The Bayesian risk prediction model for STEMI male patients identified six risk factors associated with mortality. Among the highest risks were Killip class (OR=18.0), renal disease (OR=2.46) and age group (OR=2.43).
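As a hedged illustration of the kind of computation involved — not the registry analysis itself — the following sketch fits a Bayesian logistic regression with one hypothetical binary risk factor by random-walk Metropolis (a simple MCMC variant) and summarizes the slope as a posterior-mean odds ratio. The data are synthetic, with a true odds ratio of about 2.7.

```python
import numpy as np

# Synthetic data: one hypothetical binary risk factor, true log-odds ratio 1.
rng = np.random.default_rng(1)
n = 500
x = rng.integers(0, 2, n).astype(float)
eta_true = -2.0 + 1.0 * x
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta_true))).astype(float)

def log_post(b):                      # flat prior, so this is the log-likelihood
    eta = b[0] + b[1] * x
    return np.sum(y * eta - np.log1p(np.exp(eta)))

# Random-walk Metropolis over (intercept, slope).
beta, lp = np.zeros(2), log_post(np.zeros(2))
draws = []
for it in range(6000):
    prop = beta + 0.15 * rng.standard_normal(2)
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:
        beta, lp = prop, lp_prop
    if it >= 1000:                    # discard burn-in
        draws.append(beta.copy())

or_hat = float(np.exp(np.mean([d[1] for d in draws])))  # posterior-mean odds ratio
```

The same machinery extends to several covariates by widening `beta` and the proposal; convergence diagnostics, as in the paper, would be run on the stored draws.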


Author(s):  
Scott N. Walsh ◽  
Tim M. Wildey ◽  
John D. Jakeman

We consider the utilization of a computational model to guide the optimal acquisition of experimental data to inform the stochastic description of model input parameters. Our formulation is based on the recently developed consistent Bayesian approach for solving stochastic inverse problems, which seeks a posterior probability density that is consistent with the model and the data in the sense that the push-forward of the posterior (through the computational model) matches the observed density on the observations almost everywhere. Given a set of potential observations, our optimal experimental design (OED) seeks the observation, or set of observations, that maximizes the expected information gain from the prior probability density on the model parameters. We discuss the characterization of the space of observed densities and a computationally efficient approach for rescaling observed densities to satisfy the fundamental assumptions of the consistent Bayesian approach. Numerical results are presented to compare our approach with existing OED methodologies using the classical/statistical Bayesian approach and to demonstrate our OED on a set of representative partial differential equations (PDE)-based models.
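The consistent Bayesian update and the expected information gain (EIG) can be sketched in one dimension, where the answer is checkable analytically. Here the quantity-of-interest map is the identity and all densities are Gaussian, so the EIG equals KL(N(0.5, 0.1²) ‖ N(0, 1)) ≈ 1.93; a kernel density estimate stands in for the push-forward density. This is a toy illustration, not the paper's PDE setting.

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

rng = np.random.default_rng(2)
lam = rng.standard_normal(5000)              # samples from the prior N(0, 1)

def expected_info_gain(q_samples, obs):
    """EIG = KL(posterior || prior) under the consistent Bayesian update."""
    pf = gaussian_kde(q_samples)             # push-forward density of the prior
    r = obs.pdf(q_samples) / pf(q_samples)   # consistent-Bayes density ratio
    w = r / r.sum()                          # normalized weights ~ posterior
    return float(np.sum(w * np.log(np.maximum(r, 1e-300))))

obs = norm(0.5, 0.1)                         # hypothetical observed density
eig = expected_info_gain(lam, obs)           # QoI map here: the identity
```

An OED loop would evaluate `expected_info_gain` once per candidate observation map (each giving different `q_samples`) and keep the maximizer.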


Author(s):  
Marcello Pericoli ◽  
Marco Taboga

Abstract We propose a general method for the Bayesian estimation of a very broad class of non-linear no-arbitrage term-structure models. The main innovation we introduce is a computationally efficient method, based on deep learning techniques, for approximating no-arbitrage model-implied bond yields to any desired degree of accuracy. Once the pricing function is approximated, the posterior distribution of model parameters and unobservable state variables can be estimated by standard Markov Chain Monte Carlo methods. As an illustrative example, we apply the proposed techniques to the estimation of a shadow-rate model with a time-varying lower bound and unspanned macroeconomic factors.
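The two-stage workflow — approximate the expensive pricing function once, then run standard MCMC against the cheap surrogate — can be sketched as follows. A low-degree polynomial stands in for the paper's deep network, and a toy scalar `expensive_yield` stands in for the no-arbitrage pricing function; everything here is illustrative.

```python
import numpy as np

# Stand-in for an expensive model-implied yield function of one parameter.
def expensive_yield(theta):
    return np.exp(0.3 * theta)

# Stage 1: fit a cheap surrogate on a coarse design (polynomial here;
# the paper uses a deep network, but the workflow is identical).
design = np.linspace(-3.0, 3.0, 50)
coef = np.polyfit(design, expensive_yield(design), deg=7)
surrogate = lambda th: np.polyval(coef, th)

# Stage 2: standard random-walk Metropolis against the surrogate likelihood.
rng = np.random.default_rng(3)
y_obs, sigma = expensive_yield(1.2), 0.05    # synthetic observation at theta=1.2

def log_post(th):                            # surrogate likelihood + N(0, 3) prior
    return -0.5 * ((surrogate(th) - y_obs) / sigma) ** 2 - 0.5 * (th / 3.0) ** 2

theta, lp, draws = 0.0, log_post(0.0), []
for it in range(4000):
    prop = theta + 0.3 * rng.standard_normal()
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    if it >= 500:
        draws.append(theta)

theta_mean = float(np.mean(draws))
```

Once the surrogate is trained, each MCMC step costs a polynomial evaluation rather than a full pricing-model solve, which is the source of the computational savings.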


Water ◽  
2021 ◽  
Vol 13 (11) ◽  
pp. 1484
Author(s):  
Dagmar Dlouhá ◽  
Viktor Dubovský ◽  
Lukáš Pospíšil

We present an approach for calibrating the parameters of simplified evaporation models by optimizing them against the most complex model for evaporation estimation, the Penman–Monteith equation. This model computes evaporation from several input quantities, such as air temperature, wind speed, heat storage, and net radiation. However, not all of these values are always available, so simplified models must be used. Our interest in free-water-surface evaporation stems from the ongoing hydric reclamation of the former Ležáky–Most quarry, i.e., the ongoing restoration of the mined land to a natural and economically usable state. For emerging pit lakes, the prediction of evaporation and of the water level plays a crucial role. We examine the methodology on several popular models and standard statistical measures. The presented approach can be applied in a general model calibration process against any theoretical or measured evaporation series.
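A minimal sketch of such a calibration, with a synthetic reference series standing in for the Penman–Monteith values: a simplified Dalton-type wind-function model E = a(1 + b·u)·D is fitted by least squares. The coefficients used to generate the data (0.26, 0.54) echo Penman's classical wind function but are chosen here purely for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic reference series standing in for Penman–Monteith estimates.
rng = np.random.default_rng(4)
wind = rng.uniform(0.5, 5.0, 60)            # wind speed u [m/s]
vpd = rng.uniform(0.2, 2.0, 60)             # vapour-pressure deficit D [kPa]
e_ref = 0.26 * (1.0 + 0.54 * wind) * vpd    # "reference" evaporation [mm/day]

# Simplified Dalton-type model E = a (1 + b u) D, calibrated against e_ref.
def dalton(x, a, b):
    u, d = x
    return a * (1.0 + b * u) * d

(a_hat, b_hat), _ = curve_fit(dalton, (wind, vpd), e_ref, p0=[0.1, 0.1])
```

Replacing `e_ref` with any other theoretical or measured evaporation series, and `dalton` with another simplified model, reproduces the general calibration process described above.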


2020 ◽  
Vol 70 (1) ◽  
pp. 145-161 ◽  
Author(s):  
Marnus Stoltz ◽  
Boris Baeumer ◽  
Remco Bouckaert ◽  
Colin Fox ◽  
Gordon Hiscott ◽  
...  

Abstract We describe a new and computationally efficient Bayesian methodology for inferring species trees and demographics from unlinked binary markers. Likelihood calculations are carried out using diffusion models of allele frequency dynamics combined with novel numerical algorithms. The diffusion approach allows for analysis of data sets containing hundreds or thousands of individuals. The method, which we call Snapper, has been implemented as part of the BEAST2 package. We conducted simulation experiments to assess numerical error, computational requirements, and accuracy in recovering known model parameters. A reanalysis of soybean SNP data demonstrates that the models implemented in Snapp and Snapper can be difficult to distinguish in practice, a characteristic which we tested with further simulations. We demonstrate the scale of analysis possible using a SNP data set sampled from 399 freshwater turtles in 41 populations. [Bayesian inference; diffusion models; multi-species coalescent; SNP data; species trees; spectral methods.]
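The diffusion viewpoint can be illustrated with a toy check (not Snapper's spectral machinery): under pure Wright–Fisher drift, the variance of an allele frequency after t generations has the closed form p₀(1−p₀)(1−(1−1/2N)ᵗ), which a direct binomial-resampling simulation should reproduce.

```python
import numpy as np

# Pure Wright–Fisher drift: p_{t+1} | p_t ~ Binomial(2N, p_t) / 2N.
# Closed-form prediction: Var(p_t) = p0 (1 - p0) (1 - (1 - 1/(2N))**t).
rng = np.random.default_rng(5)
N, p0, t, reps = 200, 0.3, 50, 20000
p = np.full(reps, p0)
for _ in range(t):
    p = rng.binomial(2 * N, p) / (2 * N)   # binomial resampling each generation

pred = p0 * (1 - p0) * (1 - (1 - 1 / (2 * N)) ** t)
```

Diffusion-based likelihood methods replace such replicate simulation with a direct numerical solution of the allele-frequency density, which is what makes data sets of hundreds of individuals tractable.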


2013 ◽  
Vol 135 (12) ◽  
Author(s):  
Arun V. Kolanjiyil ◽  
Clement Kleinstreuer

This is the second article of a two-part paper, combining high-resolution computer simulation results of inhaled nanoparticle deposition in a human airway model (Kolanjiyil and Kleinstreuer, 2013, “Nanoparticle Mass Transfer From Lung Airways to Systemic Regions—Part I: Whole-Lung Aerosol Dynamics,” ASME J. Biomech. Eng., 135(12), p. 121003) with a new multicompartmental model for insoluble nanoparticle barrier mass transfer into systemic regions. Specifically, it allows for the prediction of temporal nanoparticle accumulation in the blood and lymphatic systems and in organs. The multicompartmental model parameters were determined from experimental retention and clearance data in rat lungs, and the validated model was then applied to humans based on pharmacokinetic cross-species extrapolation. This hybrid simulator is a computationally efficient tool to predict nanoparticle kinetics in the human body. The study provides critical insight into nanomaterial deposition and distribution from the lungs to systemic regions. The quantitative results are useful in diverse fields such as toxicology, for exposure-risk analysis of ubiquitous nanomaterials, and pharmacology, for nanodrug development and targeting.
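A multicompartmental barrier-transfer model of this kind reduces to a small linear ODE system. The sketch below uses three compartments (lung, blood, organs) with first-order rate constants; the rate values are illustrative placeholders, not the fitted rat-lung parameters.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Three-compartment sketch: lung -> blood -> organs, plus clearance from
# blood; first-order rates in 1/day (illustrative placeholder values).
k_lb, k_bo, k_cl = 0.10, 0.50, 0.05

def rhs(t, y):
    lung, blood, organ = y
    return [-k_lb * lung,
            k_lb * lung - (k_bo + k_cl) * blood,
            k_bo * blood]

# Unit nanoparticle burden deposited in the lung at t = 0, run for 30 days.
sol = solve_ivp(rhs, (0.0, 30.0), [1.0, 0.0, 0.0])
lung, blood, organ = sol.y[:, -1]
```

Adding compartments (lymphatics, individual organs) just widens the state vector and rate matrix; fitting the rates to retention and clearance data is what turns the sketch into a predictive simulator.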


1993 ◽  
Vol 28 (11-12) ◽  
pp. 163-171 ◽  
Author(s):  
Weibo (Weber) Yuan ◽  
David Okrent ◽  
Michael K. Stenstrom

A model calibration algorithm is developed for the high-purity oxygen activated sludge process (HPO-ASP). The algorithm is evaluated under different conditions to determine the effect of the following factors on its performance: data quality, number of observations, and number of parameters to be estimated. The process model used in this investigation is the first HPO-ASP model based upon the IAWQ (formerly IAWPRC) Activated Sludge Model No. 1. The objective function is formulated as a relative least-squares function, and the non-linear, constrained minimization problem is solved by the Complex method. The stoichiometric and kinetic coefficients of the IAWQ activated sludge model are the parameters focused on in this investigation. The observations used are generated numerically but are made close to observations from a full-scale high-purity oxygen treatment plant. The calibration algorithm is capable of correctly estimating model parameters even if the observations are severely noise-corrupted. The accuracy of estimation deteriorates gradually as observation errors increase. The accuracy of calibration improves as the number of observations (n) increases, but the improvement becomes insignificant when n>96. It is also found that there exists an optimal number of parameters that can be rigorously estimated from a given set of information/data. A sensitivity analysis is conducted to determine which parameters to estimate and to evaluate the potential benefits resulting from collecting additional measurements.
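The relative least-squares formulation can be sketched on a toy process model. Below, a Monod-type rate with two kinetic parameters stands in for the IAWQ model, Nelder–Mead (another derivative-free simplex-style optimizer) stands in for the Complex method, and the synthetic observations carry 2% relative noise; all numbers are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-in for the process model: a Monod-type rate with two kinetic
# parameters (mu_max, K_s); illustrative, not the IAWQ Model No. 1 itself.
def model(params, s):
    mu_max, k_s = params
    return mu_max * s / (k_s + s)

s = np.array([25.0, 50.0, 100.0, 200.0, 400.0])   # substrate levels [mg/L]
rng = np.random.default_rng(6)
obs = model([4.0, 60.0], s) * (1 + 0.02 * rng.standard_normal(s.size))

# Relative least-squares objective, as in the paper's formulation.
def objective(params):
    return np.sum(((model(params, s) - obs) / obs) ** 2)

res = minimize(objective, x0=[1.0, 20.0], method="Nelder-Mead")
mu_hat, ks_hat = res.x
```

Dividing each residual by the observation equalizes the influence of variables with very different magnitudes, which is the point of the relative (rather than absolute) least-squares objective.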


2016 ◽  
Vol 14 (03) ◽  
pp. 1650007 ◽  
Author(s):  
Matthias Gerstgrasser ◽  
Sarah Nicholls ◽  
Michael Stout ◽  
Katherine Smart ◽  
Chris Powell ◽  
...  

Biolog phenotype microarrays (PMs) enable simultaneous, high-throughput analysis of cell cultures in different environments. The output is high-density time-course data showing redox curves (approximating growth) for each experimental condition. The software provided with the Omnilog incubator/reader summarizes each time-course as a single datum, so most of the information is not used. However, the time courses can be extremely varied and often contain detailed qualitative (shape of curve) and quantitative (values of parameters) information. We present a novel, Bayesian approach to estimating parameters from Phenotype Microarray data, fitting growth models using Markov Chain Monte Carlo (MCMC) methods to enable high-throughput estimation of important information, including length of lag phase, maximal “growth” rate and maximum output. We find that the Baranyi model for microbial growth is useful for fitting Biolog data. Moreover, we introduce a new growth model that allows for diauxic growth with a lag phase, which is particularly useful where Phenotype Microarrays have been applied to cells grown in complex mixtures of substrates, for example in industrial or biotechnological applications, such as worts in brewing. Our approach provides more useful information from Biolog data than existing competing methods, and allows for valuable comparisons between data series and across different models.
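The Baranyi model mentioned above has a standard closed form in log-counts, with an adjustment function A(t) encoding the lag (h0 = μmax · lag). The sketch below generates noisy synthetic data and recovers the parameters by least squares — a simple stand-in for the paper's MCMC fitting; all parameter values are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

# Baranyi–Roberts growth model in log-counts; h0 = mu * lag.
def baranyi(t, y0, ymax, mu, h0):
    a = t + np.log(np.exp(-mu * t) + np.exp(-h0) - np.exp(-mu * t - h0)) / mu
    return y0 + mu * a - np.log1p((np.exp(mu * a) - 1.0) / np.exp(ymax - y0))

# Synthetic "redox curve": y0=2, ymax=9, mu=0.5/h, lag=3 h (h0=1.5).
rng = np.random.default_rng(7)
t = np.linspace(0.0, 24.0, 40)
y = baranyi(t, 2.0, 9.0, 0.5, 1.5) + 0.05 * rng.standard_normal(t.size)

popt, _ = curve_fit(baranyi, t, y, p0=[1.5, 8.0, 0.3, 1.0],
                    bounds=([0, 0, 0.01, 0.01], [5, 15, 2, 5]))
y0_hat, ymax_hat, mu_hat, h0_hat = popt   # lag phase = h0_hat / mu_hat
```

An MCMC fit, as in the paper, would sample these same four parameters and report posterior distributions rather than point estimates, enabling the comparisons between data series described above.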


2014 ◽  
Vol 7 (1) ◽  
pp. 1535-1600
Author(s):  
M. Scherstjanoi ◽  
J. O. Kaplan ◽  
H. Lischke

Abstract. To be able to simulate climate change effects on forest dynamics over the whole of Switzerland, we adapted the second-generation DGVM LPJ-GUESS to the Alpine environment. We modified model functions, tuned model parameters, and implemented new tree species to represent the potential natural vegetation of Alpine landscapes. Furthermore, we increased the computational efficiency of the model to enable area-covering simulations at a fine resolution (1 km) sufficient for the complex topography of the Alps, which resulted in more than 32 000 simulation grid cells. To this aim, we applied the recently developed method GAPPARD (Scherstjanoi et al., 2013) to LPJ-GUESS. GAPPARD derives mean output values from a combination of simulation runs without disturbances and a patch age distribution defined by the disturbance frequency. With this computationally efficient method, which increased the model's speed by a factor of approximately 8, we were able to detect shortcomings of LPJ-GUESS functions and parameters more quickly. We used the adapted LPJ-GUESS together with GAPPARD to assess the influence of one climate change scenario on the dynamics of tree species composition and biomass throughout the 21st century in Switzerland. To allow for comparison with the original model, we additionally simulated forest dynamics along a north-south transect through Switzerland. The results from this transect confirmed the high value of the GAPPARD method, despite some limitations in representing extreme climatic events. It allowed us, for the first time, to obtain area-wide, detailed, high-resolution LPJ-GUESS simulation results for a large part of the Alpine region.
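The GAPPARD idea — replacing many stochastically disturbed patch simulations with one disturbance-free run averaged over an exponential patch-age distribution — can be sketched in a few lines. The biomass trajectory and return interval below are illustrative toys, not LPJ-GUESS output.

```python
import numpy as np

# GAPPARD-style averaging (sketch): combine one disturbance-free run b(a)
# with an exponential patch-age distribution whose mean is the disturbance
# return interval T.
T = 100.0                                   # return interval [years]
age = np.arange(0.0, 1000.0)                # patch ages [years], 1-yr steps
b = 250.0 * (1.0 - np.exp(-age / 80.0))     # toy biomass trajectory [t/ha]
f = np.exp(-age / T) / T                    # exponential patch-age density
mean_biomass = float(np.sum(b * f))         # landscape-scale expectation
# analytic value for this toy trajectory: 250 * (1 - 80/180) ≈ 138.9
```

One undisturbed trajectory thus yields the landscape mean for any disturbance frequency by reweighting, which is where the roughly eightfold speed-up comes from.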

