Approximate Bayesian computation for machine learning, inverse problems and big data

2017 ◽
Author(s):
Ali Mohammad-Djafari

2021 ◽
pp. 25-37
Author(s):
Manuel Chiachío-Ruano ◽
Juan Chiachío-Ruano ◽
María L. Jalón

2020 ◽  
Author(s):  
Marcelo Gehara ◽  
Guilherme G. Mazzochinni ◽  
Frank Burbrink

Abstract: Understanding population divergence involves testing diversification scenarios and estimating historical parameters, such as divergence time, population size and migration rate. There is, however, an immense space of possible, highly parameterized scenarios that are difficult or impossible to solve analytically. To overcome this problem, researchers have used alternative simulation-based approaches, such as approximate Bayesian computation (ABC) and supervised machine learning (SML), to approximate posterior probabilities of hypotheses. In this study we demonstrate the utility of our newly developed R package for simulating summary statistics to perform ABC and SML inferences. We compare the power of the ABC and SML methods and the influence of the number of loci on the accuracy of inferences, and we present three empirical examples: (i) genomic data from Muller's termite frog from South America; (ii) Sanger data from the cottonmouth and (iii) the copperhead snakes from North America. We found that SML is more efficient than ABC: it is generally more accurate and needs fewer simulations to perform an inference. We found support for a divergence model without migration, with a recent bottleneck for one of the populations of the South American frog. For the cottonmouth we found support for divergence with migration and recent expansion, and for the copperhead we found support for a model of divergence with migration and a recent bottleneck. Interestingly, by using an SML method it was possible to achieve high accuracy in model selection even when several models were compared in a single inference. We also found higher accuracy when inferring parameters with SML.
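The core ABC idea the abstract compares against SML can be illustrated with a minimal rejection sampler. This is a hedged toy sketch (a normal-mean model with the sample mean as summary statistic), not the authors' R package or their population-genetic simulations:

```python
import random
import statistics

# Toy ABC rejection sketch: infer the mean of a normal model from a
# single summary statistic (the sample mean). All names and settings
# here are illustrative, not from the paper.

random.seed(0)

def simulate(mu, n=100):
    """Simulate data under the model and return its summary statistic."""
    sample = [random.gauss(mu, 1.0) for _ in range(n)]
    return statistics.mean(sample)

observed_summary = simulate(2.0)   # stand-in for the observed data
tolerance = 0.1
accepted = []

for _ in range(5000):
    mu_proposed = random.uniform(-5, 5)                 # draw from the prior
    if abs(simulate(mu_proposed) - observed_summary) < tolerance:
        accepted.append(mu_proposed)                    # keep matching proposals

posterior_mean = statistics.mean(accepted)
```

An SML approach would instead train a classifier or regressor on the same (parameter, summary-statistic) simulations, which is why it typically needs fewer of them, as the abstract reports.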


2016 ◽  
Author(s):  
Emma Saulnier ◽  
Olivier Gascuel ◽  
Samuel Alizon

Abstract: Phylodynamics typically relies on likelihood-based methods to infer epidemiological parameters from dated phylogenies. These methods are essentially based on simple epidemiological models because of the difficulty of expressing the likelihood function analytically. Computing this function numerically raises additional challenges, especially for large phylogenies. Here, we use Approximate Bayesian Computation (ABC) to circumvent these problems. ABC is a likelihood-free method of parameter inference, based on simulation and comparison between target data and simulated data using summary statistics. We simulated target trees under several epidemiological scenarios in order to assess the accuracy of ABC methods for inferring epidemiological parameters such as the basic reproduction number (R0), the mean duration of infection, and the effective host population size. We designed many summary statistics to capture the information in a phylogeny and its corresponding lineage-through-time plot. We then used the simplest ABC method, called rejection, and its modern derivative, which complements rejection with regression-based adjustment of the posterior distribution. The availability of machine learning techniques, including variable selection, motivated us to compute many summary statistics on the phylogeny. We found that ABC-based inference reaches an accuracy comparable to that of likelihood-based methods for birth-death models and can even outperform existing methods for more refined models and large trees. By re-analysing data from the early stages of the recent Ebola epidemic in Sierra Leone, we also found that ABC provides more realistic estimates than the likelihood-based methods for some parameters. This work shows that the combination of ABC-based inference using many summary statistics and sophisticated machine learning methods able to perform variable selection is a promising approach to analyse large phylogenies and non-trivial models.


Author(s):  
Cecilia Viscardi ◽  
Michele Boreale ◽  
Fabio Corradi

Abstract: We consider the problem of sample degeneracy in Approximate Bayesian Computation. It arises when proposed values of the parameters, once given as input to the generative model, rarely lead to simulations resembling the observed data and are hence discarded. Such “poor” parameter proposals do not contribute at all to the representation of the parameter’s posterior distribution. This leads to a very large number of required simulations and/or a waste of computational resources, as well as to distortions in the computed posterior distribution. To mitigate this problem, we propose an algorithm, referred to as the Large Deviations Weighted Approximate Bayesian Computation algorithm, where, via Sanov’s Theorem, strictly positive weights are computed for all proposed parameters, thus avoiding the rejection step altogether. In order to derive a computable asymptotic approximation from Sanov’s result, we adopt the information theoretic “method of types” formulation of the method of Large Deviations, thus restricting our attention to models for i.i.d. discrete random variables. Finally, we experimentally evaluate our method through a proof-of-concept implementation.
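The structural change the abstract describes, replacing the accept/reject step with strictly positive weights so that no simulation is wasted, can be sketched as follows. Note the weight here is a simple Gaussian kernel on the summary distance, used purely as an illustrative stand-in for the paper's Sanov-based large-deviations weights, which are derived via the method of types for i.i.d. discrete data:

```python
import math
import random

# Illustrative weighted-ABC sketch: every proposal gets a positive
# weight instead of being accepted or rejected. The Gaussian kernel
# below is NOT the paper's large-deviations weight; it only shows the
# no-rejection structure. Toy Bernoulli model, illustrative settings.

random.seed(2)

def simulate(p, n=200):
    """Bernoulli(p) model; summary = observed frequency of ones."""
    return sum(random.random() < p for _ in range(n)) / n

s_obs = simulate(0.3)
epsilon = 0.05

weighted = []
for _ in range(2000):
    p = random.random()                          # uniform prior on [0, 1]
    w = math.exp(-((simulate(p) - s_obs) ** 2) / (2 * epsilon ** 2))
    weighted.append((p, w))                      # no proposal is discarded

total = sum(w for _, w in weighted)
posterior_mean = sum(p * w for p, w in weighted) / total
```

Because every simulation contributes, the effective sample size degrades gracefully as proposals get worse, instead of collapsing to an empty accepted set as in plain rejection ABC.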


2021 ◽  
Vol 62 (2) ◽  
Author(s):  
Jason D. Christopher ◽  
Olga A. Doronina ◽  
Dan Petrykowski ◽  
Torrey R. S. Hayden ◽  
Caelan Lapointe ◽  
...  
