Approximate Bayesian computation for machine learning, inverse problems and big data

2017 ◽
Author(s):
Ali Mohammad-Djafari

2021 ◽
pp. 25-37
Author(s):
Manuel Chiachío-Ruano ◽
Juan Chiachío-Ruano ◽
María L. Jalón

2020 ◽  
Author(s):  
Marcelo Gehara ◽  
Guilherme G. Mazzochinni ◽  
Frank Burbrink

Abstract: Understanding population divergence involves testing diversification scenarios and estimating historical parameters, such as divergence time, population size and migration rate. There is, however, an immense space of possible, highly parameterized scenarios that are difficult or impossible to solve analytically. To overcome this problem, researchers have used alternative simulation-based approaches, such as approximate Bayesian computation (ABC) and supervised machine learning (SML), to approximate posterior probabilities of hypotheses. In this study we demonstrate the utility of our newly developed R package for simulating summary statistics to perform ABC and SML inferences. We compare the power of the ABC and SML methods and the influence of the number of loci on the accuracy of inferences, and we present three empirical examples: (i) genomic data from Muller's termite frog from South America; (ii) Sanger data from the cottonmouth and (iii) the copperhead snakes from North America. We found that SML is more efficient than ABC: it is generally more accurate and needs fewer simulations to perform an inference. We found support for a divergence model without migration, with a recent bottleneck for one of the populations of the South American frog. For the cottonmouth we found support for divergence with migration and recent expansion, and for the copperhead we found support for a model of divergence with migration and a recent bottleneck. Interestingly, by using an SML method it was possible to achieve high accuracy in model selection even when several models were compared in a single inference. We also found higher accuracy when inferring parameters with SML.
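The core ABC idea the abstract compares against SML can be illustrated with a minimal rejection sampler. This is a hedged toy sketch (a normal-mean model with the sample mean as summary statistic), not the authors' R package or their population-genetic simulations:

```python
import random
import statistics

# Toy ABC rejection sketch: infer the mean of a normal model from a
# single summary statistic (the sample mean). All names and settings
# here are illustrative, not from the paper.

random.seed(0)

def simulate(mu, n=100):
    """Simulate data under the model and return its summary statistic."""
    sample = [random.gauss(mu, 1.0) for _ in range(n)]
    return statistics.mean(sample)

observed_summary = simulate(2.0)   # stand-in for the observed data
tolerance = 0.1
accepted = []

for _ in range(5000):
    mu_proposed = random.uniform(-5, 5)                 # draw from the prior
    if abs(simulate(mu_proposed) - observed_summary) < tolerance:
        accepted.append(mu_proposed)                    # keep matching proposals

posterior_mean = statistics.mean(accepted)
```

An SML approach would instead train a classifier or regressor on the same (parameter, summary-statistic) simulations, which is why it typically needs fewer of them, as the abstract reports.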


2016 ◽  
Author(s):  
Emma Saulnier ◽  
Olivier Gascuel ◽  
Samuel Alizon

Abstract: Phylodynamics typically relies on likelihood-based methods to infer epidemiological parameters from dated phylogenies. These methods are essentially based on simple epidemiological models because of the difficulty of expressing the likelihood function analytically. Computing this function numerically raises additional challenges, especially for large phylogenies. Here, we use Approximate Bayesian Computation (ABC) to circumvent these problems. ABC is a likelihood-free method of parameter inference, based on simulation and comparison between target data and simulated data using summary statistics. We simulated target trees under several epidemiological scenarios in order to assess the accuracy of ABC methods for inferring epidemiological parameters such as the basic reproduction number (R0), the mean duration of infection, and the effective host population size. We designed many summary statistics to capture the information in a phylogeny and its corresponding lineage-through-time plot. We then used the simplest ABC method, called rejection, and its modern derivative, which complements rejection with regression-based adjustment of the posterior distribution. The availability of machine learning techniques, including variable selection, motivated us to compute many summary statistics on the phylogeny. We found that ABC-based inference reaches an accuracy comparable to that of likelihood-based methods for birth-death models and can even outperform existing methods for more refined models and large trees. By re-analysing data from the early stages of the recent Ebola epidemic in Sierra Leone, we also found that ABC provides more realistic estimates than the likelihood-based methods for some parameters. This work shows that the combination of ABC-based inference using many summary statistics and sophisticated machine learning methods able to perform variable selection is a promising approach to analyse large phylogenies and non-trivial models.


Author(s):  
Cecilia Viscardi ◽  
Michele Boreale ◽  
Fabio Corradi

Abstract: We consider the problem of sample degeneracy in Approximate Bayesian Computation. It arises when proposed values of the parameters, once given as input to the generative model, rarely lead to simulations resembling the observed data and are hence discarded. Such “poor” parameter proposals do not contribute at all to the representation of the parameter’s posterior distribution. This leads to a very large number of required simulations and/or a waste of computational resources, as well as to distortions in the computed posterior distribution. To mitigate this problem, we propose an algorithm, referred to as the Large Deviations Weighted Approximate Bayesian Computation algorithm, where, via Sanov’s Theorem, strictly positive weights are computed for all proposed parameters, thus avoiding the rejection step altogether. In order to derive a computable asymptotic approximation from Sanov’s result, we adopt the information theoretic “method of types” formulation of the method of Large Deviations, thus restricting our attention to models for i.i.d. discrete random variables. Finally, we experimentally evaluate our method through a proof-of-concept implementation.
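The structural change the abstract describes, replacing the accept/reject step with strictly positive weights so that no simulation is wasted, can be sketched as follows. Note the weight here is a simple Gaussian kernel on the summary distance, used purely as an illustrative stand-in for the paper's Sanov-based large-deviations weights, which are derived via the method of types for i.i.d. discrete data:

```python
import math
import random

# Illustrative weighted-ABC sketch: every proposal gets a positive
# weight instead of being accepted or rejected. The Gaussian kernel
# below is NOT the paper's large-deviations weight; it only shows the
# no-rejection structure. Toy Bernoulli model, illustrative settings.

random.seed(2)

def simulate(p, n=200):
    """Bernoulli(p) model; summary = observed frequency of ones."""
    return sum(random.random() < p for _ in range(n)) / n

s_obs = simulate(0.3)
epsilon = 0.05

weighted = []
for _ in range(2000):
    p = random.random()                          # uniform prior on [0, 1]
    w = math.exp(-((simulate(p) - s_obs) ** 2) / (2 * epsilon ** 2))
    weighted.append((p, w))                      # no proposal is discarded

total = sum(w for _, w in weighted)
posterior_mean = sum(p * w for p, w in weighted) / total
```

Because every simulation contributes, the effective sample size degrades gracefully as proposals get worse, instead of collapsing to an empty accepted set as in plain rejection ABC.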


2021 ◽  
Vol 62 (2) ◽  
Author(s):  
Jason D. Christopher ◽  
Olga A. Doronina ◽  
Dan Petrykowski ◽  
Torrey R. S. Hayden ◽  
Caelan Lapointe ◽  
...  
