Adaptive Tree Proposals for Bayesian Phylogenetic Inference

2019 ◽  
Author(s):  
X. Meyer

Abstract Bayesian inference of phylogenies with MCMC is without a doubt a staple in the study of evolution. Yet, this method still suffers from a practical challenge identified more than two decades ago: designing tree topology proposals that efficiently sample the tree space. In this article, I introduce the concept of tree topology proposals that adapt to the posterior distribution as it is estimated. I use this concept to elaborate two adaptive variants of existing proposals and an adaptive proposal based on a novel design philosophy in which the structure of the proposal is informed by the posterior distribution of trees. I investigate the performance of these proposals by first presenting a metric that captures the performance of each proposal within a mixture. Using this metric, I then compare the performance of the adaptive proposals to that of standard and parsimony-guided proposals on 11 empirical datasets. Using adaptive proposals led to consistent performance gains and resulted in up to 18-fold increases in mixing efficiency and 6-fold increases in convergence rate without increasing the computational cost of these analyses. [Bayesian inference; adaptive tree proposals; Markov chain Monte Carlo; phylogenetics; posterior probability distribution.]

2021 ◽  
Author(s):  
X Meyer

Abstract Bayesian inference of phylogeny with MCMC plays a key role in the study of evolution. Yet, this method still suffers from a practical challenge identified more than two decades ago: designing tree topology proposals that efficiently sample tree spaces. In this article, I introduce the concept of adaptive tree proposals for unrooted topologies, that is, tree proposals that adapt to the posterior distribution as it is estimated. I use this concept to elaborate two adaptive variants of existing proposals and an adaptive proposal based on a novel design philosophy in which the structure of the proposal is informed by the posterior distribution of trees. I investigate the performance of these proposals by first presenting a metric that captures the performance of each proposal within a mixture of proposals. Using this metric, I compare the performance of the adaptive proposals to the performance of standard and parsimony-guided proposals on 11 empirical datasets. Using adaptive proposals led to consistent performance gains and resulted in up to 18-fold increases in mixing efficiency and 6-fold increases in convergence rate without increasing the computational cost of these analyses.
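The adaptive idea can be sketched in a few lines: a mixture of topology proposals whose selection weights are updated from their observed acceptance behavior. The following toy is a hypothetical illustration, not the paper's actual proposals or adaptation rule; the proposal names (NNI, SPR) and the acceptance-rate reweighting scheme are my own assumptions.

```python
import random

# Illustrative sketch: a mixture of tree-topology proposals whose selection
# weights adapt to observed acceptance rates as the chain runs. Pseudo-counts
# keep every proposal selectable even after a run of rejections.

class AdaptiveProposalMixture:
    def __init__(self, names):
        self.names = names
        self.accepted = {n: 1 for n in names}    # pseudo-count of acceptances
        self.attempted = {n: 2 for n in names}   # pseudo-count of attempts

    def weights(self):
        """Normalized selection weights proportional to acceptance rates."""
        raw = {n: self.accepted[n] / self.attempted[n] for n in self.names}
        total = sum(raw.values())
        return {n: w / total for n, w in raw.items()}

    def pick(self, rng):
        """Draw the next proposal to apply, according to the current weights."""
        w = self.weights()
        return rng.choices(self.names, weights=[w[n] for n in self.names])[0]

    def record(self, name, was_accepted):
        """Update the counts after one Metropolis-Hastings step."""
        self.attempted[name] += 1
        if was_accepted:
            self.accepted[name] += 1
```

Under this scheme, a proposal whose moves are consistently rejected gradually loses weight in the mixture, which is one simple way a sampler can "adapt to the posterior distribution as it is estimated."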


2019 ◽  
Vol 19 (1) ◽  
pp. 36-45
Author(s):  
Johnny van Doorn ◽  
Dora Matzke ◽  
Eric-Jan Wagenmakers

Sir Ronald Fisher's venerable experiment "The Lady Tasting Tea'' is revisited from a Bayesian perspective. We demonstrate how a similar tasting experiment, conducted in a classroom setting, can familiarize students with several key concepts of Bayesian inference, such as the prior distribution, the posterior distribution, the Bayes factor, and sequential analysis.
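The Bayes factor for such a tasting experiment has a simple closed form. The sketch below is my own toy illustration (not the authors' classroom materials): it compares a point null θ = 1/2 ("guessing") against a Beta(a, b) prior on the probability of a correct identification; with a uniform prior, the binomial coefficient cancels between the two marginal likelihoods.

```python
import math

# Beta-binomial Bayes factor for a tasting experiment (illustrative sketch):
# H0: theta = 0.5 (guessing) vs H1: theta ~ Beta(a, b).

def log_beta(x, y):
    """log of the Beta function via log-gamma, for numerical stability."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def bayes_factor_10(k, n, a=1.0, b=1.0):
    """BF in favor of H1 for k correct identifications out of n cups."""
    log_m1 = log_beta(k + a, n - k + b) - log_beta(a, b)  # marginal under H1
    log_m0 = n * math.log(0.5)                            # likelihood under H0
    return math.exp(log_m1 - log_m0)

bf = bayes_factor_10(8, 8)   # 8 correct cups out of 8 -> 256/9, about 28.4
```

Calling `bayes_factor_10` after each cup also gives a natural classroom demonstration of sequential analysis: the evidence can be monitored as the data accumulate.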


2020 ◽  
Author(s):  
Sebastian Höhna ◽  
Allison Y. Hsiang

Abstract The ideal approach to Bayesian phylogenetic inference is to estimate all parameters of interest jointly in a single hierarchical model. However, this is often not feasible in practice due to the high computational cost that would be incurred. Instead, phylogenetic pipelines generally consist of chained analyses, whereby a single point estimate from a given analysis is used as input for the next analysis in the chain (e.g., a single multiple sequence alignment is used to estimate a gene tree). In this framework, uncertainty is not propagated from step to step in the chain, which can lead to inaccurate or spuriously certain results. Here, we formally develop and test the stepwise approach to Bayesian inference, which uses importance sampling to generate observations for the next step of an analysis pipeline from the posterior produced in the previous step. We show that this approach is identical to the joint approach given sufficient information in the data and in the importance sample. This is demonstrated using both a toy example and an analysis pipeline for inferring divergence times using a relaxed clock model. The stepwise approach presented here not only accounts for uncertainty between analysis steps, but also allows for greater flexibility in program choice (and hence model availability) and can be more computationally efficient than the traditional joint approach when multiple models are being tested.
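The stepwise idea can be illustrated with self-normalized importance sampling on a toy Gaussian pipeline (my own example, not the authors' divergence-time application): draws from a step-1 posterior serve as the importance sample, and reweighting them by the step-2 likelihood targets the joint posterior, which in this toy case has a known closed form for checking.

```python
import math
import random

# Toy two-step pipeline: step-1 posterior is N(2, 1) for a mean parameter;
# step 2 adds three observations with unit-variance Gaussian likelihood.
# Reweighting the step-1 draws by the step-2 likelihood approximates the
# joint posterior (analytically N(2.7, 0.25) here).

rng = random.Random(1)
step1_draws = [rng.gauss(2.0, 1.0) for _ in range(50000)]  # step-1 "posterior"

y2 = [2.8, 3.1, 2.9]                                       # step-2 observations

def log_lik(theta):
    return -0.5 * sum((y - theta) ** 2 for y in y2)

log_w = [log_lik(t) for t in step1_draws]
m = max(log_w)
w = [math.exp(lw - m) for lw in log_w]                     # stabilized weights

# Self-normalized importance-sampling estimate of the joint-posterior mean.
post_mean = sum(wi * t for wi, t in zip(w, step1_draws)) / sum(w)
```

The same reweighted sample could be resampled to provide input draws for a hypothetical third step, which is the sense in which uncertainty propagates along the chain instead of collapsing to a point estimate.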


Author(s):  
Waad Subber ◽  
Sayan Ghosh ◽  
Piyush Pandita ◽  
Yiming Zhang ◽  
Liping Wang

Industrial dynamical systems often exhibit multi-scale responses due to material heterogeneities, operating conditions, and complex environmental loadings. In such problems, the smallest length-scale of the system's dynamics controls the numerical resolution required to effectively resolve the embedded physics. In practice, however, high numerical resolution is only required in a confined region of the system where fast dynamics or localized material variability are exhibited, whereas a coarser discretization can be sufficient in the remainder of the system. Consequently, a unified computational scheme with uniform spatio-temporal resolution for uncertainty quantification can be very computationally demanding. Partitioning the complex dynamical system into smaller, easier-to-solve problems based on the localized dynamics and material variability can reduce the overall computational cost. However, identifying the region of interest for high-resolution and intensive uncertainty quantification can be problem dependent. The region of interest can be specified based on the localization features of the solution, user interest, and the correlation length of the random material properties. For problems where a region of interest is not evident, Bayesian inference can provide a feasible solution. In this work, we employ a Bayesian framework to update our prior knowledge of the localized region of interest using measurements of the system response. To address the computational cost of the Bayesian inference, we construct a Gaussian process surrogate for the forward model. Once the localized region of interest is identified, we use polynomial chaos expansion to propagate the localization uncertainty. We demonstrate our framework through numerical experiments on a three-dimensional elastodynamic problem.


2021 ◽  
Author(s):  
Russell T. Johnson ◽  
Daniel Lakeland ◽  
James M. Finley

Background: Musculoskeletal modeling is currently a preferred method for estimating the muscle forces that underlie observed movements. However, these estimates are sensitive to a variety of assumptions and uncertainties, which creates difficulty when trying to interpret the muscle forces from musculoskeletal simulations. Here, we describe an approach that uses Bayesian inference to identify plausible ranges of muscle forces for a simple motion while representing uncertainty in the measurement of the motion and the objective function used to solve the muscle redundancy problem. Methods: We generated a reference elbow flexion-extension motion by simulating a set of muscle excitation signals derived from the computed muscle control tool built into OpenSim. We then used a Markov Chain Monte Carlo (MCMC) algorithm to sample from a posterior probability distribution of muscle excitations that would result in the reference elbow motion trajectory. We constructed a prior over the excitation parameters which down-weighted regions of the parameter space with greater muscle excitations. We used muscle excitations to find the corresponding kinematics using OpenSim, where the error in position and velocity trajectories (likelihood function) was combined with the sum of the cubed muscle excitations integrated over time (prior function) to compute the posterior probability density. Results: We evaluated the muscle forces that resulted from the set of excitations that were visited in the MCMC chain (five parallel chains, 450,000 iterations per chain, runtime = 71 hours). The estimated muscle forces compared favorably with the reference motion from computed muscle control, while the elbow angle and velocity from MCMC matched closely with the reference with an average RMSE for angle and velocity equal to 0.008° and 0.18°/s, respectively. 
However, our rank plot analysis and potential scale reduction statistics, which we used to evaluate convergence of the algorithm, indicated that the parallel chains did not fully mix. Conclusions: While the results from this process are a promising step towards characterizing uncertainty in muscle force estimation, the computational time required to search the solution space and the lack of MCMC convergence indicate that further developments in MCMC algorithms are necessary for this process to become feasible for larger-scale models.
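The structure of the sampler described above, a likelihood penalizing tracking error combined with a prior penalizing cubed excitations, can be mimicked in a one-parameter random-walk Metropolis toy. Everything below (the target value, noise scale, excitation bounds) is a hypothetical stand-in for the OpenSim-based model, not the authors' code.

```python
import math
import random

# One-parameter random-walk Metropolis sketch: the log-posterior combines a
# quadratic tracking-error term (likelihood) with a cubed-excitation penalty
# (prior), echoing the structure of the muscle-redundancy sampler.

def log_post(e, target=0.5, sigma=0.05):
    if not 0.0 <= e <= 1.0:          # excitations are bounded to [0, 1]
        return -math.inf
    log_lik = -0.5 * ((e - target) / sigma) ** 2   # tracking error
    log_prior = -(e ** 3)                          # penalize large excitations
    return log_lik + log_prior

rng = random.Random(0)
e, samples = 0.5, []
for i in range(20000):
    prop = e + rng.gauss(0.0, 0.1)                 # random-walk proposal
    if math.log(rng.random()) < log_post(prop) - log_post(e):
        e = prop                                   # Metropolis accept
    if i >= 5000:                                  # discard burn-in
        samples.append(e)

post_mean = sum(samples) / len(samples)
```

Even in one dimension, the competition between the two terms is visible: the cubed-excitation prior pulls the posterior mean slightly below the value the likelihood alone would favor.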


Author(s):  
Markku Kuismin ◽  
Mikko J Sillanpää

Abstract
Motivation: Graphical lasso (Glasso) is a widely used tool for identifying gene regulatory networks in systems biology. However, its computational efficiency depends on the choice of regularization parameter (tuning parameter), and selecting this parameter can be highly time consuming. Although fully Bayesian implementations of Glasso alleviate this problem somewhat by specifying a prior distribution for the parameter, these approaches lack the scalability of their frequentist counterparts.
Results: Here, we present a new Monte Carlo Penalty Selection method (MCPeSe), a computationally efficient approach to regularization parameter selection for Glasso. MCPeSe combines the scalability and low computational cost of the frequentist Glasso with the ability to choose the regularization automatically, as in Bayesian Glasso modeling. MCPeSe provides a state-of-the-art ‘tuning-free’ model selection criterion for Glasso and allows exploration of the posterior probability distribution of the tuning parameter.
Availability and implementation: R source code of MCPeSe, a step-by-step example showing how to apply MCPeSe, and a collection of scripts used to prepare the material in this article are publicly available at GitHub under GPL (https://github.com/markkukuismin/MCPeSe/).
Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Edward P. Herbst ◽  
Frank Schorfheide

This chapter provides a self-contained review of Bayesian inference and decision making. It begins with a discussion of Bayesian inference for a simple autoregressive (AR) model, which takes the form of a Gaussian linear regression. For this model, the posterior distribution can be characterized analytically and closed-form expressions for its moments are readily available. The chapter also examines how to turn posterior distributions into point estimates, interval estimates, forecasts, and how to solve general decision problems. The chapter shows how in a Bayesian setting, the calculus of probability is used to characterize and update an individual's state of knowledge or degree of beliefs with respect to quantities such as model parameters or future observations.
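For the AR model the chapter discusses, the posterior is easiest to write down in the simplest case: an AR(1) coefficient with known noise variance and a Gaussian prior, where the update is the standard conjugate Gaussian-regression formula. The sketch below is a minimal illustration of that closed form on simulated data, not the chapter's own code.

```python
import random

# Conjugate posterior for the AR(1) coefficient phi in
#   y_t = phi * y_{t-1} + eps_t,  eps_t ~ N(0, sigma^2),
# with known sigma and prior phi ~ N(mu0, tau0^2): the posterior is Gaussian
# with precision 1/tau0^2 + sum(x^2)/sigma^2.

def ar1_posterior(y, mu0=0.0, tau0=1.0, sigma=1.0):
    x = y[:-1]                       # lagged regressor
    z = y[1:]                        # response
    prec = 1.0 / tau0**2 + sum(xi * xi for xi in x) / sigma**2
    mean = (mu0 / tau0**2 + sum(xi * zi for xi, zi in zip(x, z)) / sigma**2) / prec
    return mean, prec ** -0.5        # posterior mean and standard deviation

# Simulate a series with phi = 0.7 and recover it.
rng = random.Random(42)
phi_true, y = 0.7, [0.0]
for _ in range(2000):
    y.append(phi_true * y[-1] + rng.gauss(0.0, 1.0))
m, s = ar1_posterior(y)
```

Because the posterior is Gaussian in closed form, point estimates (the mean) and interval estimates (mean ± z·sd) follow immediately, which is exactly the convenience the chapter exploits before moving to models that require simulation.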



2016 ◽  
Vol 28 (8) ◽  
pp. 1503-1526 ◽  
Author(s):  
Yanping Huang ◽  
Rajesh P. N. Rao

Motivated by the growing evidence for Bayesian computation in the brain, we show how a two-layer recurrent network of Poisson neurons can perform both approximate Bayesian inference and learning for any hidden Markov model. The lower-layer sensory neurons receive noisy measurements of hidden world states. The higher-layer neurons infer a posterior distribution over world states via Bayesian inference from inputs generated by sensory neurons. We demonstrate how such a neuronal network with synaptic plasticity can implement a form of Bayesian inference similar to Monte Carlo methods such as particle filtering. Each spike in a higher-layer neuron represents a sample of a particular hidden world state. The spiking activity across the neural population approximates the posterior distribution over hidden states. In this model, variability in spiking is regarded not as a nuisance but as an integral feature that provides the variability necessary for sampling during inference. We demonstrate how the network can learn the likelihood model, as well as the transition probabilities underlying the dynamics, using a Hebbian learning rule. We present results illustrating the ability of the network to perform inference and learning for arbitrary hidden Markov models.
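The sampling interpretation above, each spike representing one sample of a hidden state, is the same principle as a bootstrap particle filter for a discrete hidden Markov model. The toy below is my own illustrative analogue, not the authors' network model: each particle plays the role a spike plays in the paper.

```python
import random

# Bootstrap particle filter for a 2-state HMM with binary observations.
# Each particle is one sample of the hidden state; the empirical particle
# distribution approximates the filtering posterior, just as the population
# spiking activity does in the sampling-based neural interpretation.

T = [[0.9, 0.1], [0.2, 0.8]]   # transition probabilities T[s][s']
E = [[0.8, 0.2], [0.3, 0.7]]   # emission probabilities  E[s][obs]

def particle_filter(obs, n=5000, seed=0):
    rng = random.Random(seed)
    particles = [rng.randrange(2) for _ in range(n)]       # uniform init
    for o in obs:
        # Propagate each particle through the transition model.
        particles = [rng.choices([0, 1], weights=T[s])[0] for s in particles]
        # Weight by the observation likelihood and resample.
        w = [E[s][o] for s in particles]
        particles = rng.choices(particles, weights=w, k=n)
    return sum(1 for s in particles if s == 1) / n         # P(state = 1 | obs)

p1 = particle_filter([1, 1, 1])   # posterior after three "1" observations
```

Running the exact forward algorithm on the same model gives a filtering probability near 0.89 after three consecutive "1" observations, and the particle estimate concentrates around that value as the number of particles grows.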

