Properties of Markov Chain Monte Carlo Performance across Many Empirical Alignments

Molecular Biology and Evolution ◽

10.1093/molbev/msaa295 ◽

2020 ◽

Author(s):

Sean M Harrington ◽

Van Wishingrad ◽

Robert C Thomson

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Posterior Distribution ◽

Common Knowledge ◽

Simulated Data ◽

Mcmc Methods ◽

Rate Variation ◽

Data Sets ◽

Mcmc Convergence

Nearly all current Bayesian phylogenetic applications rely on Markov chain Monte Carlo (MCMC) methods to approximate the posterior distribution for trees and other parameters of the model. These approximations are only reliable if Markov chains adequately converge and sample from the joint posterior distribution. Although several studies of phylogenetic MCMC convergence exist, these have focused on simulated data sets or select empirical examples. Therefore, much that is considered common knowledge about MCMC in empirical systems derives from a relatively small family of analyses under ideal conditions. To address this, we present an overview of commonly applied phylogenetic MCMC diagnostics and an assessment of patterns of these diagnostics across more than 18,000 empirical analyses. Many analyses appeared to perform well, and failures in convergence were most likely to be detected using the average standard deviation of split frequencies, a diagnostic that compares topologies among independent chains. Different diagnostics yielded different information about failed convergence, demonstrating that multiple diagnostics must be employed to reliably detect problems. The number of taxa and average branch lengths in analyses have clear impacts on MCMC performance, with more taxa and shorter branches leading to more difficult convergence. We show that the usage of models that include both Γ-distributed among-site rate variation and a proportion of invariable sites is not broadly problematic for MCMC convergence but is also unnecessary. Changes to heating and the usage of model-averaged substitution models can both offer improved convergence in some cases, but neither is a panacea.
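
The split-frequency diagnostic named in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `asdsf`, the dictionary input format, the `min_freq` cutoff of 0.1, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import pstdev


def asdsf(split_freqs_per_chain, min_freq=0.1):
    """Average standard deviation of split frequencies across chains.

    split_freqs_per_chain: list of dicts, one per independent chain, mapping
    a split label (any hashable value) to its sampled frequency in that chain.
    A split missing from a chain is treated as having frequency 0.
    min_freq: splits whose frequency stays below this value in every chain
    are ignored (an assumed cutoff, chosen here for illustration).
    """
    # Collect every split seen in at least one chain.
    all_splits = set().union(*split_freqs_per_chain)
    deviations = []
    for split in all_splits:
        freqs = [chain.get(split, 0.0) for chain in split_freqs_per_chain]
        if max(freqs) < min_freq:
            continue  # rare in all chains, so it carries little information
        # Population standard deviation is an assumption of this sketch.
        deviations.append(pstdev(freqs))
    return sum(deviations) / len(deviations) if deviations else 0.0


# Two hypothetical chains that disagree moderately about two splits.
chain_a = {"AB|CD": 0.9, "AC|BD": 0.1}
chain_b = {"AB|CD": 0.7, "AC|BD": 0.3}
value = asdsf([chain_a, chain_b])
```

Values near zero indicate that the independent chains sample splits at similar frequencies; larger values suggest the chains have not converged on the same distribution of topologies.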

Markov Chain Monte Carlo Methods for State-Space Models with Point Process Observations

Neural Computation ◽

10.1162/neco_a_00281 ◽

2012 ◽

Vol 24 (6) ◽

pp. 1462-1486 ◽

Cited By ~ 10

Author(s):

Ke Yuan ◽

Mark Girolami ◽

Mahesan Niranjan

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

State Space ◽

Point Process ◽

Large Data ◽

State Space Models ◽

Superior Performance ◽

Mcmc Methods ◽

Data Sets

This letter considers how a number of modern Markov chain Monte Carlo (MCMC) methods can be applied for parameter estimation and inference in state-space models with point process observations. We quantified the efficiencies of these MCMC methods on synthetic data, and our results suggest that the Riemannian manifold Hamiltonian Monte Carlo method offers the best performance. We further compared such a method with a previously tested variational Bayes method on two experimental data sets. Results indicate similar performance on the large data sets and superior performance on small ones. The work offers an extensive suite of MCMC algorithms evaluated on an important class of models for physiological signal analysis.

Efficient Markov Chain Monte Carlo Methods for Decoding Neural Spike Trains

Neural Computation ◽

10.1162/neco_a_00059 ◽

2011 ◽

Vol 23 (1) ◽

pp. 46-96 ◽

Cited By ~ 30

Author(s):

Yashar Ahmadian ◽

Jonathan W. Pillow ◽

Liam Paninski

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Mutual Information ◽

Posterior Distribution ◽

Spike Trains ◽

Mcmc Methods ◽

Average Error ◽

Model Parameters ◽

Wide Range

Stimulus reconstruction or decoding methods provide an important tool for understanding how sensory and motor information is represented in neural activity. We discuss Bayesian decoding methods based on an encoding generalized linear model (GLM) that accurately describes how stimuli are transformed into the spike trains of a group of neurons. The form of the GLM likelihood ensures that the posterior distribution over the stimuli that caused an observed set of spike trains is log concave so long as the prior is. This allows the maximum a posteriori (MAP) stimulus estimate to be obtained using efficient optimization algorithms. Unfortunately, the MAP estimate can have a relatively large average error when the posterior is highly nongaussian. Here we compare several Markov chain Monte Carlo (MCMC) algorithms that allow for the calculation of general Bayesian estimators involving posterior expectations (conditional on model parameters). An efficient version of the hybrid Monte Carlo (HMC) algorithm was significantly superior to other MCMC methods for gaussian priors. When the prior distribution has sharp edges and corners, on the other hand, the “hit-and-run” algorithm performed better than other MCMC methods. Using these algorithms, we show that for this latter class of priors, the posterior mean estimate can have a considerably lower average error than MAP, whereas for gaussian priors, the two estimators have roughly equal efficiency. We also address the application of MCMC methods for extracting nonmarginal properties of the posterior distribution. For example, by using MCMC to calculate the mutual information between the stimulus and response, we verify the validity of a computationally efficient Laplace approximation to this quantity for gaussian priors in a wide range of model parameters; this makes direct model-based computation of the mutual information tractable even in the case of large observed neural populations, where methods based on binning the spike train fail. Finally, we consider the effect of uncertainty in the GLM parameters on the posterior estimators.

Mapping-Linked Quantitative Trait Loci Using Bayesian Analysis and Markov Chain Monte Carlo Algorithms

Genetics ◽

10.1093/genetics/146.2.735 ◽

1997 ◽

Vol 146 (2) ◽

pp. 735-743 ◽

Cited By ~ 1

Author(s):

Pekka Uimari ◽

Ina Hoeschele

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Quantitative Trait Loci ◽

Quantitative Trait ◽

Allele Frequencies ◽

Mcmc Methods ◽

Substitution Effects ◽

Indicator Variable ◽

Trait Loci

A Bayesian method for mapping linked quantitative trait loci (QTL) using multiple linked genetic markers is presented. Parameter estimation and hypothesis testing were implemented via Markov chain Monte Carlo (MCMC) algorithms. Parameters included were allele frequencies and substitution effects for two biallelic QTL, map positions of the QTL and markers, allele frequencies of the markers, and polygenic and residual variances. Missing data were polygenic effects and multi-locus marker-QTL genotypes. Three different MCMC schemes for testing the presence of a single or two linked QTL on the chromosome were compared. The first approach includes a model indicator variable representing two unlinked QTL affecting the trait, one linked and one unlinked QTL, or both QTL linked with the markers. The second approach incorporates an indicator variable for each QTL into the model for phenotype, allowing or not allowing for a substitution effect of a QTL on phenotype, and the third approach is based on model determination by reversible jump MCMC. Methods were evaluated empirically by analyzing simulated granddaughter designs. All methods correctly identified a second, linked QTL and did not reject the one-QTL model when there was only a single QTL and no additional or an unlinked QTL.

Markov Chain Monte Carlo

Bayesian Models ◽

10.23943/princeton/9780691159287.003.0007 ◽

2015 ◽

Author(s):

N. Thompson Hobbs ◽

Mevin B. Hooten

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Bayesian Analysis ◽

Posterior Distribution ◽

Mcmc Algorithm ◽

Bayesian Analyses ◽

Seminal Paper ◽

Formal Treatment ◽

High Level

This chapter explains how to implement Bayesian analyses using the Markov chain Monte Carlo (MCMC) algorithm, a set of methods for Bayesian analysis made popular by the seminal paper of Gelfand and Smith (1990). It begins with an explanation of MCMC with a heuristic, high-level treatment of the algorithm, describing its operation in simple terms with a minimum of formalism. In this first part, the chapter explains the algorithm so that all readers can gain an intuitive understanding of how to find the posterior distribution by sampling from it. Next, the chapter offers a somewhat more formal treatment of how MCMC is implemented mathematically. Finally, this chapter discusses implementation of Bayesian models via two routes—by using software and by writing one's own algorithm.
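
As an illustration of what "writing one's own algorithm" can look like, here is a minimal random-walk Metropolis sampler. It is a sketch under stated assumptions, not the chapter's own code: the function names `metropolis` and `log_std_normal`, the Gaussian proposal, the step size, and the standard normal target are all chosen here for illustration.

```python
import math
import random


def metropolis(log_post, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_post: function returning the log of the unnormalized posterior density.
    x0: starting value of the chain.
    step: standard deviation of the Gaussian proposal (a tuning assumption).
    """
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        lp_prop = log_post(proposal)
        # Accept with probability min(1, posterior ratio); the proposal is
        # symmetric, so no proposal-density correction is needed.
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = proposal, lp_prop
        samples.append(x)  # a rejected proposal repeats the current value
    return samples


def log_std_normal(x):
    """Log density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


# Hypothetical usage: sample a standard normal, then discard a burn-in.
draws = metropolis(log_std_normal, x0=0.0, n_samples=20000)
kept = draws[2000:]
posterior_mean = sum(kept) / len(kept)
```

The retained draws approximate the posterior distribution, so summaries such as the mean or quantiles are computed directly from them.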

Markov Chain Monte Carlo (MCMC) Methods

Dictionary of Statistics & Methodology ◽

10.4135/9781412983907.n1129 ◽

2015 ◽

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Mcmc Methods

Signal processing special issue on Markov Chain Monte Carlo (MCMC) methods for signal processing

Signal Processing ◽

10.1016/s0165-1684(00)00186-9 ◽

2001 ◽

Vol 81 (1) ◽

pp. 1-2 ◽

Cited By ~ 1

Author(s):

Jean-Yves Tourneret ◽

Olivier Cappé

Keyword(s):

Signal Processing ◽

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Mcmc Methods ◽

Special Issue

Markov chain Monte Carlo inference for Markov jump processes via the linear noise approximation

Philosophical Transactions of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rsta.2011.0541 ◽

2013 ◽

Vol 371 (1984) ◽

pp. 20110541 ◽

Cited By ~ 22

Author(s):

Vassilios Stathopoulos ◽

Mark A. Girolami

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Jump Processes ◽

Mcmc Methods ◽

Computationally Efficient ◽

Markov Jump ◽

Riemann Manifold ◽

Exact Inference ◽

Markov Jump Processes

Bayesian analysis for Markov jump processes (MJPs) is a non-trivial and challenging problem. Although exact inference is theoretically possible, it is computationally demanding, so its applicability is limited to a small class of problems. In this paper, we describe the application of Riemann manifold Markov chain Monte Carlo (MCMC) methods using an approximation to the likelihood of the MJP that is valid when the system modelled is near its thermodynamic limit. The proposed approach is both statistically and computationally efficient, and the convergence rate and mixing of the chains allow for fast MCMC inference. The methodology is evaluated using numerical simulations on two problems from chemical kinetics and one from systems biology.

Measuring Stellar Radial Velocity using Markov Chain Monte Carlo (MCMC) Method

Proceedings of the International Astronomical Union ◽

10.1017/s1743921313007060 ◽

2013 ◽

Vol 9 (S298) ◽

pp. 441-441

Author(s):

Yihan Song ◽

Ali Luo ◽

Yongheng Zhao

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Probability Distribution ◽

Radial Velocity ◽

Mcmc Methods ◽

Mcmc Method ◽

Stellar Spectra ◽

Mcmc Simulation ◽

Template Fitting

Stellar radial velocity (RV) is estimated by using template fitting and Markov chain Monte Carlo (MCMC) methods. This method works on the LAMOST stellar spectra. The MCMC simulation generates a probability distribution of the RV. The RV error can also be computed from the distribution.

Modelling heterotachy in phylogenetic inference by reversible-jump Markov chain Monte Carlo

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2008.0178 ◽

2008 ◽

Vol 363 (1512) ◽

pp. 3955-3964 ◽

Cited By ~ 52

Author(s):

Mark Pagel ◽

Andrew Meade

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Mixture Model ◽

Branch Length ◽

Monte Carlo Algorithm ◽

Morphological Data ◽

Reversible Jump ◽

Rate Variation ◽

Ancestral State

The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon—known as heterotachy—can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our ‘pattern-heterogeneity’ mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of ‘significance’ such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
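
The statement that the method sums the likelihood at each site over more than one set of branch lengths can be written out as a mixture likelihood. This is a sketch of the general form only: the symbols are assumptions introduced here, with $D_i$ the data at site $i$, $T$ the shared topology, $\mathbf{b}_k$ the $k$-th branch-length set, $w_k$ its mixture weight, and $\theta$ the remaining substitution-model parameters. The abstract does not give the authors' notation or weighting scheme.

```latex
% Per-site likelihood as a mixture over K branch-length sets on one topology T
L_i = \sum_{k=1}^{K} w_k \, P(D_i \mid T, \mathbf{b}_k, \theta),
\qquad \sum_{k=1}^{K} w_k = 1,
\qquad
L = \prod_{i=1}^{N} L_i
```

Because each site's likelihood is a weighted sum over all $K$ branch-length sets, a site is not assigned to a single set; sites that fit a given set well simply contribute most of their likelihood through that term.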
