Analyzing mixing systems using a new generation of Bayesian tracer mixing models

Author(s):  
Brian C. Stock ◽  
Andrew L. Jackson ◽  
Eric J. Ward ◽  
Andrew C. Parnell ◽  
Donald L. Phillips ◽  
...  

The ongoing evolution of tracer mixing models has resulted in a confusing array of software tools that differ in terms of data inputs, model assumptions, and associated analytic products. Here we introduce MixSIAR, an inclusive, rich, and flexible Bayesian tracer (e.g. stable isotope) mixing model framework implemented as an open-source R package. Using MixSIAR as a foundation, we provide guidance for the implementation of mixing model analyses. We begin by outlining the practical differences between mixture data error structure formulations and relate these error structures to common mixing model study designs in ecology. Because Bayesian mixing models afford the option to specify informative priors on source proportion contributions, we outline methods for establishing prior distributions and discuss the influence of prior specification on model outputs. We also discuss the options available for source data inputs (raw data versus summary statistics) and provide guidance for combining sources. We then describe a key advantage of MixSIAR over previous mixing model software—the ability to include fixed and random effects as covariates explaining variability in mixture proportions and calculate relative support for multiple models via information criteria. We present a case study of Alligator mississippiensis diet partitioning to demonstrate the power of this approach. Finally, we conclude with a discussion of limitations to mixing model applications. Through MixSIAR, we have consolidated the disparate array of mixing model tools into a single platform, diversified the set of available parameterizations, and provided developers a platform upon which to continue improving mixing model analyses in the future.

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5096 ◽  
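The core idea in the abstract above can be sketched numerically. Below is a minimal toy (tracer values invented, not MixSIAR's actual API or data): a mixture's tracer signature is modeled as a proportion-weighted average of source signatures, solved first algebraically for the two-source case and then with a grid-based posterior as a crude stand-in for the MCMC machinery MixSIAR uses.

```python
import math

# Hypothetical one-tracer, two-source example (values invented for illustration).
d_src_a, d_src_b = 10.0, 2.0   # source tracer means (e.g. d15N, per mil)
d_mix = 6.0                    # observed mixture (consumer) value
sigma = 0.5                    # assumed residual error SD

# Deterministic two-source solution: d_mix = p*d_src_a + (1-p)*d_src_b
p_exact = (d_mix - d_src_b) / (d_src_a - d_src_b)

# Minimal Bayesian version: uniform prior on p, Gaussian likelihood,
# posterior evaluated on a grid.
grid = [i / 1000 for i in range(1001)]
like = [math.exp(-0.5 * ((d_mix - (p * d_src_a + (1 - p) * d_src_b)) / sigma) ** 2)
        for p in grid]
z = sum(like)
post_mean = sum(p * w for p, w in zip(grid, like)) / z

print(p_exact)    # 0.5
print(post_mean)  # 0.5 (posterior is symmetric about the exact solution here)
```

A real analysis adds what this sketch omits: multiple tracers, trophic discrimination factors, the error-structure choices and informative priors the paper discusses.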




2021 ◽  
Author(s):  
Wei Zhang ◽  
Baoqiang Xiang ◽  
Ben Kirtman ◽  
Emily Becker

One of the emerging topics in climate prediction is the so-called "signal-to-noise paradox": the signal-to-noise ratio in current model predictions is too small for the models to reproduce the realistic signal. Recent studies have suggested that, because of the paradox, seasonal-to-decadal climate may be more predictable than previously expected. To the best of our knowledge, however, no studies have focused on whether the signal-to-noise paradox exists in subseasonal predictions. The present study addresses the existence of the paradox in subseasonal predictions based on (i) coupled model simulations participating in phases 5 and 6 of the Coupled Model Intercomparison Project (CMIP5 and CMIP6, respectively), and (ii) subseasonal hindcast outputs from the Subseasonal Experiment (SubX) and the Subseasonal-to-Seasonal Prediction (S2S) projects. Of particular interest is the possible existence of the paradox in the new-generation GFDL SPEAR model, the diagnosis of which may help identify potential issues in the new forecast system and guide future model development and initialization. We investigate the paradox using two methods: the ratio of predictable components, defined as the ratio of the predictable component in the real world to the signal-to-noise ratio in models, and the persistence/dispersion characteristics estimated from a Markov model framework. The preliminary results suggest a potentially widespread occurrence of the signal-to-noise paradox in subseasonal predictions, implying room for improvement in future ensemble-based subseasonal predictions.
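The ratio-of-predictable-components (RPC) diagnostic mentioned above can be illustrated with synthetic data. The sketch below (all numbers invented; not the authors' code) estimates the real-world predictable component from the ensemble-mean skill and the model's predictable component from its signal-to-total variance ratio; an RPC above 1 is the paradox signature, and the synthetic ensemble here is deliberately built with a too-weak signal to produce it.

```python
import math
import random

random.seed(0)
n_years, n_mem = 1000, 20
signal = [math.sin(0.3 * t) for t in range(n_years)]

# Observations: the predictable signal plus unpredictable weather noise.
obs = [s + random.gauss(0, 1.0) for s in signal]
# Model: damped signal (0.3x) plus member noise -> signal too weak vs. skill.
fcst = [[0.3 * s + random.gauss(0, 1.0) for _ in range(n_mem)] for s in signal]
ens_mean = [sum(m) / n_mem for m in fcst]

def corr(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def var(x):
    m = sum(x) / len(x)
    return sum((a - m) ** 2 for a in x) / len(x)

# Real-world predictable component ~ skill of the ensemble mean against obs.
pc_obs = corr(ens_mean, obs)
# Model predictable component ~ sqrt(signal variance / total variance).
all_members = [v for m in fcst for v in m]
pc_mod = math.sqrt(var(ens_mean) / var(all_members))

rpc = pc_obs / pc_mod
print(rpc)  # above 1 for this damped-signal setup: the paradox signature
```

With a well-calibrated ensemble the two components match and RPC is near 1; values well above 1 mean the model predicts the real world better than it predicts itself.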


2019 ◽  
Vol 35 (17) ◽  
pp. 2916-2923 ◽  
Author(s):  
John C Stansfield ◽  
Kellen G Cresswell ◽  
Mikhail G Dozmorov

Abstract

Motivation: With the development of chromatin conformation capture technology and its high-throughput derivative Hi-C sequencing, studies of the three-dimensional interactome of the genome that involve multiple Hi-C datasets are becoming available. To account for the technology-driven biases unique to each dataset, there is a distinct need for methods to jointly normalize multiple Hi-C datasets. Previous attempts at removing biases from Hi-C data have made use of techniques which normalize individual Hi-C datasets, or, at best, jointly normalize two datasets.

Results: Here, we present multiHiCcompare, a cyclic loess regression-based joint normalization technique for removing biases across multiple Hi-C datasets. In contrast to other normalization techniques, it properly handles the Hi-C-specific decay of chromatin interaction frequencies with increasing distance between interacting regions. multiHiCcompare uses the general linear model framework for comparative analysis of multiple Hi-C datasets, adapted for the Hi-C-specific decay of chromatin interaction frequencies. multiHiCcompare outperforms other methods when detecting a priori known chromatin interaction differences from jointly normalized datasets. Applied to the analysis of auxin-treated versus untreated experiments, and CTCF depletion experiments, multiHiCcompare was able to recover the expected epigenetic and gene expression signatures of loss of chromatin interactions and reveal novel insights.

Availability and implementation: multiHiCcompare is freely available on GitHub and as a Bioconductor R package: https://bioconductor.org/packages/multiHiCcompare.

Supplementary information: Supplementary data are available at Bioinformatics online.
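The joint-normalization idea can be sketched with invented numbers. The toy below is not the package's algorithm: a trivial per-distance correction stands in for the cyclic loess fit of the log fold change (M) against distance that multiHiCcompare actually performs, and with replicated data a real smoother would be required. The correction is split evenly between the two datasets so neither acts as a fixed reference.

```python
import math

# Invented interaction frequencies from two Hi-C "datasets": counts decay with
# distance, and dataset 2 carries a distance-dependent multiplicative bias.
pairs = [(d, 100 / (1 + d), (100 / (1 + d)) * 2 ** (0.5 + 0.1 * d))
         for d in range(20)]

# M = log2 fold change between datasets at each distance bin.
m_by_dist = {d: math.log2(if2 / if1) for d, if1, if2 in pairs}

# "Smooth" M over distance (trivially exact here; a loess fit in practice),
# then apply half the correction to each dataset.
normalized = []
for d, if1, if2 in pairs:
    m_hat = m_by_dist[d]
    normalized.append((d, if1 * 2 ** (m_hat / 2), if2 * 2 ** (-m_hat / 2)))

# After normalization the two datasets agree at every distance.
max_resid = max(abs(math.log2(b / a)) for _, a, b in normalized)
print(max_resid)  # ~0
```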


2021 ◽  
Author(s):  
Michael Dietze ◽  
Sebastian Kreutzer ◽  
Margret C. Fuchs ◽  
Sascha Meszner

Abstract. The majority of palaeoenvironmental information is inferred from proxy data contained in accretionary sediments, called geo-archives. The validity of proxy data and of analysis workflows is usually assumed implicitly, with systematic tests and uncertainty estimates restricted to modern analogue studies or reduced-complexity case studies. However, a more generic and consistent approach to exploring the validity and variability of proxy functions would be to translate a given geo-archive into a model scenario: a "virtual twin". Here, we introduce a conceptual framework and numerical toolset that allow the definition and analysis of synthetic sediment sections. The R package sandbox describes arbitrary stratigraphically consistent deposits by depth-dependent rules and grain-specific parameters, allowing full scalability and flexibility. Virtual samples can be taken, resulting in discrete grain mixtures with well-defined parameters. These samples can then be virtually prepared and analysed, for example to test hypotheses. We illustrate the concept of sandbox, explain how a sediment section can be mapped into the model and, focusing on an exemplary field of application, explore universal geochronological research questions related to the effects of sample geometry and grain-size-specific age inheritance. We summarise further application scenarios of the model framework, relevant for, but not restricted to, the broader geochronological community.
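The "rules plus virtual sample" concept can be sketched as follows. All rule forms and function names below are invented for illustration and do not reflect sandbox's actual interface: depth-dependent rules map a depth to grain-population parameters, and a virtual sample is a discrete grain mixture drawn from the rules over a finite sampling thickness.

```python
import random

random.seed(42)

# Hypothetical depth-dependent rules (invented; not sandbox's API).
def rule_age(depth_m):
    return 2000.0 * depth_m          # deposition age in years, linear in depth

def rule_grain_size_mu(depth_m):
    return 2.0 + 0.5 * depth_m       # mean grain size in phi units

def take_virtual_sample(depth_m, n_grains=1000, thickness_m=0.02):
    """Draw a discrete grain mixture across the sampled depth interval."""
    grains = []
    for _ in range(n_grains):
        z = depth_m + random.uniform(-thickness_m / 2, thickness_m / 2)
        grains.append({
            "depth": z,
            "age": rule_age(z),
            "size_phi": random.gauss(rule_grain_size_mu(z), 0.3),
        })
    return grains

sample = take_virtual_sample(depth_m=1.5)
mean_age = sum(g["age"] for g in sample) / len(sample)
print(mean_age)  # close to 3000 years for a sample centred at 1.5 m
```

Because every grain carries its own depth and age, sample-geometry effects (e.g. age mixing across the sampled thickness) fall out of the simulation directly, which is the kind of question the paper explores.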


2018 ◽  
Author(s):  
Daniel Commenges ◽  
Chariff Alkhassim ◽  
Raphael Gottardo ◽  
Boris Hejblum ◽  
Rodolphe Thiébaut

Abstract

Motivation: Flow cytometry is a powerful technology that allows the high-throughput quantification of dozens of surface and intracellular proteins at the single-cell level. It has become the most widely used technology for immunophenotyping of cells over the past three decades. Because of the increasing complexity of cytometry experiments (more cells and more markers), traditional manual flow cytometry data analysis has become untenable due to its subjectivity and time-consuming nature.

Results: We present a new unsupervised algorithm called "cytometree" to perform automated population discovery (aka gating) in flow cytometry. cytometree is based on the construction of a binary tree, the nodes of which are subpopulations of cells. At each node, the marker distributions are modeled by mixtures of normal distributions. Node splitting is done according to a normalized difference of Akaike information criterion (AIC) values between the two models. Post-processing of the tree structure and derived populations allows us to complete the annotation of the derived populations. The algorithm is shown to perform better than the state-of-the-art unsupervised algorithms previously proposed on panels introduced by the Flow Cytometry: Critical Assessment of Population Identification Methods (FlowCAP I) project. The algorithm is also applied to a T-cell panel proposed by the Human Immunology Project Consortium (HIPC) program; it outperforms the best unsupervised open-source algorithm available while requiring the shortest computation time.

Availability: An R package named "cytometree" is available on CRAN.

Contact: [email protected]; [email protected]

Supplementary information: Supplementary data are available.
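The AIC-based splitting rule can be sketched on synthetic data. The toy below is not the package's implementation: it fits a one-component normal model analytically and a two-component mixture with a minimal EM loop, then compares their AIC values; a clearly bimodal marker favours the split.

```python
import math
import random

random.seed(1)

# Invented one-marker sample: two well-separated cell populations.
x = [random.gauss(0, 1) for _ in range(300)] + [random.gauss(6, 1) for _ in range(300)]

def norm_pdf(v, mu, sd):
    return math.exp(-0.5 * ((v - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def aic_one(data):
    n = len(data)
    mu = sum(data) / n
    sd = math.sqrt(sum((v - mu) ** 2 for v in data) / n)
    ll = sum(math.log(norm_pdf(v, mu, sd)) for v in data)
    return 2 * 2 - 2 * ll          # 2 free parameters

def aic_two(data, iters=50):
    # Minimal EM for a 1-D two-component Gaussian mixture (5 free parameters).
    w, mu1, mu2, sd1, sd2 = 0.5, min(data), max(data), 1.0, 1.0
    for _ in range(iters):
        r = [w * norm_pdf(v, mu2, sd2) /
             (w * norm_pdf(v, mu2, sd2) + (1 - w) * norm_pdf(v, mu1, sd1))
             for v in data]
        n2 = sum(r); n1 = len(data) - n2
        mu1 = sum((1 - ri) * v for ri, v in zip(r, data)) / n1
        mu2 = sum(ri * v for ri, v in zip(r, data)) / n2
        sd1 = math.sqrt(sum((1 - ri) * (v - mu1) ** 2 for ri, v in zip(r, data)) / n1)
        sd2 = math.sqrt(sum(ri * (v - mu2) ** 2 for ri, v in zip(r, data)) / n2)
        w = n2 / len(data)
    ll = sum(math.log((1 - w) * norm_pdf(v, mu1, sd1) + w * norm_pdf(v, mu2, sd2))
             for v in data)
    return 2 * 5 - 2 * ll

a1, a2 = aic_one(x), aic_two(x)
d = (a1 - a2) / a1   # normalized AIC difference; positive favours splitting
print(d > 0)  # True: two components fit this bimodal marker better
```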


Author(s):  
Fabian Föll ◽  
Valerie Gerber ◽  
Claus-Dieter Munz ◽  
Bernhard Weigand ◽  
Grazia Lamanna

Abstract: Mixing characteristics of supercritical injection studies were analyzed with regard to the necessity of including diffusive fluxes. To this end, speed-of-sound data from mixing jets were investigated using an adiabatic mixing model and compared to an analytic solution. In this work, we show that the generalized application of the adiabatic mixing model may become inappropriate for subsonic submerged jets at high-pressure conditions. Two cases are discussed in which thermally driven and concentration-driven fluxes are seen to have a significant influence. The extent to which the adiabatic mixing model is valid depends on the relative importance of local diffusive fluxes, namely Fourier, Fick, and Dufour diffusion, which is influenced, inter alia, by different time and length scales. The experimental data from a high-pressure n-hexane/nitrogen jet injection were investigated numerically. Finally, based on recent numerical findings, the plausibility of different thermodynamic mixing models for binary mixtures under high-pressure conditions is analyzed.
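A back-of-envelope illustration of the adiabatic mixing assumption (all property values invented for illustration, and ideal-gas constant-cp behaviour assumed, unlike the real high-pressure case): mixing of two streams is treated as enthalpy-conserving with no heat exchange, so the mixed temperature follows from a simple enthalpy balance. The paper's point is precisely that this closure can break down when Fourier, Fick, and Dufour fluxes matter.

```python
# Hypothetical inlet streams (mass flow kg/s, cp J/(kg K), temperature K).
m_n2, cp_n2, t_n2 = 1.0, 1040.0, 298.0     # nitrogen
m_hex, cp_hex, t_hex = 0.2, 1660.0, 480.0  # n-hexane vapour

# Adiabatic (enthalpy-conserving) mixing: sum(m*cp*T) = (sum(m*cp)) * T_mix
t_mix = (m_n2 * cp_n2 * t_n2 + m_hex * cp_hex * t_hex) / (m_n2 * cp_n2 + m_hex * cp_hex)
print(round(t_mix, 1))  # 342.0 K, between the two inlet temperatures
```

Near the critical point, real-fluid enthalpies and cross-diffusion effects replace the constant-cp terms above, which is why the adiabatic model's validity has to be checked case by case.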


2018 ◽  
Author(s):  
Zhe Sun ◽  
Li Chen ◽  
Hongyi Xin ◽  
Qianhui Huang ◽  
Anthony R Cillo ◽  
...  

Abstract: The recently developed droplet-based single-cell transcriptome sequencing (scRNA-seq) technology makes it feasible to perform population-scale scRNA-seq studies, in which the transcriptome is measured for tens of thousands of single cells from multiple individuals. Despite the advances of many clustering methods, there are few tailored methods for population-scale scRNA-seq studies. Here, we have developed a BAyesian Mixture Model for Single Cell sequencing (BAMM-SC) method to cluster scRNA-seq data from multiple individuals simultaneously. Specifically, BAMM-SC takes raw data as input and can account for data heterogeneity and batch effects among multiple individuals in a unified Bayesian hierarchical model framework. Results from extensive simulations and application of BAMM-SC to in-house scRNA-seq datasets using blood, lung, and skin cells from humans or mice demonstrated that BAMM-SC outperformed existing clustering methods, with improved clustering accuracy and reduced impact from batch effects. BAMM-SC has been implemented in a user-friendly R package with a detailed tutorial available on www.pitt.edu/~Cwec47/singlecell.html.
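A flavour of the count-based likelihood such a model builds on can be shown with a toy Dirichlet-multinomial score (all parameters invented; this is not BAMM-SC's inference, which is a full hierarchical MCMC): a cell's raw count vector is scored against cluster-specific Dirichlet parameters and assigned to the best-scoring cluster.

```python
from math import lgamma

def log_dirichlet_multinomial(counts, alpha):
    """log P(counts | alpha), up to the multinomial coefficient, which is
    constant across clusters and so irrelevant for the argmax."""
    n = sum(counts)
    a0 = sum(alpha)
    ll = lgamma(a0) - lgamma(n + a0)
    for c, a in zip(counts, alpha):
        ll += lgamma(c + a) - lgamma(a)
    return ll

cell = [30, 5, 1]                   # raw counts of one cell over three genes
cluster_alpha = {                   # hypothetical cluster-specific parameters
    "cluster_1": [10.0, 2.0, 1.0],  # favours gene 1
    "cluster_2": [1.0, 2.0, 10.0],  # favours gene 3
}
best = max(cluster_alpha,
           key=lambda k: log_dirichlet_multinomial(cell, cluster_alpha[k]))
print(best)  # cluster_1
```

The Dirichlet layer is what lets over-dispersion and individual-level heterogeneity be absorbed into the parameters rather than distorting cluster assignments.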

