A flexible, efficient binomial mixed model for identifying differential DNA methylation in bisulfite sequencing data

10.1101/019562 ◽

2015 ◽

Author(s):

Amanda J Lea ◽

Jenny Tung ◽

Xiang Zhou

Keyword(s):

DNA Methylation ◽

Count Data ◽

Mixed Model ◽

Bisulfite Sequencing ◽

Structured Populations ◽

P Value ◽

Sequencing Data ◽

Data Set ◽

Age Related ◽

Bisulfite Sequencing Data

Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated both by dramatic variation in coverage across sites and individual samples and by the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to account simultaneously for the over-dispersed, count-based nature of bisulfite sequencing data and for genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html.
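
To make the idea of a binomial mixed model for count data concrete, the following is a minimal simulation sketch, not the MACAU algorithm itself. It assumes a logit link, a single predictor, and a random effect whose covariance is split between a relatedness matrix K and independent noise; the function names, parameter names (mu, beta, sigma2, h2), and default values are illustrative assumptions and do not come from the abstract.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate_site(total_reads, x, K, beta=0.5, mu=0.0, sigma2=1.0, h2=0.5, rng=None):
    """Simulate methylated read counts at one site (hypothetical parameterization).

    total_reads: read depth per individual; x: predictor (e.g. scaled age);
    K: relatedness matrix; h2: assumed share of random-effect variance tied to K.
    """
    rng = rng or random.Random()
    n = len(x)
    # Random-effect covariance: a part structured by K plus independent noise.
    cov = [[sigma2 * (h2 * K[i][j] + (1.0 - h2) * (1.0 if i == j else 0.0))
            for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Correlated random effects u = L z, so that Cov(u) = L L^T = cov.
    u = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    # Logit link: methylation proportion per individual.
    p = [1.0 / (1.0 + math.exp(-(mu + beta * x[i] + u[i]))) for i in range(n)]
    # Binomial sampling: each read is methylated with probability p[i].
    meth = [sum(rng.random() < p[i] for _ in range(total_reads[i])) for i in range(n)]
    return meth, p


# Usage: two pairs of related individuals with very different coverage.
K_example = [[1.0, 0.5, 0.0, 0.0],
             [0.5, 1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.5],
             [0.0, 0.0, 0.5, 1.0]]
counts, props = simulate_site([5, 20, 40, 80], [0.1, 0.4, 0.6, 0.9],
                              K_example, rng=random.Random(1))
print(list(zip(counts, [5, 20, 40, 80])))
```

Keeping the raw counts (methylated reads, total reads) rather than converting them to proportions preserves the information about coverage: under this sketch, an individual with 5 reads contributes a much noisier observation than one with 80.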

A Flexible, Efficient Binomial Mixed Model for Identifying Differential DNA Methylation in Bisulfite Sequencing Data

PLoS Genetics ◽

10.1371/journal.pgen.1005650 ◽

2015 ◽

Vol 11 (11) ◽

pp. e1005650 ◽

Cited By ~ 54

Author(s):

Amanda J. Lea ◽

Jenny Tung ◽

Xiang Zhou

Keyword(s):

DNA Methylation ◽

Mixed Model ◽

Bisulfite Sequencing ◽

Sequencing Data ◽

Bisulfite Sequencing Data

LuxUS: DNA Methylation Analysis Using Generalized Linear Mixed Model with Spatial Correlation

10.1101/536722 ◽

2019 ◽

Cited By ~ 2

Author(s):

Viivi Halla-aho ◽

Harri Lähdesmäki

Keyword(s):

DNA Methylation ◽

Spatial Correlation ◽

Mixed Model ◽

Bisulfite Sequencing ◽

Linear Mixed Model ◽

Generalized Linear Mixed Model ◽

Statistical Testing ◽

Supplementary Information ◽

Sequencing Data ◽

Bisulfite Sequencing Data

Motivation: DNA methylation is an important epigenetic modification with multiple functions, and its connections to disease have been studied extensively in recent years. DNA methylation levels of neighboring cytosines are known to be correlated, and differential DNA methylation typically occurs across regions rather than at individual cytosines. Results: We have developed a generalized linear mixed model, LuxUS, that makes use of the correlation between neighboring cytosines to facilitate analysis of differential methylation. LuxUS implements a likelihood model for bisulfite sequencing data that accounts for experimental variation in the underlying biochemistry. LuxUS can model both binary and continuous covariates, and the mixed model formulation allows replicate and cytosine random effects to be included. Spatial correlation is incorporated into the model through a cytosine random effect correlation structure. Simulation experiments show that utilizing the spatial correlation increases the power of statistical testing for differential DNA methylation. Results on a real bisulfite sequencing data set show that LuxUS is able to detect biologically significant differentially methylated cytosines. Availability: The tool is available at https://github.com/hallav/LuxUS. Supplementary information: Supplementary data are available at bioRxiv.

A Bayesian Approach for Analysis of Whole-Genome Bisulfite Sequencing Data Identifies Disease-Associated Changes in DNA Methylation

Genetics ◽

10.1534/genetics.116.195008 ◽

2017 ◽

Vol 205 (4) ◽

pp. 1443-1458 ◽

Cited By ~ 10

Author(s):

Owen J. L. Rackham ◽

Sarah R. Langley ◽

Thomas Oates ◽

Eleni Vradi ◽

Nathan Harmston ◽

...

Keyword(s):

DNA Methylation ◽

Bayesian Approach ◽

Bisulfite Sequencing ◽

Whole Genome ◽

Sequencing Data ◽

Whole Genome Bisulfite Sequencing ◽

Genome Bisulfite Sequencing ◽

Bisulfite Sequencing Data

DNA methylation analysis using bisulfite sequencing data

Computational Genomics with R ◽

10.1201/9780429084317-10 ◽

2020 ◽

pp. 367-392

Author(s):

Altuna Akalin

Keyword(s):

DNA Methylation ◽

Bisulfite Sequencing ◽

Sequencing Data ◽

Methylation Analysis ◽

Bisulfite Sequencing Data ◽

DNA Methylation Analysis

Focal disruption of DNA methylation dynamics at enhancers in IDH-mutant AML cells

Leukemia ◽

10.1038/s41375-021-01476-y ◽

2021 ◽

Author(s):

Elisabeth R. Wilson ◽

Nichole M. Helton ◽

Sharon E. Heath ◽

Robert S. Fulton ◽

Jacqueline E. Payton ◽

...

Keyword(s):

DNA Methylation ◽

Myeloid Leukemia ◽

Bisulfite Sequencing ◽

Sequencing Data ◽

Genome Wide ◽

CD34 Cells ◽

Recurrent Mutations ◽

Genome Bisulfite Sequencing ◽

Bisulfite Sequencing Data ◽

Acute Myeloid

Recurrent mutations in IDH1 or IDH2 in acute myeloid leukemia (AML) are associated with increased DNA methylation, but the genome-wide patterns of this hypermethylation phenotype have not been comprehensively studied in AML samples. We analyzed whole-genome bisulfite sequencing data from 15 primary AML samples with IDH1 or IDH2 mutations, which identified ~4000 focal regions that were uniquely hypermethylated in IDHmut samples vs. normal CD34+ cells and other AMLs. These regions had modest hypermethylation in AMLs with biallelic TET2 mutations, and levels of 5-hydroxymethylation that were diminished in IDH and TET-mutant samples, indicating that this hypermethylation results from inhibition of TET-mediated demethylation. Focal hypermethylation in IDHmut AMLs occurred at regions with low methylation in CD34+ cells, implying that DNA methylation and demethylation are active at these loci. AML samples containing IDH and DNMT3A R882 mutations were significantly less hypermethylated, suggesting that IDHmut-associated hypermethylation is mediated by DNMT3A. IDHmut-specific hypermethylation was highly enriched for enhancers that form direct interactions with genes involved in normal hematopoiesis and AML, including MYC and ETV6. These results suggest that focal hypermethylation in IDH-mutant AML occurs by altering the balance between DNA methylation and demethylation, and that disruption of these pathways at enhancers may contribute to AML pathogenesis.

Development of a method for identifying and functionally analyzing allele-specific DNA methylation based on BS-seq data

Epigenomics ◽

10.2217/epi-2019-0023 ◽

2019 ◽

Vol 11 (15) ◽

pp. 1679-1692

Author(s):

Jiang Zhu ◽

Mu Su ◽

Yue Gu ◽

Xingda Zhang ◽

Wenhua Lv ◽

...

Keyword(s):

DNA Methylation ◽

High Throughput ◽

Bisulfite Sequencing ◽

The Cancer Genome Atlas ◽

Nucleotide Polymorphisms ◽

Sequencing Data ◽

Genome Bisulfite Sequencing ◽

Bisulfite Sequencing Data ◽

Allele Specific ◽

New Perspective

Aim: To comprehensively identify allele-specific DNA methylation (ASM) at the genome-wide level. Methods: Here, we propose a new method, called GeneASM, to identify ASM using high-throughput bisulfite sequencing data in the absence of haplotype information. Results: A total of 2194 allele-specific DNA methylated genes were identified in the GM12878 lymphocyte lineage using GeneASM. These genes are mainly enriched in cell cytoplasm function, subcellular component movement or cellular linkages. GM12878 methylated DNA immunoprecipitation sequencing and methylation-sensitive restriction enzyme sequencing data were used to evaluate ASM. The relationship between ASM and disease was further analyzed using The Cancer Genome Atlas (TCGA) data for lung adenocarcinoma (LUAD) and whole genome bisulfite sequencing data. Conclusion: GeneASM, which recognizes ASM by high-throughput bisulfite sequencing and heterozygous single-nucleotide polymorphisms, provides a new perspective for studying genomic imprinting.

BSPAT: a fast online tool for DNA methylation co-occurrence pattern analysis based on high-throughput bisulfite sequencing data

BMC Bioinformatics ◽

10.1186/s12859-015-0649-2 ◽

2015 ◽

Vol 16 (1) ◽

Cited By ~ 15

Author(s):

Ke Hu ◽

Angela H. Ting ◽

Jing Li

Keyword(s):

DNA Methylation ◽

High Throughput ◽

Bisulfite Sequencing ◽

Pattern Analysis ◽

Sequencing Data ◽

Online Tool ◽

Bisulfite Sequencing Data ◽

Occurrence Pattern

Bisulfite sequencing Data Presentation and Compilation (BDPC) web server--a useful tool for DNA methylation analysis

Nucleic Acids Research ◽

10.1093/nar/gkn083 ◽

2008 ◽

Vol 36 (5) ◽

pp. e34-e34 ◽

Cited By ~ 44

Author(s):

C. Rohde ◽

Y. Zhang ◽

T. P. Jurkowski ◽

H. Stamerjohanns ◽

R. Reinhardt ◽

...

Keyword(s):

DNA Methylation ◽

Bisulfite Sequencing ◽

Web Server ◽

Sequencing Data ◽

Methylation Analysis ◽

Data Presentation ◽

Bisulfite Sequencing Data ◽

DNA Methylation Analysis

Maximizing ecological and evolutionary insight from bisulfite sequencing data sets

10.1101/091488 ◽

2016 ◽

Author(s):

Amanda J. Lea ◽

Tauras P. Vilgalys ◽

Paul A.P. Durst ◽

Jenny Tung

Keyword(s):

DNA Methylation ◽

Population Structure ◽

Effect Size ◽

Evolutionary Biology ◽

Bisulfite Sequencing ◽

Published Data ◽

Data Sets ◽

Cell Type ◽

Sequencing Data ◽

Bisulfite Sequencing Data

The role of DNA methylation in development, divergence, and the response to environmental stimuli is of substantial interest in ecology and evolutionary biology. Measuring genome-wide DNA methylation is increasingly feasible using sodium bisulfite sequencing. Here, we analyze simulated and published data sets to demonstrate how effect size, kinship/population structure, taxonomic differences, and cell type heterogeneity influence the power to detect differential methylation in bisulfite sequencing data sets. Our results reveal that the effect sizes typical of evolutionary and ecological studies are modest, and will thus require data sets larger than those currently in common use. Additionally, our findings emphasize that statistical approaches that ignore the properties of bisulfite sequencing data (e.g., its count-based nature) or key sources of variance in natural populations (e.g., population structure or cell type heterogeneity) often produce false negatives or false positives, thus leading to incorrect biological conclusions. Finally, we provide recommendations for handling common issues that arise in bisulfite sequencing analyses and a freely available R Shiny application for simulating and performing power analyses on bisulfite sequencing data. This app, available at www.tung-lab.org/protocols-and-software.html, allows users to explore the effects of sequencing depth, sample size, population structure, and expected effect size, tailored to their own system.
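
As a rough illustration of how sequencing depth, sample size, and effect size interact in a power analysis, here is a minimal simulation sketch. It is not the R Shiny application described above: it swaps in a deliberately simple pooled two-proportion z-test on read counts, and it ignores over-dispersion and population structure, so it will tend to overstate power relative to real data. All function and parameter names are hypothetical.

```python
import math
import random


def two_proportion_p(m1, t1, m2, t2):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (m1 + m2) / (t1 + t2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / t1 + 1.0 / t2))
    if se == 0.0:
        return 1.0
    z = (m1 / t1 - m2 / t2) / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def estimate_power(n_per_group, depth, base=0.5, effect=0.1,
                   alpha=0.05, n_sims=500, seed=0):
    """Fraction of simulated data sets in which the group difference is detected.

    Assumes every individual has the same read depth and the same true
    methylation level within a group (no over-dispersion, no relatedness).
    """
    rng = random.Random(seed)
    reads = n_per_group * depth  # reads pooled across individuals in a group

    def methylated_reads(p):
        return sum(rng.random() < p for _ in range(reads))

    hits = 0
    for _ in range(n_sims):
        m1 = methylated_reads(base)
        m2 = methylated_reads(base + effect)
        if two_proportion_p(m1, reads, m2, reads) < alpha:
            hits += 1
    return hits / n_sims


# Usage: compare a small, shallow design with a larger, deeper one.
for n, d in [(5, 5), (20, 10), (50, 10)]:
    print(n, d, estimate_power(n, d, effect=0.1))
```

Running the loop with a modest effect size shows the qualitative point made in the abstract: small designs rarely detect small differences, and power rises as either the number of individuals or the depth per individual grows.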

LuxRep: a technical replicate-aware method for bisulfite sequencing data analysis

10.1101/444711 ◽

2018 ◽

Cited By ~ 1

Author(s):

Maia Malonzo ◽

Viivi Halla-aho ◽

Mikko Konki ◽

Riikka J. Lund ◽

Harri Lähdesmäki

Keyword(s):

DNA Methylation ◽

Data Analysis ◽

Bisulfite Sequencing ◽

Bisulfite Conversion ◽

Sequencing Data ◽

DNA Libraries ◽

Conversion Rates ◽

Bisulfite Sequencing Data ◽

Low Efficiency ◽

Sequencing Data Analysis

DNA methylation is measured using bisulfite sequencing (BS-seq). Bisulfite conversion can have low efficiency, in which case a DNA sample is processed multiple times, generating DNA libraries with different bisulfite conversion rates. Libraries with low conversion rates are excluded from analysis, resulting in reduced coverage and increased costs. We present a method and software, LuxRep, that accounts for technical replicates from different bisulfite-converted DNA libraries. We show that including replicates with low bisulfite conversion rates generates more accurate estimates of methylation levels and differentially methylated sites. Availability: An implementation of the method is available at https://github.com/tare/LuxGLM/tree/master/[email protected]