A flexible, efficient binomial mixed model for identifying differential DNA methylation in bisulfite sequencing data

2015
Author(s):
Amanda J Lea
Jenny Tung
Xiang Zhou

Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated by dramatic variation in coverage across sites and individual samples, and by the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to simultaneously account for both the over-dispersed, count-based nature of bisulfite sequencing data and genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html.
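To make the idea of a binomial mixed model concrete, the following is a minimal simulation sketch for a single site. It assumes one common parameterization: a logit link, a genetic random effect with covariance proportional to a kinship matrix, and an independent environmental effect that produces over-dispersion. The function name, arguments, and default values are hypothetical illustrations and are not taken from the MACAU software; the paper's exact parameterization may differ.

```python
import numpy as np

def simulate_binomial_mixed_site(x, kinship, reads, beta=0.5, mu=0.0,
                                 sigma2=1.0, h2=0.5, seed=0):
    """Simulate methylated read counts at one site (illustrative only).

    Assumed model (not confirmed against the MACAU implementation):
        logit(pi_i) = mu + beta * x_i + g_i + e_i
        g ~ MVN(0, sigma2 * h2 * K)        # genetic relatedness
        e ~ N(0, sigma2 * (1 - h2))        # independent over-dispersion
        y_i ~ Binomial(r_i, pi_i)          # r_i = total reads (coverage)
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Genetic random effect, correlated among related individuals.
    g = rng.multivariate_normal(np.zeros(n), sigma2 * h2 * np.asarray(kinship))
    # Independent random effect; this is what makes the counts over-dispersed.
    e = rng.normal(0.0, np.sqrt(sigma2 * (1.0 - h2)), size=n)
    logit_pi = mu + beta * x + g + e
    pi = 1.0 / (1.0 + np.exp(-logit_pi))
    # Coverage varies per individual, so each sample has its own read total.
    methylated = rng.binomial(np.asarray(reads), pi)
    return methylated, pi
```

A call such as `simulate_binomial_mixed_site([0, 0, 1, 1], np.eye(4), [10, 20, 5, 30])` returns one methylated count per individual, each bounded by that individual's coverage.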

2019
Author(s):
Viivi Halla-aho
Harri Lähdesmäki

Motivation: DNA methylation is an important epigenetic modification with multiple functions. DNA methylation and its connections to diseases have been extensively studied in recent years. It is known that DNA methylation levels of neighboring cytosines are correlated and that differential DNA methylation typically occurs across regions rather than at individual cytosines. Results: We have developed a generalized linear mixed model, LuxUS, that makes use of the correlation between neighboring cytosines to facilitate analysis of differential methylation. LuxUS implements a likelihood model for bisulfite sequencing data that accounts for experimental variation in the underlying biochemistry. LuxUS can model both binary and continuous covariates, and the mixed model formulation enables the inclusion of replicate and cytosine random effects. Spatial correlation is included in the model through a cytosine random effect correlation structure. We show with simulation experiments that utilizing the spatial correlation increases the power of statistical testing for differential DNA methylation. Results with a real bisulfite sequencing data set show that LuxUS is able to detect biologically significant differentially methylated cytosines. Availability: The tool is available at https://github.com/hallav/LuxUS. Supplementary information: Supplementary data are available at bioRxiv.
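As an illustration of what a spatially correlated cytosine random effect can look like, the sketch below builds a covariance matrix that decays exponentially with genomic distance and draws one set of correlated effects from it. The exponential kernel, the function names, and the default variance and length scale are assumptions chosen for illustration; they are not taken from the LuxUS code.

```python
import numpy as np

def spatial_covariance(positions, variance=1.0, length_scale=100.0):
    """Covariance between cytosines that decays with genomic distance.

    Assumed form (hypothetical, for illustration):
        cov[i, j] = variance * exp(-|pos_i - pos_j| / length_scale)
    """
    positions = np.asarray(positions, dtype=float)
    distance = np.abs(positions[:, None] - positions[None, :])
    return variance * np.exp(-distance / length_scale)

def sample_cytosine_effects(positions, variance=1.0, length_scale=100.0, seed=0):
    """Draw one vector of correlated cytosine random effects."""
    rng = np.random.default_rng(seed)
    cov = spatial_covariance(positions, variance, length_scale)
    return rng.multivariate_normal(np.zeros(len(cov)), cov)
```

Under this kernel, cytosines 10 bp apart share most of their random effect, while cytosines 500 bp apart are nearly independent, which is how neighboring sites can borrow strength from one another.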


Leukemia
2021
Author(s):
Elisabeth R. Wilson
Nichole M. Helton
Sharon E. Heath
Robert S. Fulton
Jacqueline E. Payton
...

Recurrent mutations in IDH1 or IDH2 in acute myeloid leukemia (AML) are associated with increased DNA methylation, but the genome-wide patterns of this hypermethylation phenotype have not been comprehensively studied in AML samples. We analyzed whole-genome bisulfite sequencing data from 15 primary AML samples with IDH1 or IDH2 mutations, which identified ~4000 focal regions that were uniquely hypermethylated in IDHmut samples vs. normal CD34+ cells and other AMLs. These regions had modest hypermethylation in AMLs with biallelic TET2 mutations, and levels of 5-hydroxymethylation that were diminished in IDH and TET-mutant samples, indicating that this hypermethylation results from inhibition of TET-mediated demethylation. Focal hypermethylation in IDHmut AMLs occurred at regions with low methylation in CD34+ cells, implying that DNA methylation and demethylation are active at these loci. AML samples containing IDH and DNMT3A R882 mutations were significantly less hypermethylated, suggesting that IDHmut-associated hypermethylation is mediated by DNMT3A. IDHmut-specific hypermethylation was highly enriched for enhancers that form direct interactions with genes involved in normal hematopoiesis and AML, including MYC and ETV6. These results suggest that focal hypermethylation in IDH-mutant AML occurs by altering the balance between DNA methylation and demethylation, and that disruption of these pathways at enhancers may contribute to AML pathogenesis.


Epigenomics
2019
Vol 11 (15)
pp. 1679-1692
Author(s):
Jiang Zhu
Mu Su
Yue Gu
Xingda Zhang
Wenhua Lv
...

Aim: To comprehensively identify allele-specific DNA methylation (ASM) at the genome-wide level. Methods: Here, we propose a new method, called GeneASM, to identify ASM using high-throughput bisulfite sequencing data in the absence of haplotype information. Results: A total of 2194 allele-specific DNA methylated genes were identified in the GM12878 lymphocyte lineage using GeneASM. These genes are mainly enriched in cell cytoplasm function, subcellular component movement or cellular linkages. GM12878 methylated DNA immunoprecipitation sequencing and methylation-sensitive restriction enzyme sequencing data were used to evaluate ASM. The relationship between ASM and disease was further analyzed using The Cancer Genome Atlas (TCGA) data of lung adenocarcinoma (LUAD) and whole genome bisulfite sequencing data. Conclusion: GeneASM, which recognizes ASM by high-throughput bisulfite sequencing and heterozygous single-nucleotide polymorphisms, provides a new perspective for studying genomic imprinting.


2008
Vol 36 (5)
pp. e34-e34
Author(s):
C. Rohde
Y. Zhang
T. P. Jurkowski
H. Stamerjohanns
R. Reinhardt
...

2016
Author(s):
Amanda J. Lea
Tauras P. Vilgalys
Paul A.P. Durst
Jenny Tung

The role of DNA methylation in development, divergence, and the response to environmental stimuli is of substantial interest in ecology and evolutionary biology. Measuring genome-wide DNA methylation is increasingly feasible using sodium bisulfite sequencing. Here, we analyze simulated and published data sets to demonstrate how effect size, kinship/population structure, taxonomic differences, and cell type heterogeneity influence the power to detect differential methylation in bisulfite sequencing data sets. Our results reveal that the effect sizes typical of evolutionary and ecological studies are modest, and will thus require data sets larger than those currently in common use. Additionally, our findings emphasize that statistical approaches that ignore the properties of bisulfite sequencing data (e.g., its count-based nature) or key sources of variance in natural populations (e.g., population structure or cell type heterogeneity) often produce false negatives or false positives, thus leading to incorrect biological conclusions. Finally, we provide recommendations for handling common issues that arise in bisulfite sequencing analyses and a freely available R Shiny application for simulating and performing power analyses on bisulfite sequencing data. This app, available at www.tung-lab.org/protocols-and-software.html, allows users to explore the effects of sequencing depth, sample size, population structure, and expected effect size, tailored to their own system.
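A simulation-based power analysis of the kind described here can be sketched in a few lines. The example below is not the authors' R Shiny application; it is a hypothetical Python illustration that draws beta-binomial read counts for two groups at a single site and estimates power as the fraction of simulated data sets in which a permutation test on methylation ratios rejects the null. The function name, defaults, and the choice of test are all assumptions, and the test deliberately ignores population structure and coverage differences, which the abstract warns can matter.

```python
import numpy as np

def estimate_power(n_per_group=10, depth=20, base_level=0.5, effect=0.1,
                   dispersion=20.0, n_sims=200, n_perm=200, alpha=0.05, seed=0):
    """Estimate power to detect a methylation difference between two groups.

    base_level + effect must stay strictly between 0 and 1.
    """
    rng = np.random.default_rng(seed)
    labels = np.repeat([0, 1], n_per_group)
    mean = base_level + effect * labels
    hits = 0
    for _ in range(n_sims):
        # Beta-binomial draw: a per-sample methylation level from a Beta
        # distribution, then methylated reads from a Binomial at fixed depth.
        level = rng.beta(mean * dispersion, (1.0 - mean) * dispersion)
        ratio = rng.binomial(depth, level) / depth
        observed = abs(ratio[labels == 1].mean() - ratio[labels == 0].mean())
        # Permutation null: shuffle group labels and recompute the difference.
        null = np.empty(n_perm)
        for j in range(n_perm):
            perm = rng.permutation(labels)
            null[j] = abs(ratio[perm == 1].mean() - ratio[perm == 0].mean())
        p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
        hits += p_value < alpha
    return hits / n_sims
```

Calling `estimate_power` over a grid of `effect`, `n_per_group`, and `depth` values gives a rough picture of how modest effect sizes drive up the sample size needed.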


2018
Author(s):
Maia Malonzo
Viivi Halla-aho
Mikko Konki
Riikka J. Lund
Harri Lähdesmäki

DNA methylation is measured using bisulfite sequencing (BS-seq). Bisulfite conversion can have low efficiency, in which case a DNA sample is processed multiple times, generating DNA libraries with different bisulfite conversion rates. Libraries with low conversion rates are excluded from analysis, resulting in reduced coverage and increased costs. We present a method and software, LuxRep, that accounts for technical replicates from different bisulfite-converted DNA libraries. We show that including replicates with low bisulfite conversion rates generates more accurate estimates of methylation levels and differentially methylated sites. Availability: An implementation of the method is available at https://github.com/tare/LuxGLM/tree/master/[email protected]
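The reason low-conversion libraries can still be informative is easiest to see in a simplified model. The sketch below is not LuxRep's model; it assumes only that an unmethylated cytosine fails to convert with probability 1 - c, so that a read shows "C" with probability theta + (1 - theta)(1 - c), and it ignores sequencing error and over-conversion. The function name and the grid-search estimator are hypothetical. It pools replicates with different known conversion rates c into one maximum-likelihood estimate of the methylation level theta.

```python
import numpy as np

def estimate_methylation(c_counts, totals, conversion_rates, grid_size=1001):
    """Grid-search MLE of methylation level theta across replicate libraries.

    c_counts: reads showing "C" (apparently methylated) per library
    totals: total reads per library
    conversion_rates: assumed-known bisulfite conversion rate per library
    """
    c_counts = np.asarray(c_counts, dtype=float)
    totals = np.asarray(totals, dtype=float)
    conv = np.asarray(conversion_rates, dtype=float)
    theta = np.linspace(0.0, 1.0, grid_size)[:, None]
    # Probability a read shows "C": truly methylated, or unmethylated
    # but not converted (simplifying assumption).
    p_read_c = theta + (1.0 - theta) * (1.0 - conv)
    p_read_c = np.clip(p_read_c, 1e-12, 1.0 - 1e-12)
    # Binomial log-likelihood summed over libraries, for each theta on the grid.
    loglik = (c_counts * np.log(p_read_c)
              + (totals - c_counts) * np.log1p(-p_read_c)).sum(axis=1)
    return float(theta[np.argmax(loglik), 0])
```

For example, a fully converted library with 50 of 100 "C" reads and an 80%-converted library with 60 of 100 "C" reads are both consistent with theta = 0.5 under this model, so the second library adds coverage rather than bias once its conversion rate is accounted for.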

