Correcting for cell-type composition bias in epigenome-wide association studies

10.1186/gm540 ◽  
2014 ◽  
Vol 6 (3) ◽  
pp. 23 ◽  
Author(s):  
Robert Lowe ◽  
Vardhman K Rakyan
2021 ◽  
Vol 12 ◽  
Author(s):  
Shivanthan Shanthikumar ◽  
Melanie R. Neeland ◽  
Richard Saffery ◽  
Sarath C. Ranganathan ◽  
Alicia Oshlack ◽  
...  

In epigenome-wide association studies analysing DNA methylation from samples containing multiple cell types, it is essential to adjust the analysis for cell type composition. One well-established strategy for achieving this is reference-based cell type deconvolution, which relies on knowledge of the DNA methylation profiles of purified constituent cell types. These profiles are used to estimate the cell type proportions of each sample, which can then be incorporated to adjust the association analysis. Bronchoalveolar lavage is commonly used to sample the lung in clinical practice and contains a mixture of different cell types that can vary in proportion across samples, affecting the overall methylation profile. A current barrier to the use of bronchoalveolar lavage in DNA methylation-based research is the lack of reference DNA methylation profiles for each of the constituent cell types, which makes reference-based cell composition estimation difficult. Herein, we use bronchoalveolar lavage samples collected from children with cystic fibrosis to define DNA methylation profiles for the four most common and clinically relevant cell types: alveolar macrophages, granulocytes, lymphocytes and alveolar epithelial cells. We then demonstrate the use of these methylation profiles in conjunction with an established reference-based methylation deconvolution method to estimate the cell type composition of two different tissue types: a publicly available dataset derived from artificial blood-based cell mixtures and further bronchoalveolar lavage samples. The reference DNA methylation profiles developed in this work can be used for future reference-based cell type composition estimation of bronchoalveolar lavage. This will facilitate the use of this tissue in studies examining the role of DNA methylation in lung health and disease.
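The idea behind reference-based deconvolution can be illustrated with a minimal sketch. It assumes a sample's methylation value at each CpG is a weighted sum of the reference profiles, with non-negative weights equal to the cell type proportions. The abstract does not say which established method the authors used, so the projected-gradient solver below is a generic stand-in. The function name `estimate_proportions` and all reference values are hypothetical and chosen only for illustration.

```python
def estimate_proportions(reference, sample, n_iter=2000):
    """Estimate cell type proportions by non-negative least squares.

    reference: list of CpG rows, each holding one mean beta value per cell type.
    sample:    list of beta values for the mixed sample, one per CpG.
    """
    n_cpg = len(reference)
    n_types = len(reference[0])
    # Precompute R^T R and R^T y so each iteration stays cheap.
    gram = [[sum(reference[i][a] * reference[i][b] for i in range(n_cpg))
             for b in range(n_types)] for a in range(n_types)]
    rty = [sum(reference[i][a] * sample[i] for i in range(n_cpg))
           for a in range(n_types)]
    # The trace bounds the largest eigenvalue of R^T R, so 1/trace is a safe step.
    step = 1.0 / sum(gram[a][a] for a in range(n_types))
    weights = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        grad = [sum(gram[a][b] * weights[b] for b in range(n_types)) - rty[a]
                for a in range(n_types)]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, w - step * g) for w, g in zip(weights, grad)]
    total = sum(weights)
    # Rescale so the estimates sum to one, as proportions should.
    return [w / total for w in weights] if total > 0 else weights


# Made-up reference profiles (rows = CpGs, columns = cell types).
CELL_TYPES = ["alveolar macrophages", "granulocytes",
              "lymphocytes", "alveolar epithelial cells"]
REFERENCE = [
    [0.9, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.9],
    [0.5, 0.5, 0.1, 0.1],
    [0.1, 0.1, 0.5, 0.5],
]

# Simulate a noise-free mixed sample with known proportions, then recover them.
true_props = [0.5, 0.3, 0.15, 0.05]
mixed = [sum(r * p for r, p in zip(row, true_props)) for row in REFERENCE]
estimated = estimate_proportions(REFERENCE, mixed)
for name, value in zip(CELL_TYPES, estimated):
    print(f"{name}: {value:.3f}")
```

In practice the hand-written loop would likely be replaced by a library solver such as `scipy.optimize.nnls`, and the estimated proportions would then enter the association model as covariates.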


Epigenomics ◽  
2020 ◽  
Author(s):  
Yen-Chen A Feng ◽  
Yichen Guo ◽  
Lucile Pain ◽  
G Mark Lathrop ◽  
Catherine Laprise ◽  
...  

Aim: To develop a method for estimating cell-specific effects in epigenomic association studies in the presence of cell type heterogeneity. Materials & methods: We used a Monte Carlo Expectation-Maximization (MCEM) algorithm with a Metropolis–Hastings sampler to reconstruct the ‘missing’ cell-specific methylation levels and to estimate their associations with phenotypes free of confounding by cell type proportions. Results: Simulations showed reliable performance of the method under various settings, including when the cell type is rare. Application to a real dataset recapitulated the directly measured cell-specific methylation pattern in whole blood. Conclusion: This work provides a framework to identify important cell groups and account for cell type composition, which is useful for studying the role of epigenetic changes in human traits and diseases.


2020 ◽  
Author(s):  
Miao Rui ◽  
Dang Qi ◽  
Huang Hai Hui ◽  
Xia Liang Yong ◽  
Yong Liang

Abstract Background: In epigenome-wide association studies (EWAS), the mixed methylation signal that arises from combining different cell types may lead researchers to identify false methylation sites as related to the phenotype of interest. To address this problem, researchers have proposed reference-free methods based on sparse principal component analysis (PCA) to correct for false discoveries in EWAS. However, the existing model assumes that all methylation sites have the same a priori probability in each PC loading, although it is known that the genes corresponding to the methylation sites already have a network structure. In this paper, we show that the results of the existing EWAS correction model are still not good enough. Integrating the existing methylation network as prior knowledge into the sparse PCA model can effectively improve the correction ability of the existing model. Result: Based on these ideas, we propose GN-ReFAEWAS, a model that incorporates the prior methylation gene network structure into the PCA framework for feature extraction. This model can be used to correct false discoveries in EWAS. The GN-ReFAEWAS model does not need cell count data and can estimate cell type composition from methylation principal components. The key to this model is solving a sparse regularization problem on the methylation network, which this paper does using a regularization and random sampling algorithm. We ran experiments on one simulated data set and three real data sets and compared the model against four existing EWAS correction models. The experimental results show that the GN-ReFAEWAS model is superior to the existing models. Conclusion: The results show that the GN-ReFAEWAS model can provide a better estimate of cell-type composition and reduce false positives in EWAS.
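The basic premise of PCA-based reference-free correction can be sketched as follows. This is plain PCA via power iteration, not the sparse, network-regularized GN-ReFAEWAS model itself. It assumes that cell type composition is the dominant source of variation, so the first principal component tracks it. The function names and the simulated values are hypothetical.

```python
import math


def top_component_scores(data, n_iter=200):
    """Return per-sample scores on the first principal component.

    data: list of samples, each a list of methylation beta values (one per CpG).
    """
    n, m = len(data), len(data[0])
    # Center each CpG (column) across samples.
    means = [sum(row[j] for row in data) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in data]
    # Sample-by-sample Gram matrix; its top eigenvector gives PC1 scores up to scale.
    gram = [[sum(a * b for a, b in zip(centered[i], centered[k]))
             for k in range(n)] for i in range(n)]
    # Power iteration from a fixed, non-constant starting vector.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(n_iter):
        nxt = [sum(gram[i][k] * vec[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # No variation in the data; keep the current vector.
        vec = [x / norm for x in nxt]
    return vec


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Simulated data: each CpG depends linearly on one hidden cell type proportion.
proportions = [0.2, 0.35, 0.5, 0.65, 0.8, 0.4]   # one value per sample
baseline = [0.2, 0.5, 0.7, 0.4, 0.6]             # one value per CpG
slope = [0.5, -0.3, 0.2, 0.0, -0.4]              # composition effect per CpG
data = [[b + s * p for b, s in zip(baseline, slope)] for p in proportions]

scores = top_component_scores(data)
print(f"|correlation| between PC1 scores and hidden proportion: "
      f"{abs(pearson(scores, proportions)):.3f}")
```

The PC1 scores could then be added as a covariate in the association test, standing in for the cell counts that were never measured. The sign of a principal component is arbitrary, which is why the sketch reports the absolute correlation.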


2017 ◽  
Author(s):  
Shijie C Zheng ◽  
Stephan Beck ◽  
Andrew E. Jaffe ◽  
Devin C. Koestler ◽  
Kasper D. Hansen ◽  
...  

Abstract Recently, a study by Rahmani et al [1] claimed that a reference-free cell-type deconvolution method, called ReFACTor, leads to improved power and improved estimates of cell-type composition compared to competing reference-free and reference-based methods in the context of Epigenome-Wide Association Studies (EWAS). However, we identified many critical flaws (both conceptual and statistical in nature), which seriously question the validity of their claims. We outlined constructive criticism in a recent correspondence letter, Zheng et al [2]. The purpose of this letter is two-fold. First, to present additional analyses, which demonstrate that our original criticism is statistically sound. Second, to highlight additional serious concerns, which Rahmani et al have not yet addressed. In summary, we find that ReFACTor has not been demonstrated to outperform state-of-the-art reference-free methods such as SVA or RefFreeEWAS, nor state-of-the-art reference-based methods. Thus, the claim by Rahmani et al (a claim reiterated in their recent response letter [3]) that ReFACTor represents an advance over the state-of-the-art is not supported by an objective and rigorous statistical analysis of the data.


Author(s):  
Shijie C Zheng ◽  
Charles E Breeze ◽  
Stephan Beck ◽  
Danyue Dong ◽  
Tianyu Zhu ◽  
...  

Abstract Summary It is well recognized that cell-type heterogeneity hampers the interpretation of Epigenome-Wide Association Studies (EWAS). Many tools have emerged to address this issue, including several R/Bioconductor packages that infer cell-type composition. Here we present a web application for cell-type deconvolution, which offers the functionality of our EpiDISH Bioconductor/R package in a user-friendly GUI environment. Users can upload their data to infer cell-type composition and differentially methylated cytosines in individual cell-types (DMCTs) for a range of different tissues. Availability and implementation EpiDISH web server is implemented with Shiny in R, and is freely available at https://www.biosino.org/EpiDISH/.


2014 ◽  
Vol 11 (3) ◽  
pp. 309-311 ◽  
Author(s):  
James Zou ◽  
Christoph Lippert ◽  
David Heckerman ◽  
Martin Aryee ◽  
Jennifer Listgarten

2019 ◽  
Author(s):  
Mike Thompson ◽  
Zeyuan Johnson Chen ◽  
Elior Rahmani ◽  
Eran Halperin

Abstract DNA methylation remains one of the most widely studied epigenetic markers. One of the major challenges in population studies of methylation is the presence of global methylation effects that may mask local signals. Such global effects may be due to either technical effects (e.g., batch effects) or biological effects (e.g., cell-type composition, genetics). Many methods have been developed for the detection of such global effects, typically in the context of epigenome-wide association studies. However, current unsupervised methods do not distinguish between biological and technical effects, resulting in a loss of highly relevant information. Though supervised methods can be used to estimate known biological effects, it remains difficult to identify and estimate unknown biological effects that globally affect the methylome. Here, we propose CONFINED, a reference-free method based on sparse canonical correlation analysis that captures replicable sources of variation (such as age, sex, and cell-type composition) across multiple methylation datasets and distinguishes them from dataset-specific sources of variability (e.g., technical effects). Consequently, we demonstrate through simulated and real data that by leveraging multiple datasets simultaneously, our approach captures several replicable sources of biological variation better than previous reference-free methods and is considerably more robust to technical noise than previous reference-free methods. CONFINED is available as an R package as detailed at https://github.com/cozygene/CONFINED.

