Correlation of Infinium HumanMethylation450K and MethylationEPIC BeadChip arrays in cartilage

2019 ◽  
Author(s):  
Kathleen Cheung ◽  
Marjolein J. Burgers ◽  
David A. Young ◽  
Simon Cockell ◽  
Louise N. Reynard

Abstract Background DNA methylation of CpG sites is commonly measured using Illumina Infinium BeadChip platforms. The Infinium MethylationEPIC array has replaced the Infinium Methylation450K array. The two arrays use the same technology, with the EPIC array assaying 865,859 CpG sites, almost double the number of sites present on the 450K array. In this study, we compare DNA methylation values of shared CpGs of the same human cartilage samples assayed using both platforms. Methods DNA methylation was measured in 21 human cartilage samples using the Illumina Infinium Methylation450K BeadChip and the Infinium MethylationEPIC array. Additional matched 450K and EPIC data in whole tumour and whole blood were downloaded from GEO GSE92580 and GSE86833 respectively. Data were processed using the Bioconductor package Minfi. Additionally, DNA methylation of six CpG sites was validated for the same 21 cartilage samples by use of pyrosequencing. Results In cartilage samples, overall sample correlations between methylation values generated by the two arrays were high (Pearson correlation coefficient r > 0.96). However, 50.5% of CpG sites showed poor correlation (r < 0.2) between arrays. Sites with limited variance and with either very high or very low methylation levels in cartilage exhibited lower correlation values, corroborating prior studies in whole blood. Bisulfite pyrosequencing did not highlight one array as generating more accurate methylation values than the other. For a specific CpG site, the array methylation correlation coefficient differed between cartilage, tumour and whole blood, reflecting the difference in methylation variance between cell types. These patterns can be observed across different tissues with different CpG site variances. When performing differential methylation analysis, the mean probe correlation coefficient increased with increasing Δβ threshold used. Conclusion CpG sites with low variability within a tissue showed poor reproducibility between arrays. However, variance and thus reproducibility differ across tissue types. Therefore, researchers should be cautious when analysing methylation of CpG sites that show low methylation variance within the cell type of interest, regardless of platform or method used to assay methylation.
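The contrast between high per-sample correlation and poor per-site correlation can be illustrated with a small simulation. This is a minimal sketch and not the authors' Minfi pipeline: the site names, the measurement noise (SD 0.02) and the between-sample SDs are hypothetical values chosen only to show the effect of low variance.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient; nan if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def simulate_site(rng, mean, biological_sd, n_samples=21, noise_sd=0.02):
    """Simulate one CpG measured on two arrays across the same samples.

    Both arrays see the same underlying beta value plus independent
    measurement noise (an assumed noise model, for illustration only).
    """
    clip = lambda v: min(1.0, max(0.0, v))
    truth = [rng.gauss(mean, biological_sd) for _ in range(n_samples)]
    array_a = [clip(t + rng.gauss(0, noise_sd)) for t in truth]
    array_b = [clip(t + rng.gauss(0, noise_sd)) for t in truth]
    return array_a, array_b

rng = random.Random(0)
# Hypothetical sites: two nearly invariant, one variable between samples.
sites = {
    "invariant_high": simulate_site(rng, 0.95, 0.005),
    "invariant_low": simulate_site(rng, 0.05, 0.005),
    "variable": simulate_site(rng, 0.50, 0.15),
}

# Per-site correlation: one CpG, all samples, array A vs array B.
per_site_r = {name: pearson(a, b) for name, (a, b) in sites.items()}

# Per-sample correlation: one sample, all CpGs, array A vs array B.
sample0_a = [a[0] for a, _ in sites.values()]
sample0_b = [b[0] for _, b in sites.values()]
per_sample_r = pearson(sample0_a, sample0_b)

print(per_site_r, per_sample_r)
```

Under these assumptions the per-sample correlation stays high because it is driven by the large differences between sites, while the invariant sites correlate poorly because measurement noise dominates their small between-sample variance.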

2020 ◽  
Vol 13 (1) ◽  
Author(s):  
Chang Shu ◽  
Xinyu Zhang ◽  
Bradley E. Aouizerat ◽  
Ke Xu

Abstract Background Epigenome-wide association studies (EWAS) have been widely applied to identify methylation CpG sites associated with human disease. To date, the Infinium MethylationEPIC array (EPIC) is commonly used for high-throughput DNA methylation profiling. However, the EPIC array covers only 30% of the human methylome. Methylation Capture bisulfite sequencing (MC-seq) captures target regions of methylome and has advantages of extensive coverage in the methylome at an affordable price. Methods Epigenome-wide DNA methylation in four peripheral blood mononuclear cell samples was profiled by using SureSelectXT Methyl-Seq for MC-seq and EPIC platforms separately. CpG site-based reproducibility of MC-seq was assessed with DNA sample inputs ranging in quantity of high (> 1000 ng), medium (300–1000 ng), and low (150 ng–300 ng). To compare the performance of MC-seq and the EPIC arrays, we conducted a Pearson correlation and methylation value difference at each CpG site that was detected by both MC-seq and EPIC. We compared the percentage and counts in each CpG island and gene annotation between MC-seq and the EPIC array. Results After quality control, an average of 3,708,550 CpG sites per sample were detected by MC-seq with DNA quantity > 1000 ng. Reproducibility of DNA methylation in MC-seq-detected CpG sites was high among samples with high, medium, and low DNA inputs (r > 0.96). The EPIC array captured an average of 846,464 CpG sites per sample. Compared with the EPIC array, MC-seq detected more CpGs in coding regions and CpG islands. Among the 472,540 CpG sites captured by both platforms, methylation of a majority of CpG sites was highly correlated in the same sample (r: 0.98–0.99). However, methylation for a small proportion of CpGs (N = 235) differed significantly between the two platforms, with differences in beta values of greater than 0.5. 
Conclusions Our results show that MC-seq is an efficient and reliable platform for methylome profiling with a broader coverage of the methylome than the array-based platform. Although methylation measurements in the majority of CpGs are highly correlated, a number of CpG sites show large discrepancies between the two platforms, which warrants further investigation and needs cautious interpretation.
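The platform comparison described above (restrict to CpG sites detected by both platforms, then flag sites whose beta values differ by more than 0.5) can be sketched as follows. The identifiers and beta values are made up for illustration, and the function name `compare_platforms` is hypothetical; real MC-seq output would first need to be matched to array probes by genomic coordinate.

```python
def compare_platforms(mcseq, epic, threshold=0.5):
    """Return (shared CpG IDs, IDs whose beta values differ by more than threshold)."""
    shared = sorted(mcseq.keys() & epic.keys())
    discordant = [cpg for cpg in shared if abs(mcseq[cpg] - epic[cpg]) > threshold]
    return shared, discordant

# Hypothetical beta values for one sample, keyed by made-up CpG identifiers.
mcseq = {"cpg_a": 0.91, "cpg_b": 0.12, "cpg_c": 0.85, "cpg_d": 0.40}
epic = {"cpg_a": 0.89, "cpg_b": 0.10, "cpg_c": 0.20, "cpg_e": 0.55}

shared, discordant = compare_platforms(mcseq, epic)
print(shared)      # sites present on both platforms
print(discordant)  # shared sites with |difference in beta| > 0.5
```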


Genes ◽  
2021 ◽  
Vol 12 (6) ◽  
pp. 870
Author(s):  
Jiansheng Zhang ◽  
Hongli Fu ◽  
Yan Xu

In recent years, scientists have found a close correlation between DNA methylation and aging. With in-depth research in the field of DNA methylation, researchers have established quantitative statistical relationships to predict individual ages. This work used human blood tissue samples to study the association between age and DNA methylation. We built two predictors based on healthy and disease data, respectively. For the healthy data, we retrieved a total of 1191 samples from four previous reports. By calculating the Pearson correlation coefficient between age and DNA methylation values, 111 age-related CpG sites were selected. Gradient boosting regression was utilized to build the predictive model and achieved an R² of 0.86 and a MAD of 3.90 years on the testing dataset, which were better than the other four regression methods as well as Horvath’s results. For the disease data, 354 rheumatoid arthritis samples were retrieved from a previous study. Then, 45 CpG sites were selected to build the predictor, and the corresponding MAD and R² were 3.11 years and 0.89 on the testing dataset respectively, which showed the robustness of our predictor. Our results were better than the ones from the other four regression methods. Finally, we also analyzed the twenty-four CpG sites common to both the healthy and disease datasets, which illustrated the functional relevance of the selected CpG sites.
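The feature-selection step (keep CpG sites whose methylation correlates with age) and the error metric can be sketched in a few lines. This is an illustration only: the cutoff of 0.5, the CpG names and the values are hypothetical, MAD is taken here to be the mean absolute difference between predicted and chronological age (the abstract does not expand the acronym), and the gradient boosting model itself is not reproduced.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; nan if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def select_age_cpgs(betas, ages, min_abs_r=0.5):
    """Keep CpGs whose |r| with age reaches the (assumed) cutoff."""
    return [cpg for cpg, values in betas.items()
            if abs(pearson(values, ages)) >= min_abs_r]

def mean_abs_diff(predicted, observed):
    """Mean absolute difference between predicted and observed ages."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)

# Hypothetical data: five samples, three CpG sites.
ages = [20, 30, 40, 50, 60]
betas = {
    "cg_up": [0.2, 0.3, 0.4, 0.5, 0.6],    # rises with age
    "cg_down": [0.8, 0.7, 0.6, 0.5, 0.4],  # falls with age
    "cg_flat": [0.5, 0.6, 0.5, 0.6, 0.5],  # unrelated to age
}

selected = select_age_cpgs(betas, ages)
mad = mean_abs_diff([22, 29, 45, 50, 57], ages)  # made-up predictions
print(selected, mad)
```

The selected sites would then serve as input features for whichever regression model is being evaluated.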


2017 ◽  
Author(s):  
John Dou ◽  
Rebecca J. Schmidt ◽  
Kelly S. Benke ◽  
Craig Newschaffer ◽  
Irva Hertz-Picciotto ◽  
...  

Abstract Background Cord blood DNA methylation is associated with numerous health outcomes and environmental exposures. Whole cord blood DNA reflects all nucleated blood cell types, while centrifuging whole blood separates red blood cells by generating a white blood cell buffy coat. Both sample types are used in DNA methylation studies. Cell types have unique methylation patterns and processing can impact cell distributions, which may influence comparability. Objectives To evaluate differences in cell composition and DNA methylation between buffy coat and whole cord blood samples. Methods Cord blood DNA methylation was measured with the Infinium EPIC BeadChip (Illumina) in 8 individuals, each contributing buffy coat and whole blood samples. We analyzed principal components (PC) of methylation, performed hierarchical clustering, and computed correlations of mean-centered methylation between pairs. We conducted moderated t-tests on single sites and estimated cell composition. Results DNA methylation PCs were associated with individual (P_PC1 = 1.4 × 10⁻⁹; P_PC2 = 2.9 × 10⁻⁵; P_PC3 = 3.8 × 10⁻⁵; P_PC4 = 4.2 × 10⁻⁶; P_PC5 = 9.9 × 10⁻¹³), and not with sample type (P_PC1-5 > 0.7). Samples hierarchically clustered by individual. Pearson correlations of mean-centered methylation between paired individual samples ranged from r = 0.66 to r = 0.87. No individual site significantly differed between buffy coat and whole cord blood when adjusting for multiple comparisons (5 sites had unadjusted P < 10⁻⁵). Estimated cell type proportions did not differ by sample type (P = 0.86), and estimated cell counts were highly correlated between paired samples (r = 0.99). Conclusions Differences in methylation and cell composition between buffy coat and whole cord blood are much lower than inter-individual variation, demonstrating that both sample preparation types can be analytically combined and compared.
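Mean-centering matters here because raw methylation profiles correlate strongly between any two samples, simply because most sites are consistently high or consistently low. A minimal sketch with made-up values for two individuals and four CpG sites (the sample names and numbers are hypothetical, not the study's data):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_center(samples):
    """Subtract each site's mean across all samples from every sample."""
    names = list(samples)
    n_sites = len(samples[names[0]])
    site_means = [sum(samples[s][j] for s in names) / len(names)
                  for j in range(n_sites)]
    return {s: [v - m for v, m in zip(samples[s], site_means)] for s in names}

# Hypothetical beta values at four CpG sites.
samples = {
    "ind1_buffy": [0.10, 0.50, 0.90, 0.30],
    "ind1_whole": [0.11, 0.51, 0.89, 0.31],
    "ind2_buffy": [0.20, 0.40, 0.80, 0.40],
    "ind2_whole": [0.19, 0.41, 0.81, 0.39],
}
centered = mean_center(samples)

raw_unpaired = pearson(samples["ind1_buffy"], samples["ind2_buffy"])
centered_paired = pearson(centered["ind1_buffy"], centered["ind1_whole"])
centered_unpaired = pearson(centered["ind1_buffy"], centered["ind2_buffy"])
print(raw_unpaired, centered_paired, centered_unpaired)
```

In this toy example the raw correlation between different individuals is high, whereas after centering only the two preparations from the same individual remain positively correlated.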


2018 ◽  
Vol 21 (2) ◽  
pp. 101-111 ◽  
Author(s):  
Maarten Caspers ◽  
Sara Blocquiaux ◽  
Ruben Charlier ◽  
Sara Knaeps ◽  
Johan Lefevre ◽  
...  

The aim of this exploratory study was to investigate how sedentary behavior (SB) and physical activity (PA) influence DNA methylation at a global, gene-specific, and health-related pathway level. SB, light PA (LPA), and moderate-to-vigorous PA (MVPA) were assessed objectively for 41 Flemish men using the SenseWear Pro 3 Armband. CpG site-specific methylation in leukocytes was determined using the Illumina HumanMethylation 450 BeadChip. Correlations were calculated between time spent on the three PA intensity levels and global DNA methylation, using a z-score-based method to determine global DNA methylation levels. To determine whether CpG site-specific methylation can be predicted by these three PA intensity levels, linear regression analyses were performed. Based on the significantly associated CpG sites at α = 0.005, lists were created including all genes with a promoter region overlapping these CpG sites. A biological pathway analysis determined to what extent these genes are overrepresented within several pathways. No significant associations were observed between global DNA methylation and SB (r = 0.084), LPA (r = -0.168), or MVPA (r = -0.125), although the direction of the correlation coefficients is opposite to what is generally reported in literature. SB has a different impact on global and gene-specific methylation than PA, but also LPA and MVPA affect separate genes and pathways. Furthermore, the function of a pathway seems to determine its association with SB, LPA, or MVPA. Multiple PA intensity levels, including SB, should be taken into account in future studies investigating the effect of physical (in)activity on human health through epigenetic mechanisms.


2020 ◽  
Author(s):  
Max T. Aung ◽  
Kelly M. Bakulski ◽  
Jason I. Feinberg ◽  
John F. Dou ◽  
John D. Meeker ◽  
...  

Abstract Background Metals exposures have important health effects in pregnancy. The maternal epigenome may be responsive to these exposures. We tested whether metals are associated with concurrent differential maternal whole blood DNA methylation. Methods In the Early Autism Risk Longitudinal Investigation (EARLI) cohort, we measured first or second trimester maternal blood metals concentrations (cadmium, lead, mercury, manganese, and selenium) in 215 participants using inductively coupled plasma mass spectrometry. DNA methylation in maternal whole blood was measured in the same specimens on the Illumina 450K array (201 participants). A subset sample of 97 women had both measures available for analysis, none of whom reported smoking during pregnancy. Linear regression was used to test for site-specific associations between individual metals and DNA methylation, adjusting for cell type composition and confounding variables. Discovery gene ontology analysis was conducted on the top 1,000 sites associated with each metal to elucidate downstream pathways. Results In multiple linear regression, we observed hypermethylation at 11 DNA methylation sites associated with lead (FDR q-value < 0.1), near the genes CYP24A1, ASCL2, FAT1, SNX31, NKX6-2, LRC4C, BMP7, HOXC11, PCDH7, ZSCAN18, and VIPR2. Lead-associated sites were enriched (FDR q-value < 0.1) for the pathways cell adhesion, nervous system development, and calcium ion binding. Manganese was associated with hypermethylation at four DNA methylation sites (FDR q-value < 0.1), one of which was near the gene ARID2. Manganese-associated sites were enriched for cellular metabolism pathways (FDR q-value < 0.1). Effect estimates for DNA methylation sites associated (p < 0.05) with cadmium, lead, and manganese were highly correlated (Pearson ρ > 0.86). Discussion Single DNA methylation sites associated with lead and manganese may be potential biomarkers of exposure or implicate downstream gene pathways. Future studies should replicate our findings to characterize potential toxicological mechanisms of trace metals through the maternal epigenome.
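An FDR q-value threshold such as the 0.1 used above is applied to q-values derived from the per-site p-values. The abstract does not name the procedure; the sketch below assumes the widely used Benjamini-Hochberg adjustment and uses made-up p-values, so it illustrates the idea rather than the study's exact analysis.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

# Hypothetical per-site p-values from a site-by-site regression.
pvalues = [0.001, 0.02, 0.03, 0.5]
qvalues = bh_qvalues(pvalues)
significant = [i for i, q in enumerate(qvalues) if q < 0.1]
print(qvalues, significant)
```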


2018 ◽  
Author(s):  
Xiangyu Luo ◽  
Can Yang ◽  
Yingying Wei

In epigenome-wide association studies, the measured signals for each sample are a mixture of methylation profiles from different cell types. The current approaches to the association detection only claim whether a cytosine-phosphate-guanine (CpG) site is associated with the phenotype or not, but they cannot determine the cell type in which the risk-CpG site is affected by the phenotype. Here, we propose a solid statistical method, HIgh REsolution (HIRE), which not only substantially improves the power of association detection at the aggregated level as compared to the existing methods but also enables the detection of risk-CpG sites for individual cell types.


2021 ◽  
Author(s):  
Brooke J. Smith ◽  
Alexandre A Lussier ◽  
Janine Cerutti ◽  
Daniel J. Schaid ◽  
Andrew J. Simpkin ◽  
...  

Background: Exposure to adversity during childhood is estimated to at least double the risk of depression later in life. Some evidence suggests childhood adversity may have a greater impact on depression risk, if experienced during specific windows of development called sensitive periods. During these sensitive periods, there is evidence that adversity may leave behind biological memories, including changes in DNA methylation (DNAm). Here we ask if those changes play a role in the link between adversity and later adolescent depressive symptoms. Methods: We applied a method for high-dimensional mediation analysis using data from a subsample (n=627-675) of the Avon Longitudinal Study of Parents and Children. We first assessed the possibility of time-dependent relationships between seven types of childhood adversity (caregiver abuse, physical/sexual abuse, maternal psychopathology, one-adult household, family instability, financial stress, neighborhood disadvantage), measured on at least four occasions between ages 0-7 years, and adolescent depression at mean age 10.6. Specifically, we considered three types of life course hypotheses (sensitive periods, accumulation, and recency), and then evaluated which of these hypotheses had the strongest association in each adversity-adolescent depression relationship using the structured life course modeling approach (SLCMA; pronounced slick-mah). To conduct the mediation analyses, we used a combination of pruning and sure independence screening (a dimension reduction method) to reduce the number of methylated CpG sites under consideration to a viable subset for our sample size. We then applied a sparse group lasso penalized model to identify the top mediating loci from that subset using the combined strength of the coefficient measuring the relationship between the childhood adversity and a CpG site (α) and of the coefficient measuring the relationship between the CpG site and depressive symptoms (β) as a metric. 
Using a Monte Carlo method for assessing mediation (MCMAM), we assigned a significance level and confidence interval to each identified mediator. Results: Across all seven adversities, we identified a total of 70 CpG sites that showed evidence of mediating the relationship between adversity and adolescent depression symptoms. Of these 70 mediators, 37 were significant at the p < 0.05 level when applying the MCMAM, a method tailored to estimating the significance of SEM-derived mediation effects. These sites exhibited four different mediating patterns, differentiated by the direction of α and β. These patterns had signals that were: (1) both positive (19 loci), (2) both negative (18 loci), (3) positive α and negative β (23 loci) or (4) negative α and positive β (10 loci). Conclusion: Our results suggest that DNAm partially mediates the relationship between different types of childhood adversity and depressive symptoms in adolescence. These findings provide insight into the biological mechanisms that link childhood adversity to depression, which will ultimately help develop treatments to prevent depression in more vulnerable populations.
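The Monte Carlo approach to mediation can be sketched as follows: draw α and β from normal distributions centred on their estimates, multiply the draws, and read a confidence interval for the indirect effect α × β from the percentiles of the products. The estimates and standard errors below are hypothetical, and the sketch assumes the two coefficients are estimated independently, which may not match the authors' implementation.

```python
import random

def monte_carlo_indirect_ci(alpha, se_alpha, beta, se_beta,
                            draws=20000, level=0.95, seed=1):
    """Percentile confidence interval for the indirect effect alpha * beta."""
    rng = random.Random(seed)
    products = sorted(
        rng.gauss(alpha, se_alpha) * rng.gauss(beta, se_beta)
        for _ in range(draws)
    )
    lower_index = int((1 - level) / 2 * draws)
    upper_index = int((1 + level) / 2 * draws) - 1
    return products[lower_index], products[upper_index]

# Hypothetical path estimates: adversity -> CpG (alpha), CpG -> symptoms (beta).
lower, upper = monte_carlo_indirect_ci(alpha=0.5, se_alpha=0.05,
                                       beta=0.4, se_beta=0.05)
print(lower, upper)  # an interval excluding zero suggests mediation
```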


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Nicholas D. Johnson ◽  
Xiumei Wu ◽  
Christopher D. Still ◽  
Xin Chu ◽  
Anthony T. Petrick ◽  
...  

Abstract Background Non-alcoholic fatty liver disease (NAFLD) is characterized by changes in cell composition that occur throughout disease pathogenesis, which includes the development of fibrosis in a subset of patients. DNA methylation (DNAm) is a plausible mechanism underlying these shifts, considering that DNAm profiles differ across tissues and cell types, and DNAm may play a role in cell-type differentiation. Previous work investigating the relationship between DNAm and fibrosis in NAFLD has been limited by sample size and the number of CpG sites interrogated. Results Here, we performed an epigenome-wide analysis using Infinium MethylationEPIC array data from 325 individuals with NAFLD, including 119 with severe fibrosis and 206 with no histological evidence of fibrosis. After adjustment for latent confounders, we identified 7 CpG sites whose DNAm associated with fibrosis (p < 5.96 × 10⁻⁸). Analysis of RNA-seq data collected from a subset of individuals (N = 56) revealed that gene expression at 288 genes associated with DNAm at one or more of the 7 fibrosis-related CpGs. DNAm-based estimates of cell-type proportions showed that estimated proportions of natural killer cells increased, while epithelial cell proportions decreased with disease stage. Finally, we used an elastic net regression model to assess DNAm as a biomarker of fibrotic stage and found that our model predicted fibrosis with a sensitivity of 0.93 and provided information beyond a model based solely on cell-type proportions. Conclusion These findings are consistent with DNAm as a mechanism underpinning or marking fibrosis-related shifts in cell composition and demonstrate the potential of DNAm as a possible biomarker of NAFLD fibrosis.

