Preprocessing, normalization and integration of the Illumina HumanMethylationEPIC array

2016 ◽  
Author(s):  
Jean-Philippe Fortin ◽  
Timothy J. Triche ◽  
Kasper D. Hansen

Abstract: The minfi package is widely used for analyzing Illumina DNA methylation array data. Here we describe modifications to the minfi package required to support the HumanMethylationEPIC ("EPIC") array from Illumina. We discuss methods for the joint analysis and normalization of data from the HumanMethylation450 ("450k") and EPIC platforms. We also introduce the single-sample Noob (ssNoob) method, a normalization procedure suitable for incremental preprocessing of individual HumanMethylation arrays. Based on our results, we recommend the ssNoob method when integrating data from multiple generations of Infinium methylation arrays. Finally, we show how to use reference 450k datasets to estimate cell type composition of samples on EPIC arrays. The cumulative effect of these updates is to ensure that minfi provides the tools to best integrate existing and forthcoming Illumina methylation array data.

F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 1281 ◽  
Author(s):  
Jovana Maksimovic ◽  
Belinda Phipson ◽  
Alicia Oshlack

Methylation in the human genome is known to be associated with development and disease. The Illumina Infinium methylation arrays are by far the most common way to interrogate methylation across the human genome. This paper provides a Bioconductor workflow using multiple packages for the analysis of methylation array data. Specifically, we demonstrate the steps involved in a typical differential methylation analysis pipeline including: quality control, filtering, normalization, data exploration and statistical testing for probe-wise differential methylation. We further outline other analyses such as differential methylation of regions, differential variability analysis, estimating cell type composition and gene ontology testing. Finally, we provide some examples of how to visualise methylation array data.


2020 ◽  
Author(s):  
Benjamin Chidester ◽  
Tianming Zhou ◽  
Jian Ma

Abstract: Spatial transcriptomics technologies promise to reveal spatial relationships of cell-type composition in complex tissues. However, the development of computational methods that capture the unique properties of single-cell spatial transcriptome data to unveil cell identities remains a challenge. Here, we report SpiceMix, a new probabilistic model that enables effective joint analysis of spatial information and gene expression of single cells based on spatial transcriptome data. Both simulation and real data evaluations demonstrate that SpiceMix consistently improves upon the inference of the intrinsic cell types compared with existing approaches. As a proof-of-principle, we use SpiceMix to analyze single-cell spatial transcriptome data of the mouse primary visual cortex acquired by seqFISH+ and STARmap. We find that SpiceMix can improve cell identity assignments and uncover potentially new cell subtypes. SpiceMix is a generalizable framework for analyzing spatial transcriptome data that may provide critical insights into the cell-type composition and spatial organization of cells in complex tissues.


2016 ◽  
Author(s):  
Jovana Maksimovic ◽  
Belinda Phipson ◽  
Alicia Oshlack

Abstract: Methylation in the human genome is known to be associated with development and disease. The Illumina Infinium methylation arrays are by far the most common way to interrogate methylation across the human genome. This paper provides a Bioconductor workflow using multiple packages for the analysis of methylation array data. Specifically, we demonstrate the steps involved in a typical differential methylation analysis pipeline including: quality control, filtering, normalization, data exploration and statistical testing for probe-wise differential methylation. We further outline other analyses such as differential methylation of regions, differential variability analysis, estimating cell type composition and gene ontology testing. Finally, we provide some examples of how to visualise methylation array data.


F1000Research ◽  
2017 ◽  
Vol 5 ◽  
pp. 1281 ◽  
Author(s):  
Jovana Maksimovic ◽  
Belinda Phipson ◽  
Alicia Oshlack

Methylation in the human genome is known to be associated with development and disease. The Illumina Infinium methylation arrays are by far the most common way to interrogate methylation across the human genome. This paper provides a Bioconductor workflow using multiple packages for the analysis of methylation array data. Specifically, we demonstrate the steps involved in a typical differential methylation analysis pipeline including: quality control, filtering, normalization, data exploration and statistical testing for probe-wise differential methylation. We further outline other analyses such as differential methylation of regions, differential variability analysis, estimating cell type composition and gene ontology testing. Finally, we provide some examples of how to visualise methylation array data.


2019 ◽  
Author(s):  
Zeran Li ◽  
Fabiana G. Farias ◽  
Umber Dube ◽  
Jorge L. Del-Aguila ◽  
Kathie A. Mihindukulasuriya ◽  
...  

Abstract
Background: In previous studies, we observed decreased neuronal and increased astrocyte proportions in AD cases in parietal brain cortex by using a deconvolution method for bulk RNA-seq. These findings suggested that genetic risk factors associated with AD etiology have a specific effect on the cellular composition of AD brains. The goal of this study is to investigate whether there are genetic determinants of brain cell composition.
Methods: Using cell type composition inferred from the transcriptome as a disease status proxy, we performed cell type association analysis to identify novel loci related to cellular population changes in disease cohorts. We imputed and merged genotyping data from seven studies, for a total of 1,669 samples, and derived major CNS cell type proportions from cortical RNA-seq data. We also inferred the RNA transcript integrity number (TIN) to account for variation in RNA quality. The model used in the analysis was: normalized neuronal proportion ∼ SNP + Age + Gender + PC1 + PC2 + median TIN.
Results: A variant, rs1990621, located in the TMEM106B gene region was significantly associated with neuronal proportion (p = 6.40 × 10^-7) and replicated in an independent dataset. The association became more significant when we combined the discovery and replication datasets in a multi-tissue meta-analysis (p = 9.42 × 10^-9) and a joint analysis (p = 7.66 × 10^-10). This variant is in high LD with rs1990622 (r² = 0.98), which was previously identified as a protective variant in FTD cohorts. Further analyses indicated that this variant is associated with increased neuronal proportion in participants with neurodegenerative disorders, not only in the AD cohort but also in the cognitively normal elderly cohort. However, this effect was not observed in a younger schizophrenia cohort with a mean age of death < 65. The second most significant locus for neuronal proportion was APOE, which suggested that using neuronal proportion as an informative endophenotype could help identify loci associated with neurodegeneration.
Conclusion: This result suggests that a common pathway involving TMEM106B, shared by aging groups in the presence or absence of neurodegenerative pathology, may contribute to cognitive preservation and neuronal protection.
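
The model stated in the abstract above (normalized neuronal proportion ∼ SNP + Age + Gender + PC1 + PC2 + median TIN) can be illustrated as an ordinary least squares fit. The sketch below is only an illustration on synthetic data: the sample size, effect sizes, covariate distributions, and the helper name `fit_linear_model` are all assumptions made here, and the abstract does not say which software or estimation procedure the authors used.

```python
import numpy as np

# Term names follow the model formula quoted in the abstract.
COVARIATES = ["intercept", "SNP", "Age", "Gender", "PC1", "PC2", "median_TIN"]


def fit_linear_model(y, snp, age, gender, pc1, pc2, median_tin):
    """Hypothetical helper: OLS fit of y on the covariates, returned as {term: coefficient}."""
    X = np.column_stack([np.ones_like(y), snp, age, gender, pc1, pc2, median_tin])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(COVARIATES, coef))


# Synthetic data only; none of these values come from the study.
rng = np.random.default_rng(0)
n = 500                                          # assumed sample size for the demo
snp = rng.integers(0, 3, size=n).astype(float)   # allele dosage coded 0/1/2
age = rng.normal(80, 8, size=n)
gender = rng.integers(0, 2, size=n).astype(float)
pc1 = rng.normal(size=n)                         # genotype principal components
pc2 = rng.normal(size=n)
median_tin = rng.normal(70, 5, size=n)           # RNA quality covariate

true_snp_effect = 0.30                           # assumed effect, for illustration
y = (
    true_snp_effect * snp
    - 0.02 * age
    + 0.10 * gender
    + 0.05 * pc1
    + 0.05 * pc2
    + 0.01 * median_tin
    + rng.normal(0, 0.1, size=n)                 # residual noise
)

fit = fit_linear_model(y, snp, age, gender, pc1, pc2, median_tin)
print(f"estimated SNP effect: {fit['SNP']:.3f}")
```

In a genome-wide setting this fit would be repeated for each variant, with the SNP coefficient and its p-value as the quantities of interest; the covariates stay the same across variants.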


2019 ◽  
Vol 48 (D1) ◽  
pp. D890-D895 ◽  
Author(s):  
Zhuang Xiong ◽  
Mengwei Li ◽  
Fei Yang ◽  
Yingke Ma ◽  
Jian Sang ◽  
...  

Abstract: Epigenome-Wide Association Study (EWAS) has become an effective strategy for exploring the epigenetic basis of complex traits. Over the past decade, a large amount of epigenetic data, especially from DNA methylation arrays, has accumulated as a result of numerous EWAS projects. We present EWAS Data Hub (https://bigd.big.ac.cn/ewas/datahub), a resource for collecting and normalizing DNA methylation array data as well as archiving associated metadata. The current release of EWAS Data Hub integrates a comprehensive collection of DNA methylation array data from 75,344 samples and employs an effective normalization method to remove batch effects among different datasets. Accordingly, taking advantage of both massive high-quality DNA methylation data and standardized metadata, EWAS Data Hub provides reference DNA methylation profiles under different contexts, covering 81 tissues/cell types (including 25 brain parts and 25 blood cell types), six ancestry categories, and 67 diseases (including 39 cancers). In summary, EWAS Data Hub holds great promise for aiding the retrieval and discovery of methylation-based biomarkers for phenotype characterization, clinical treatment and health care.


PLoS ONE ◽  
2016 ◽  
Vol 11 (1) ◽  
pp. e0147519 ◽  
Author(s):  
Yuh Shiwa ◽  
Tsuyoshi Hachiya ◽  
Ryohei Furukawa ◽  
Hideki Ohmomo ◽  
Kanako Ono ◽  
...  
