Impact of regulatory variation across human iPSCs and differentiated cells

2016
Author(s): Nicholas E. Banovich, Yang I. Li, Anil Raj, Michelle C. Ward, Peyton Greenside, ...

Abstract: Induced pluripotent stem cells (iPSCs) are an essential tool for studying cellular differentiation and cell types that are otherwise difficult to access. We investigated the use of iPSCs and iPSC-derived cells to study the impact of genetic variation across different cell types and as models for studies of complex disease. We established a panel of iPSCs from 58 well-studied Yoruba lymphoblastoid cell lines (LCLs); 14 of these lines were further differentiated into cardiomyocytes. We characterized regulatory variation across individuals and cell types by measuring gene expression, chromatin accessibility and DNA methylation. Regulatory variation between individuals is lower in iPSCs than in the differentiated cell types, consistent with the intuition that developmental processes are generally canalized. While most cell type-specific regulatory quantitative trait loci (QTLs) lie in chromatin that is open only in the affected cell types, we found that 20% of cell type-specific QTLs are in shared open chromatin. Finally, we developed a deep neural network to predict open chromatin regions from DNA sequence alone and were able to use the sequences of segregating haplotypes to predict the effects of common SNPs on cell type-specific chromatin accessibility.

Author(s): Chaitanya Srinivasan, BaDoi N. Phan, Alyssa J. Lawler, Easwaran Ramamurthy, Michael Kleyman, ...

Abstract: Recent large genome-wide association studies (GWAS) have identified multiple confident risk loci linked to addiction-associated behavioral traits. Genetic variants linked to addiction-associated traits lie largely in non-coding regions of the genome, likely disrupting cis-regulatory element (CRE) function. CREs tend to be highly cell type-specific and may contribute to the functional development of the neural circuits underlying addiction. Yet, a systematic approach for predicting the impact of risk variants on the CREs of specific cell populations is lacking. To dissect the cell types and brain regions underlying addiction-associated traits, we applied LD score regression to compare GWAS to genomic regions collected from human and mouse assays for open chromatin, which is associated with CRE activity. We found enrichment of addiction-associated variants in putative regulatory elements marked by open chromatin in neuronal (NeuN+) nuclei collected from multiple prefrontal cortical areas and striatal regions known to play major roles in reward and addiction. To further dissect the cell type-specific basis of addiction-associated traits, we also identified enrichments in human orthologs of open chromatin regions of mouse neuron subtypes: cortical excitatory, PV, D1, and D2. Lastly, we developed machine learning models from mouse cell type-specific regions of open chromatin to further dissect human NeuN+ open chromatin regions into cortical excitatory or striatal D1 and D2 neurons and predict the functional impact of addiction-associated genetic variants. Our results suggest that different neuron subtypes within the reward system play distinct roles in the variety of traits that contribute to addiction.

Significance Statement: Our study on cell types and brain regions contributing to heritability of addiction-associated traits suggests that the conserved non-coding regions within cortical excitatory and striatal medium spiny neurons contribute to genetic predisposition for nicotine, alcohol, and cannabis use behaviors. This computational framework can flexibly integrate epigenomic data across species to screen for putative causal variants in a cell type- and tissue-specific manner across numerous complex traits.


2018
Author(s): Meaghan J Jones, Louie Dinh, Hamid Reza Razzaghian, Olivia de Goede, Julia L MacIsaac, ...

Background: DNA methylation profiling of peripheral blood leukocytes has many research applications, and characterizing the changes in DNA methylation of specific white blood cell types between newborn and adult could add insight into the maturation of the immune system. As a consequence of developmental changes, DNA methylation profiles derived from adult white blood cells are poor references for prediction of cord blood cell types from DNA methylation data. We thus examined cell-type specific differences in DNA methylation in leukocyte subsets between cord and adult blood, and assessed the impact of these differences on prediction of cell types in cord blood. Results: Though all cell types showed differences between cord and adult blood, some specific patterns stood out that reflected how the immune system changes after birth. In cord blood, lymphoid cells showed less variability than in adult, potentially demonstrating their naïve status. In fact, cord CD4 and CD8 T cells were so similar that genetic effects on DNA methylation were greater than cell type effects in our analysis, and CD8 T cell frequencies remained difficult to predict, even after optimizing the library used for cord blood composition estimation. Myeloid cells showed fewer changes between cord and adult and also less variability, with monocytes showing the fewest sites of DNA methylation change between cord and adult. Finally, including nucleated red blood cells in the reference library was necessary for accurate cell type predictions in cord blood. Conclusion: Changes in DNA methylation with age were highly cell type specific, and those differences paralleled what is known about the maturation of the postnatal immune system.


2019
Vol 10 (1)
Author(s): Natalie M. Clark, Eli Buckner, Adam P. Fisher, Emily C. Nelson, Thomas T. Nguyen, ...

Abstract: Stem cells are responsible for generating all of the differentiated cells, tissues, and organs in a multicellular organism and, thus, play a crucial role in cell renewal, regeneration, and organization. A number of stem cell type-specific genes have a known role in stem cell maintenance, identity, and/or division. Yet, how genes expressed across different stem cell types, referred to here as stem-cell-ubiquitous genes, contribute to stem cell regulation is less understood. Here, we find that, in the Arabidopsis root, a stem-cell-ubiquitous gene, TESMIN-LIKE CXC2 (TCX2), controls stem cell division by regulating stem cell-type specific networks. Development of a mathematical model of TCX2 expression allows us to show that TCX2 orchestrates the coordinated division of different stem cell types. Our results highlight that genes expressed across different stem cell types ensure cross-communication among cells, allowing them to divide and develop harmonically together.


2020
Vol 29 (11)
pp. 1922-1932
Author(s): Priyanka Nandakumar, Dongwon Lee, Thomas J Hoffmann, Georg B Ehret, Dan Arking, ...

Abstract: Hundreds of loci have been associated with blood pressure (BP) traits from many genome-wide association studies. We identified an enrichment of these loci in aorta and tibial artery expression quantitative trait loci in our previous work in ~100 000 Genetic Epidemiology Research on Aging study participants. In the present study, we sought to fine-map known loci and identify novel genes by determining putative regulatory regions for these and other tissues relevant to BP. We constructed maps of putative cis-regulatory elements (CREs) using publicly available open chromatin data for the heart, aorta and tibial arteries, and multiple kidney cell types. Variants within these regions may be evaluated quantitatively for their tissue- or cell-type-specific regulatory impact using deltaSVM functional scores, as described in our previous work. We aggregate variants within these putative CREs within 50 Kb of the start or end of ‘expressed’ genes in these tissues or cell types using public expression data and use deltaSVM scores as weights in the group-wise sequence kernel association test to identify candidates. We test for association with both BP traits and expression within these tissues or cell types of interest and identify the candidates MTHFR, C10orf32, CSK, NOV, ULK4, SDCCAG8, SCAMP5, RPP25, HDGFRP3, VPS37B and PPCDC. Additionally, we examined two known QT interval genes, SCN5A and NOS1AP, in the Atherosclerosis Risk in Communities Study, as a positive control, and observed the expected heart-specific effect. Thus, our method identifies variants and genes for further functional testing using tissue- or cell-type-specific putative regulatory information.
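
To illustrate how per-variant functional scores can serve as weights in a kernel association test, here is a minimal sketch of the unadjusted SKAT score statistic Q. It rests on several assumptions that are not taken from the abstract: an intercept-only null model with no covariates, the convention in which weights enter the statistic squared, and toy genotype, phenotype, and weight values that are purely hypothetical. It does not compute a p-value and should not be read as the authors' pipeline.

```python
def skat_q(genotypes, phenotypes, weights):
    """Weighted SKAT score statistic under an intercept-only null model.

    genotypes:  one row per individual, one allele count (0/1/2) per variant
    phenotypes: one quantitative trait value per individual
    weights:    one weight per variant (here standing in for deltaSVM scores)

    Q = sum_j (w_j * S_j)^2, where S_j = sum_i G_ij * (y_i - mean(y)).
    """
    n = len(phenotypes)
    if n == 0 or len(genotypes) != n:
        raise ValueError("genotypes and phenotypes must have the same non-zero length")
    mean_y = sum(phenotypes) / n
    residuals = [y - mean_y for y in phenotypes]  # null-model residuals
    q = 0.0
    for j, w in enumerate(weights):
        # Score for variant j: genotype-weighted sum of residuals.
        s_j = sum(genotypes[i][j] * residuals[i] for i in range(n))
        q += (w * s_j) ** 2
    return q


# Hypothetical toy data: 4 individuals, 2 variants in one gene's putative CREs.
toy_genotypes = [[0, 1], [1, 0], [2, 1], [0, 0]]
toy_phenotypes = [1.0, 2.0, 4.0, 1.0]
toy_weights = [0.5, 2.0]  # illustrative values, not real deltaSVM scores
print(skat_q(toy_genotypes, toy_phenotypes, toy_weights))
```

Because each weight is squared, the sign of a score does not change Q in this form; only its magnitude up-weights or down-weights a variant. In practice, significance is assessed against a mixture of chi-square distributions, which this sketch leaves out.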


2019
Author(s): Priyanka Nandakumar, Dongwon Lee, Thomas J. Hoffmann, Georg B. Ehret, Dan Arking, ...

Abstract: Hundreds of loci have been associated with blood pressure traits from many genome-wide association studies. We identified an enrichment of these loci in aorta and tibial artery expression quantitative trait loci in our previous work in ∼100,000 Genetic Epidemiology Research on Aging (GERA) study participants. In the present study, we subsequently focused on determining putative regulatory regions for these and other tissues of relevance to blood pressure, to both fine-map these loci by pinpointing genes and variants of functional interest within them, and to identify any novel genes. We constructed maps of putative cis-regulatory elements using publicly available open chromatin data for the heart, aorta and tibial arteries, and multiple kidney cell types. Sequence variants within these regions may be evaluated quantitatively for their tissue- or cell-type-specific regulatory impact using deltaSVM functional scores, as described in our previous work. In order to identify genes of interest, we aggregate these variants in these putative cis-regulatory elements within 50Kb of the start or end of genes considered as “expressed” in these tissues or cell types using publicly available gene expression data, and use the deltaSVM scores as weights in the well-known group-wise sequence kernel association test (SKAT). We test for association with both blood pressure traits as well as expression within these tissues or cell types of interest, and identify several genes, including MTHFR, C10orf32, CSK, NOV, ULK4, SDCCAG8, SCAMP5, RPP25, HDGFRP3, VPS37B, and PPCDC. Although our study centers on blood pressure traits, we additionally examined two known genes, SCN5A and NOS1AP, involved in the cardiac trait QT interval, in the Atherosclerosis Risk in Communities Study (ARIC), as a positive control, and observed an expected heart-specific effect. Thus, our method may be used to identify variants and genes for further functional testing using tissue- or cell-type-specific putative regulatory information.

Author Summary: Sequence changes in genes (“variants”) are linked to the presence and severity of different traits or diseases. However, as genes may be expressed in different tissues and at different times and degrees, using this information is expected to more accurately identify genes of interest. Variants within the genes are essential, but so are variants in the sequences (“regulatory elements”) that control the genes’ expression in different tissues or cell types. In this study, we aim to use this information about expression and variants potentially involved in gene expression regulation to better pinpoint genes and variants in regulatory elements of interest for blood pressure regulation. We do so by taking advantage of such data that are publicly available, and use methods to combine information about variants in aggregate within a gene’s putative regulatory elements in tissues thought to be relevant for blood pressure, and identify several genes, meant to enable experimental follow-up.


Author(s): Zhong Wang, Alexandra G. Chivu, Lauren A. Choate, Edward J. Rice, Donald C. Miller, ...

Abstract: We trained a sensitive machine learning tool to infer the distribution of histone marks using maps of nascent transcription. Transcription captured the variation in active histone marks and complex chromatin states, like bivalent promoters, down to single-nucleosome resolution and at an accuracy that rivaled the correspondence between independent ChIP-seq experiments. The relationship between active histone marks and transcription was conserved in all cell types examined, allowing individual labs to annotate active functional elements in mammals with similar richness as major consortia. Using imputation as an interpretative tool uncovered cell-type specific differences in how the PRC2-dependent repressive mark, H3K27me3, corresponds to transcription, and revealed that transcription initiation requires both chromatin accessibility and an active chromatin environment, demonstrating that initiation is less promiscuous than previously thought.


2019
Author(s): Qi Song, Jiyoung Lee, Shamima Akter, Ruth Grene, Song Li

Abstract: Recent advances in genomic technologies have generated large-scale protein-DNA interaction data and open chromatin regions for multiple plant species. To predict condition-specific gene regulatory networks using these data, we developed the Condition Specific Regulatory network inference engine (ConSReg), which combines heterogeneous genomic data using a sparse linear model followed by feature selection and stability selection to select key regulatory genes. Using Arabidopsis as a model system, we constructed maps of gene regulation under more than 50 experimental conditions including abiotic stresses, cell type-specific expression, and stress responses in individual cell types. Our results show that ConSReg accurately predicted gene expression (average auROC of 0.84) across multiple testing datasets. We found that (1) including open chromatin information from ATAC-seq data significantly improves the performance of ConSReg across all tested datasets; (2) choice of negative training samples and length of promoter regions are two key factors that affect model performance. We applied ConSReg to Arabidopsis single cell RNA-seq data of two root cell types (endodermis and cortex) and identified five regulators in these two root cell types. Four out of the five regulators have additional experimental evidence to support their roles in regulating gene expression in Arabidopsis roots. By comparing regulatory maps in abiotic stress responses and cell type-specific experiments, we revealed that transcription factors that regulate tissue-level abiotic stresses tend to also regulate stress responses in individual cell types in plants.


2021
Author(s): Risa Karakida Kawaguchi, Ziqi Tang, Stephan Fischer, Rohit Tripathy, Peter K. Koo, ...

Background: Single-cell Assay for Transposase Accessible Chromatin using sequencing (scATAC-seq) measures genome-wide chromatin accessibility for the discovery of cell-type specific regulatory networks. ScATAC-seq combined with single-cell RNA sequencing (scRNA-seq) offers important avenues for ongoing research, such as novel cell-type specific activation of enhancer and transcription factor binding sites as well as chromatin changes specific to cell states. On the other hand, scATAC-seq data is known to be challenging to interpret due to its high number of zeros as well as the heterogeneity derived from different protocols. Because of the stochastic lack of marker gene activities, cell type identification by scATAC-seq remains difficult even at a cluster level. Results: In this study, we exploit reference knowledge obtained from external scATAC-seq or scRNA-seq datasets to define existing cell types and uncover the genomic regions which drive cell-type specific gene regulation. To investigate the robustness of existing cell-typing methods, we collected 7 scATAC-seq datasets targeting mouse brain for a meta-analytic comparison of neuronal cell-type annotation, including a reference atlas generated by the BRAIN Initiative Cell Census Network (BICCN). By comparing the area under the receiver operating characteristics curves (AUROCs) for the three major cell types (inhibitory, excitatory, and non-neuronal cells), cell-typing performance by single markers is found to be highly variable even for known marker genes due to study-specific biases. However, the signal aggregation of a large and redundant marker gene set, optimized via multiple scRNA-seq data, achieves the highest cell-typing performances among 5 existing marker gene sets, from the individual cell to cluster level. That gene set also shows a high consistency with the cluster-specific genes from inhibitory subtypes in two well-annotated datasets, suggesting applicability to rare cell types. 
Next, we demonstrate a comprehensive assessment of scATAC-seq cell typing using exhaustive combinations of the marker gene sets with supervised learning methods including machine learning classifiers and joint clustering methods. Our results show that the combinations using robust marker gene sets systematically ranked at the top, not only with model based prediction using a large reference data but also with a simple summation of expression strengths across markers. To demonstrate the utility of this robust cell typing approach, we trained a deep neural network to predict chromatin accessibility in each subtype using only DNA sequence. Through model interpretation methods, we identify key motifs enriched about robust gene sets for each neuronal subtype. Conclusions: Through the meta-analytic evaluation of scATAC-seq cell-typing methods, we develop a novel method set to exploit the BICCN reference atlas. Our study strongly supports the value of robust marker gene selection as a feature selection tool and cross-dataset comparison between scATAC-seq datasets to improve alignment of scATAC-seq to known biology. With this novel, high quality epigenetic data, genomic analysis of regulatory regions can reveal sequence motifs that drive cell type-specific regulatory programs.
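
The comparison between single markers and an aggregated marker set can be sketched in a few lines. The example below is a minimal illustration, not the authors' code: it scores each cell by summing gene activity across a marker set and computes AUROC with the rank-based (Mann-Whitney) formulation. The marker names, activity values, and labels are hypothetical toy data chosen only to make the calculation easy to follow.

```python
def aggregate_marker_score(cell_activity, marker_genes):
    """Sum gene activity over a marker set for one cell; absent genes count as 0."""
    return sum(cell_activity.get(gene, 0.0) for gene in marker_genes)


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative cell contribute 0.5.
    """
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative cell")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: per-cell gene activity and whether the cell is inhibitory.
toy_markers = ["Gad1", "Gad2", "Slc32a1"]
toy_cells = [
    {"Gad1": 2, "Gad2": 0, "Slc32a1": 1},  # inhibitory
    {"Gad1": 0, "Gad2": 1, "Slc32a1": 1},  # inhibitory, Gad1 dropped out
    {"Gad1": 1, "Gad2": 0, "Slc32a1": 0},  # other
    {"Gad1": 0, "Gad2": 0},                # other
]
toy_labels = [True, True, False, False]

single_scores = [cell.get("Gad1", 0.0) for cell in toy_cells]
set_scores = [aggregate_marker_score(cell, toy_markers) for cell in toy_cells]
print(auroc(single_scores, toy_labels), auroc(set_scores, toy_labels))
```

In this toy case the single marker is hurt by a dropout in one inhibitory cell, while the summed set still separates the two groups, which is the intuition behind aggregating a redundant marker set on sparse scATAC-seq data.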


2019
Author(s): Sylvia Hilliard, Renfang Song, Hongbing Liu, Chao-hui Chen, Yuwen Li, ...

Abstract: Six2+ cap mesenchyme cells, also called nephron progenitor cells (NPC), are precursors of all epithelial cell types of the nephron, the filtering unit of the kidney. Current evidence indicates that perinatal “old” NPC have a greater tendency to exit the progenitor niche and differentiate into nascent nephrons than their embryonic “young” counterparts. Understanding the underpinnings of NPC aging may offer insights to rejuvenate old NPC and expand the progenitor pool. Here, we compared the chromatin landscape of young and old NPC and found common features reflecting their shared lineage but also intrinsic differences in chromatin accessibility and enhancer landscape, supporting the view that old NPC are epigenetically poised for differentiation. Annotation of open chromatin regions and active enhancers uncovered the transcription factor Bach2 as a potential link between the pro-renewal MAPK/AP1 and pro-differentiation Six2/b-catenin pathways that might be of critical importance in regulation of NPC fate. Our data provide the first glimpse of the dynamic chromatin landscape of NPC and serve as a platform for future studies of the impact of genetic or environmental perturbations on the epigenome of NPC.

Summary statement: Hilliard et al. investigated the chromatin landscape of native Six2+ nephron progenitors across their lifespan. They identified age-dependent changes in accessible chromatin and regulatory regions, supporting the view that old nephron progenitors are epigenetically poised for differentiation.


2021
Vol 14 (1)
Author(s): Yujuan Gui, Kamil Grzyb, Mélanie H. Thomas, Jochen Ohnmacht, Pierre Garcia, ...

Background: Cell types in ventral midbrain are involved in diseases with variable genetic susceptibility, such as Parkinson’s disease and schizophrenia. Many genetic variants affect regulatory regions and alter gene expression in a cell-type-specific manner depending on the chromatin structure and accessibility. Results: We report 20,658 single-nuclei chromatin accessibility profiles of ventral midbrain from two genetically and phenotypically distinct mouse strains. We distinguish ten cell types based on chromatin profiles, and analysis of accessible regions controlling cell identity genes highlights cell-type-specific key transcription factors. Regulatory variation segregating the mouse strains manifests more on transcriptome than chromatin level. However, cell-type-level data reveals changes not captured at tissue level. To discover the scope and cell-type specificity of cis-acting variation in midbrain gene expression, we identify putative regulatory variants and show them to be enriched at differentially expressed loci. Finally, we find TCF7L2 to mediate trans-acting variation selectively in midbrain neurons. Conclusions: Our data set provides an extensive resource to study gene regulation in mesencephalon, provides insights into control of cell identity in the midbrain, and identifies cell-type-specific regulatory variation possibly underlying phenotypic and behavioural differences between mouse strains.

