Accurate promoter and enhancer identification in 127 ENCODE and Roadmap Epigenomics cell types and tissues by GenoSTAN

10.1101/041020 ◽

2016 ◽

Cited By ~ 2

Author(s):

Benedikt Zacher ◽

Margaux Michel ◽

Björn Schwalb ◽

Patrick Cramer ◽

Achim Tresch ◽

...

Keyword(s):

Transcription Factor ◽

Genetic Variants ◽

Hidden Markov ◽

Data Distribution ◽

Complex Trait ◽

Cell Types ◽

Regulatory Elements ◽

Future Research ◽

DNA Accessibility ◽

Enhancer Identification

Abstract: Accurate maps of promoters and enhancers are required for understanding transcriptional regulation. Promoters and enhancers are usually mapped by integration of chromatin assays charting histone modifications, DNA accessibility, and transcription factor binding. However, current algorithms are limited by unrealistic data distribution assumptions. Here we propose GenoSTAN (Genomic STate ANnotation), a hidden Markov model overcoming these limitations. We map promoters and enhancers for 127 cell types and tissues from the ENCODE and Roadmap Epigenomics projects, today’s largest compendium of chromatin assays. Extensive benchmarks demonstrate that GenoSTAN consistently identifies promoters and enhancers with significantly higher accuracy than previous methods. Moreover, GenoSTAN-derived promoters and enhancers showed significantly higher enrichment of complex trait-associated genetic variants than current annotations. Altogether, GenoSTAN provides an easy-to-use tool to define promoters and enhancers in any system, and our annotation of human transcriptional cis-regulatory elements constitutes a rich resource for future research in biology and medicine.
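
As a rough illustration of the hidden Markov model idea mentioned in this abstract, the sketch below decodes a sequence of per-bin read counts into two chromatin states with the Viterbi algorithm. It is not GenoSTAN's implementation: the plain Poisson emission model, the two states, the toy counts, and all function and variable names are assumptions made for this example.

```python
import math

def poisson_logpmf(k, lam):
    # log P(K = k) for a Poisson distribution with rate lam
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def viterbi(counts, rates, log_trans, log_init):
    """Return the most likely state path for a sequence of read counts."""
    n_states = len(rates)
    # score[s] = best log-probability of any path ending in state s
    score = [log_init[s] + poisson_logpmf(counts[0], rates[s])
             for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for k in counts[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s]
                         + poisson_logpmf(k, rates[s]))
        back.append(pointers)
    # Trace the best path backwards from the best final state
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical toy data: state 0 = background, state 1 = active element
counts = [0, 1, 0, 9, 12, 10, 1, 0]
rates = [1.0, 10.0]  # assumed mean read count per state
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
log_init = [math.log(0.5), math.log(0.5)]
path = viterbi(counts, rates, log_trans, log_init)
print(path)
```

In this toy run the three high-count bins are assigned to the active state. A real annotation would fit the emission and transition parameters to several chromatin tracks at once rather than fixing them by hand.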

A human cell atlas of fetal chromatin accessibility

Science ◽

10.1126/science.aba7612 ◽

2020 ◽

Vol 370 (6518) ◽

pp. eaba7612 ◽

Cited By ~ 1

Author(s):

Silvia Domcke ◽

Andrew J. Hill ◽

Riza M. Daza ◽

Junyue Cao ◽

Diana R. O’Day ◽

...

Keyword(s):

Gene Expression ◽

Human Cell ◽

Single Cells ◽

Complex Trait ◽

Cell Types ◽

Regulatory Elements ◽

Chromatin Accessibility ◽

Cell Type ◽

Cell Type Specific

The chromatin landscape underlying the specification of human cell types is of fundamental interest. We generated human cell atlases of chromatin accessibility and gene expression in fetal tissues. For chromatin accessibility, we devised a three-level combinatorial indexing assay and applied it to 53 samples representing 15 organs, profiling ~800,000 single cells. We leveraged cell types defined by gene expression to annotate these data and cataloged hundreds of thousands of candidate regulatory elements that exhibit cell type–specific chromatin accessibility. We investigated the properties of lineage-specific transcription factors (such as POU2F1 in neurons), organ-specific specializations of broadly distributed cell types (such as blood and endothelial), and cell type–specific enrichments of complex trait heritability. These data represent a rich resource for the exploration of in vivo human gene regulation in diverse tissues and cell types.

Basic-Zipper-Type Transcription Factor FlbB Controls Asexual Development in Aspergillus nidulans

Eukaryotic Cell ◽

10.1128/ec.00207-07 ◽

2007 ◽

Vol 7 (1) ◽

pp. 38-48 ◽

Cited By ~ 71

Author(s):

Oier Etxebeste ◽

Min Ni ◽

Aitor Garzia ◽

Nak-Jung Kwon ◽

Reinhard Fischer ◽

...

Keyword(s):

Transcription Factor ◽

Nuclear Import ◽

Cell Types ◽

Regulatory Elements ◽

Asexual Development ◽

Quantitative Expression ◽

Functional Relations ◽

Asexual Spore ◽

Hyphal Apex ◽

Fungal Colony

ABSTRACT The fungal colony is a complex multicellular unit consisting of various cell types and functions. Asexual spore formation (conidiation) is integrated through sensory and regulatory elements into the general morphogenetic plan, in which the activation of the transcription factor BrlA is the first determining step. A number of early regulatory elements acting upstream of BrlA (fluG and flbA-E) have been identified, but their functional relations remain to be further investigated. In this report we describe FlbB as a putative basic-zipper-type transcription factor restricted to filamentous fungi. FlbB accumulates at the hyphal apex during early vegetative growth but is later found in apical nuclei, suggesting that an activating modification triggers nuclear import. Moreover, proper temporal and quantitative expression of FlbB is a prerequisite for brlA transcription, and mis-scheduled overexpression inhibits conidiation. We also present evidence that FlbB activation results in the production of a second diffusible signal, acting downstream from the FluG factor, to induce conidiation.

Genetic determinants of chromatin accessibility in T cell activation across humans

10.1101/090241 ◽

2016 ◽

Cited By ~ 2

Author(s):

Rachel E. Gate ◽

Christine S. Cheng ◽

Aviva P. Aiden ◽

Atsede Siba ◽

Marcin Tabaka ◽

...

Keyword(s):

Gene Expression ◽

T Cells ◽

T Cell ◽

T Cell Activation ◽

Genetic Variants ◽

Cell Activation ◽

Cell Types ◽

Regulatory Elements ◽

Specific Cell ◽

Accessible Chromatin

Abstract: Over 90% of genetic variants associated with complex human traits map to non-coding regions, but little is understood about how they modulate gene regulation in health and disease. One possible mechanism is that genetic variants affect the activity of one or more cis-regulatory elements leading to gene expression variation in specific cell types. To identify such cases, we analyzed Assay for Transposase-Accessible Chromatin sequencing (ATAC-seq) and RNA-seq profiles from activated CD4+ T cells of up to 105 healthy donors. We found that regions of accessible chromatin (ATAC-peaks) are co-accessible at kilobase and megabase resolution, in patterns consistent with the 3D organization of chromosomes measured by in situ Hi-C in T cells. 15% of genetic variants located within ATAC-peaks affected the accessibility of the corresponding peak through disrupting binding sites for transcription factors important for T cell differentiation and activation. These ATAC quantitative trait nucleotides (ATAC-QTNs) have the largest effects on co-accessible peaks, are associated with gene expression from the same aliquot of cells, rarely affect core binding motifs, and are enriched for autoimmune disease variants. Our results provide insights into how natural genetic variants modulate cis-regulatory elements, in isolation or in concert, to influence gene expression in primary immune cells that play a key role in many human diseases.

Addiction-associated genetic variants implicate brain cell type- and region-specific cis-regulatory elements in addiction neurobiology

10.1101/2020.09.29.318329 ◽

2020 ◽

Cited By ~ 1

Author(s):

Chaitanya Srinivasan ◽

BaDoi N. Phan ◽

Alyssa J. Lawler ◽

Easwaran Ramamurthy ◽

Michael Kleyman ◽

...

Keyword(s):

Genetic Variants ◽

Cell Types ◽

Regulatory Elements ◽

Brain Regions ◽

Open Chromatin ◽

Genome Wide Association Studies ◽

Cell Type ◽

Coding Regions ◽

Cell Type Specific ◽

The Impact

ABSTRACT Recent large genome-wide association studies (GWAS) have identified multiple confident risk loci linked to addiction-associated behavioral traits. Genetic variants linked to addiction-associated traits lie largely in non-coding regions of the genome, likely disrupting cis-regulatory element (CRE) function. CREs tend to be highly cell type-specific and may contribute to the functional development of the neural circuits underlying addiction. Yet, a systematic approach for predicting the impact of risk variants on the CREs of specific cell populations is lacking. To dissect the cell types and brain regions underlying addiction-associated traits, we applied LD score regression to compare GWAS to genomic regions collected from human and mouse assays for open chromatin, which is associated with CRE activity. We found enrichment of addiction-associated variants in putative regulatory elements marked by open chromatin in neuronal (NeuN+) nuclei collected from multiple prefrontal cortical areas and striatal regions known to play major roles in reward and addiction. To further dissect the cell type-specific basis of addiction-associated traits, we also identified enrichments in human orthologs of open chromatin regions of mouse neuron subtypes: cortical excitatory, PV, D1, and D2. Lastly, we developed machine learning models from mouse cell type-specific regions of open chromatin to further dissect human NeuN+ open chromatin regions into cortical excitatory or striatal D1 and D2 neurons and predict the functional impact of addiction-associated genetic variants. Our results suggest that different neuron subtypes within the reward system play distinct roles in the variety of traits that contribute to addiction.

Significance Statement: Our study on cell types and brain regions contributing to heritability of addiction-associated traits suggests that the conserved non-coding regions within cortical excitatory and striatal medium spiny neurons contribute to genetic predisposition for nicotine, alcohol, and cannabis use behaviors. This computational framework can flexibly integrate epigenomic data across species to screen for putative causal variants in a cell type- and tissue-specific manner across numerous complex traits.

ATAC-STARR-seq v1

10.17504/protocols.io.b2nqqddw ◽

2021 ◽

Author(s):

Tyler Hansen ◽

Emily Hodges

Keyword(s):

Transcription Factor ◽

Control Cell ◽

Regulatory Region ◽

Active Regions ◽

Cell Types ◽

Regulatory Elements ◽

Transcription Factor Binding ◽

Specific Gene ◽

Factor Binding ◽

Accessible Chromatin

Transcriptional enhancers control cell-type specific gene expression in humans, and their dysfunction can lead to debilitating diseases, including cancer. Identifying bona fide enhancers is difficult due to a lack of spatial or sequence constraints. In addition, only a small percentage of the genome is accessible in mature cell types; therefore, most enhancers are inactive due to their chromatin context rather than intrinsic properties of the DNA sequence itself. For this reason, we decided to assay regulatory activity exclusively within accessible chromatin. To do this, we combined assay for transposase-accessible chromatin using sequencing (ATAC-seq) with self-transcribing active regulatory region sequencing (STARR-seq); we call this method ATAC-STARR-seq. With ATAC-STARR-seq, we identify both active and silent regulatory elements in GM12878 B cells; these active and silent elements are enriched for transcription factor motifs and histone modifications associated with activating and repressing regulation, respectively. We also show that ATAC-STARR-seq quantifies chromatin accessibility and transcription factor binding. We integrate this information and subset active regions based on transcription factor binding profiles. Depending on the transcription factors bound, subsets are enriched for distinct Reactome pathways. Altogether, this highlights the power of ATAC-STARR-seq to investigate the transcriptional regulatory landscape of the human genome.

IMPACT: Genomic annotation of cell-state-specific regulatory elements inferred from the epigenome of bound transcription factors

10.1101/366864 ◽

2018 ◽

Cited By ~ 1

Author(s):

Tiffany Amariuta ◽

Yang Luo ◽

Steven Gazal ◽

Emma E. Davenport ◽

Bryce van de Geijn ◽

...

Keyword(s):

Transcription Factors ◽

RNA Polymerase II ◽

Complex Trait ◽

Cell Types ◽

Regulatory Elements ◽

Open Chromatin ◽

Value Capture ◽

Cell State ◽

A Genome ◽

Significant Enrichment

Despite significant progress in annotating the genome with experimental methods, much of the regulatory noncoding genome remains poorly defined. Here we assert that regulatory elements may be characterized by leveraging local epigenomic signatures at sites where specific transcription factors (TFs) are bound. To link these two identifying features, we introduce IMPACT, a genome annotation strategy which identifies regulatory elements defined by cell-state-specific TF binding profiles, learned from 515 chromatin and sequence annotations. We validate IMPACT using multiple compelling applications. First, IMPACT predicts TF motif binding with high accuracy (average AUC 0.92, s.e. 0.03; across 8 TFs), a significant improvement (all p<6.9e-15) over intersecting motifs with open chromatin (average AUC 0.66, s.e. 0.11). Second, an IMPACT annotation trained on RNA polymerase II is more enriched for peripheral blood cis-eQTL variation (N=3,754) than sequence-based annotations, such as promoters and regions around the TSS (permutation p<1e-3, 25% average increase in enrichment). Third, integration with rheumatoid arthritis (RA) summary statistics from European (N=38,242) and East Asian (N=22,515) populations revealed that the top 5% of CD4+ Treg IMPACT regulatory elements capture 85.7% (s.e. 19.4%) of RA h² (p<1.6e-5) and that the top 9.8% of Treg IMPACT regulatory elements, consisting of all SNPs with a non-zero annotation value, capture 97.3% (s.e. 18.2%) of RA h² (p<7.6e-7), the most comprehensive explanation for RA h² to date. In comparison, the average RA h² captured by compared CD4+ T histone marks is 42.3% and by CD4+ T specifically expressed gene sets is 36.4%. Finally, integration with RA fine-mapping data (N=27,345) revealed a significant enrichment (2.87, p<8.6e-3) of putatively causal variants across 20 RA associated loci in the top 1% of CD4+ Treg IMPACT regulatory regions. Overall, we find that IMPACT generalizes well to other cell types in identifying complex trait associated regulatory elements.
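
The AUC values quoted in this abstract measure how well a score separates bound from unbound motif sites. As a minimal sketch of that metric only (not of IMPACT itself), the function below computes the AUC as the fraction of positive-negative pairs in which the positive example receives the higher score, counting ties as one half. The labels, scores, and the function name are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC = P(score of a random positive > score of a random negative)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = motif site bound by the TF, 0 = unbound
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of sites, which is fine for illustration; a rank-based formulation is the usual choice for genome-scale data.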

Disease-associated genetic variants in the regulatory regions of human genes: mechanisms of action on transcription and genomic resources for dissecting these mechanisms

Vavilov Journal of Genetics and Breeding ◽

10.18699/vj21.003 ◽

2021 ◽

Vol 25 (1) ◽

pp. 18-29

Author(s):

E. V. Ignatieva ◽

E. A. Matrosova

Keyword(s):

Gene Expression ◽

Transcription Factor ◽

Genetic Variants ◽

Functional Activity ◽

Binding Sites ◽

Regulatory Elements ◽

Transcription Factor Binding ◽

Omics Technologies ◽

Noncoding Regions ◽

Factor Binding

Whole genome and whole exome sequencing technologies play a very important role in studies of the genetic aspects of the pathogenesis of various diseases. The extensive use of genome-wide and exome-wide association study methodology (GWAS and EWAS) has made it possible to identify a large number of genetic variants associated with diseases. This information is accumulated in databases such as GWAS Central, the GWAS Catalog, OMIM, ClinVar, etc. Most of the variants identified by the GWAS technique are located in the noncoding regions of the human genome. According to the ENCODE project, the fraction of regions in the human genome potentially involved in transcriptional control is many times greater than the fraction of coding regions. Thus, genetic variation in noncoding regions of the genome can increase the susceptibility to diseases by disrupting various regulatory elements (promoters, enhancers, silencers, insulator regions, etc.). However, identifying the mechanisms by which pathogenic genetic variants influence disease risk is difficult due to the wide variety of regulatory elements. The present review focuses on the molecular genetic mechanisms by which pathogenic genetic variants affect gene expression. Attention is concentrated on the transcriptional level of regulation as the initial step in the expression of any gene. A triggering event mediating the effect of a pathogenic genetic variant on the level of gene expression can be, for example, a change in the functional activity of transcription factor binding sites (TFBSs) or a change in DNA methylation, which, in turn, affects the functional activity of promoters or enhancers. Dissecting the regulatory roles of polymorphic loci is impossible without close integration of modern experimental approaches with computer analysis of a growing wealth of genetic and biological data obtained using omics technologies. The review provides a brief description of a number of the most well-known public genomic information resources containing data obtained using omics technologies, including (1) resources that accumulate data on chromatin states and regions of transcription factor binding derived from ChIP-seq experiments; (2) resources containing data on genomic loci for which allele-specific transcription factor binding was revealed based on ChIP-seq technology; (3) resources containing in silico predicted data on the potential impact of genetic variants on transcription factor binding sites.

Integrative single-cell analysis by transcriptional and epigenetic states in human adult brain

10.1101/128520 ◽

2017 ◽

Cited By ~ 4

Author(s):

Blue B. Lake ◽

Song Chen ◽

Brandon C. Sos ◽

Jean Fan ◽

Yun Yung ◽

...

Keyword(s):

Human Brain ◽

Single Cell Analysis ◽

Single Cells ◽

Neuronal Cell ◽

Cell Types ◽

Regulatory Elements ◽

Molecular State ◽

Adult Brain ◽

DNA Accessibility ◽

Human Adult

Abstract: Detailed characterization of the cell types comprising the highly complex human brain is essential to understanding its function. Such tasks require highly scalable experimental approaches to examine different aspects of the molecular state of individual cells, as well as the computational integration to produce unified cell state annotations. Here we report the development of two highly scalable methods (snDrop-Seq and scTHS-Seq) that we have used to acquire nuclear transcriptome and DNA accessibility maps for thousands of single cells from the human adult visual and frontal cortex. This has led to the best-resolved human neuronal subtypes to date, identification of a majority of the non-neuronal cell types, as well as the cell-type specific nuclear transcriptome and DNA accessibility maps. Integrative analysis allowed us to identify transcription factors and regulatory elements shaping the state of different brain cell types, and to map genetic risk factors of common human brain diseases to specific pathogenic cell types and subtypes.

Shared activity patterns arising at genetic susceptibility loci reveal underlying genomic and cellular architecture of human disease

10.1101/095349 ◽

2016 ◽

Cited By ~ 2

Author(s):

J. Kenneth Baillie ◽

Andrew Bretherick ◽

Christopher S. Haley ◽

Sara Clohisey ◽

Alan Gray ◽

...

Keyword(s):

Transcriptional Regulation ◽

Fine Mapping ◽

Genetic Variants ◽

Complex Traits ◽

Expression Profiles ◽

Activity Patterns ◽

Cell Types ◽

Regulatory Elements ◽

Disease Pathogenesis ◽

Regulatory Regions

Abstract: Genetic variants underlying complex traits, including disease susceptibility, are enriched within the transcriptional regulatory elements, promoters and enhancers. There is emerging evidence that regulatory elements associated with particular traits or diseases share patterns of transcriptional regulation. Accordingly, shared transcriptional regulation (coexpression) may help prioritise loci associated with a given trait, and help to identify the biological processes underlying it. Using cap analysis of gene expression (CAGE) profiles of promoter and enhancer-derived RNAs across 1824 human samples, we have quantified coexpression of RNAs originating from trait-associated regulatory regions using a novel analytical method (network density analysis; NDA). For most traits studied, sequence variants in regulatory regions were linked to tightly coexpressed networks that are likely to share important functional characteristics. These networks implicate particular cell types and tissues in disease pathogenesis; for example, variants associated with ulcerative colitis are linked to expression in gut tissue, whereas Crohn’s disease variants are restricted to immune cells. We show that this coexpression signal provides additional independent information for fine mapping likely causative variants. This approach identifies additional genetic variants associated with specific traits, including an association between the regulation of the OCT1 cation transporter and genetic variants underlying circulating cholesterol levels. This approach enables a deeper biological understanding of the causal basis of complex traits.

One Sentence Summary: We discover that variants associated with a specific disease share expression profiles across tissues and cell types, enabling fine mapping and identification of new disease-associated variants, illuminating key cell types involved in disease pathogenesis.

Ovarian Cancer Risk Variants are Enriched in Histotype-Specific Enhancers that Disrupt Transcription Factor Binding Sites

10.1101/2020.02.21.960468 ◽

2020 ◽

Author(s):

Michelle R. Jones ◽

Pei-Chen Peng ◽

Simon G. Coetzee ◽

Jonathan Tyrer ◽

Alberto L. Reyes ◽

...

Keyword(s):

Ovarian Cancer ◽

Transcription Factor ◽

Binding Sites ◽

Transcription Factor Binding Sites ◽

Cell Types ◽

Regulatory Elements ◽

P Value ◽

Factor Binding ◽

Risk Variants ◽

Risk SNPs

Abstract: Quantifying the functional effects of complex disease risk variants can provide insights into mechanisms underlying disease biology. Genome-wide association studies (GWAS) have identified 39 regions associated with risk of epithelial ovarian cancer (EOC). The vast majority of these variants lie in the non-coding genome, suggesting they mediate their function through the regulation of gene expression by their interaction with tissue-specific regulatory elements (REs). In this study, by intersecting germline genetic risk data with regulatory landscapes of active chromatin in ovarian cancers and their precursor cell types, we first estimated the heritability explained by known common low penetrance risk alleles. The narrow sense heritability of both EOC overall and high grade serous ovarian cancer (HGSOC) was estimated to be 5-6%. Partitioned SNP-heritability across broad functional categories indicated a significant contribution of regulatory elements to EOC heritability. We collated epigenomic profiling data for 77 cell and tissue types from public resources (Roadmap Epigenomics and ENCODE), and H3K27Ac ChIP-Seq data generated in 26 ovarian cancer-relevant cell types. We identified significant enrichment of risk SNPs in active REs marked by H3K27Ac in HGSOCs. To further investigate how risk SNPs in active REs influence predisposition to ovarian cancer, we used motifbreakR to predict the disruption of transcription factor binding sites. We identified 469 candidate causal risk variants in H3K27Ac peaks that break TF motifs (enrichment P-Value < 1×10⁻⁵ compared to control variants). The most frequently broken motif was REST (P-Value = 0.0028), which has been reported as both a tumor suppressor and an oncogene. These systematic functional annotations with epigenomic data highlight the specificity of the regulatory landscape and demonstrate that functional annotation of germline risk variants is most informative when performed in highly relevant cell types.
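
The abstract reports enrichment of risk SNPs in active regulatory elements but does not say which statistical test was used. As one common way to frame such a question, the sketch below computes a one-sided hypergeometric p-value: the probability of seeing at least the observed number of risk variants inside an annotation if risk variants were drawn at random from all tested variants. The choice of test, the counts, and the function name are assumptions made for illustration and are not taken from the study.

```python
from math import comb

def hypergeom_enrichment_p(total, in_annotation, risk, risk_in_annotation):
    """One-sided p-value P(X >= risk_in_annotation).

    total              -- all variants tested
    in_annotation      -- variants overlapping the annotation (e.g. peaks)
    risk               -- risk variants among all tested variants
    risk_in_annotation -- risk variants that overlap the annotation
    """
    upper = min(risk, in_annotation)
    denom = comb(total, risk)
    # Sum the hypergeometric probabilities over the upper tail
    tail = sum(
        comb(in_annotation, k) * comb(total - in_annotation, risk - k)
        for k in range(risk_in_annotation, upper + 1)
    )
    return tail / denom

# Hypothetical counts: 1000 variants, 100 in peaks, 20 risk variants, 8 of them in peaks
p_value = hypergeom_enrichment_p(1000, 100, 20, 8)
print(p_value)
```

With only 10% of variants in peaks, two of the 20 risk variants would be expected there by chance, so observing eight yields a small p-value. A test like this ignores linkage disequilibrium between variants, which dedicated enrichment methods account for.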
