HOT or not: examining the basis of high-occupancy target regions

10.1101/107680 ◽

2017 ◽

Cited By ~ 3

Author(s):

Katarzyna Wreczycka ◽

Vedran Franke ◽

Bora Uyar ◽

Ricardo Wurmus ◽

Altuna Akalin

Keyword(s):

Transcription Factor ◽

Transcription Factors ◽

Binding Sites ◽

Cell Types ◽

Data Sets ◽

Gene Promoters ◽

Quadruplex DNA ◽

Golden Standard ◽

Multiple Cell ◽

Multiple Species

Abstract: High-occupancy target (HOT) regions are segments of the genome with an unusually high number of transcription factor binding sites. These regions are observed in multiple species and are thought to have biological importance due to their high transcription factor occupancy. Furthermore, they coincide with housekeeping gene promoters, and the associated genes are stably expressed across multiple cell types. Despite these features, HOT regions are defined solely using ChIP-seq experiments and have been shown to lack canonical motifs for the transcription factors thought to be bound there. Although ChIP-seq experiments are the gold standard for finding genome-wide binding sites of a protein, they are not noise-free. Here, we show that HOT regions are likely to be ChIP-seq artifacts and that they are similar to previously proposed “hyper-ChIPable” regions. Using ChIP-seq data sets for knocked-out transcription factors, we demonstrate the presence of false-positive signals on HOT regions. We observe sequence characteristics and genomic features that discriminate HOT regions, such as GC/CpG-rich k-mers and enrichment of RNA-DNA hybrids (R-loops) and DNA tertiary structures (G-quadruplex DNA). The artificial ChIP-seq enrichment on HOT regions could be associated with these discriminatory features. Furthermore, we propose strategies to deal with such artifacts in future ChIP-seq studies.
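The sequence features named in this abstract (GC content, CpG enrichment, k-mer composition) can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract does not describe. The function names are hypothetical, and the CpG measure used here is the commonly used observed/expected ratio, (number of CG dinucleotides × length) / (number of C × number of G), chosen as an assumption for illustration.

```python
from collections import Counter


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    Returns 0.0 when the sequence contains no C or no G, since the
    expected count is then zero and the ratio is undefined.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the true dinucleotide count.
    return seq.count("CG") * len(seq) / (c_count * g_count)


def kmer_counts(seq: str, k: int) -> Counter:
    """Count all overlapping k-mers in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    region = "CGCGGGCCGCGATATCGCG"  # made-up sequence, for illustration only
    print(gc_fraction(region), cpg_obs_exp(region))
    print(kmer_counts(region, 2).most_common(3))
```

Features of this kind could be computed per region and compared between HOT and non-HOT regions; how the authors actually selected discriminatory k-mers is not stated in the abstract.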

Virtual ChIP-seq: predicting transcription factor binding by learning from the transcriptome

10.1101/168419 ◽

2018 ◽

Cited By ~ 10

Author(s):

Mehran Karimzadeh ◽

Michael M. Hoffman

Keyword(s):

Transcription Factor ◽

Transcription Factors ◽

Binding Sites ◽

Cell Types ◽

Transcription Factor Binding ◽

Regulatory Function ◽

Factor Binding ◽

Link Type ◽

Genomic Regions ◽

Factor Sequence

Abstract. Motivation: Identifying transcription factor binding sites is the first step in pinpointing non-coding mutations that disrupt the regulatory function of transcription factors and promote disease. ChIP-seq is the most common method for identifying binding sites, but performing it on patient samples is hampered by the amount of available biological material and the cost of the experiment. Existing methods for computational prediction of regulatory elements primarily predict binding in genomic regions with sequence similarity to known transcription factor sequence preferences. This has limited efficacy since most binding sites do not resemble known transcription factor sequence motifs, and many transcription factors are not even sequence-specific. Results: We developed Virtual ChIP-seq, which predicts binding of individual transcription factors in new cell types using an artificial neural network that integrates ChIP-seq results from other cell types and chromatin accessibility data in the new cell type. Virtual ChIP-seq also uses learned associations between gene expression and transcription factor binding at specific genomic regions. This approach outperforms methods that predict TF binding solely based on sequence preference, predicting binding for 36 transcription factors (Matthews correlation coefficient > 0.3). Availability: The datasets we used for training and validation are available at https://virchip.hoffmanlab.org. We have deposited in Zenodo the current version of our software (http://doi.org/10.5281/zenodo.1066928), datasets (http://doi.org/10.5281/zenodo.823297), predictions for 36 transcription factors on Roadmap Epigenomics cell types (http://doi.org/10.5281/zenodo.1455759), and predictions in Cistrome as well as ENCODE-DREAM in vivo TF Binding Site Prediction Challenge (http://doi.org/10.5281/zenodo.1209308).
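For readers unfamiliar with the metric quoted in this abstract, the Matthews correlation coefficient (MCC) can be computed from the four cells of a binary confusion matrix. The sketch below is illustrative only and is not taken from the Virtual ChIP-seq software. The function name and the convention of returning 0.0 when the denominator is zero are assumptions.

```python
import math


def matthews_corrcoef_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction). When any marginal sum is zero the
    denominator vanishes; this sketch returns 0.0 in that case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Made-up counts: bound/unbound genomic bins, predicted vs. observed.
    print(matthews_corrcoef_from_counts(tp=80, tn=900, fp=40, fn=20))
```

MCC is often preferred over accuracy for binding-site prediction because bound regions are a small minority of the genome, so a classifier that predicts "unbound" everywhere scores high accuracy but an MCC of 0.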

CTCF: the protein, the binding partners, the binding sites and their chromatin loops

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2012.0369 ◽

2013 ◽

Vol 368 (1620) ◽

pp. 20120369 ◽

Cited By ~ 108

Author(s):

Sjoerd Johannes Bastiaan Holwerda ◽

Wouter de Laat

Keyword(s):

Transcription Factor ◽

Transcription Factors ◽

RNA Polymerase II ◽

Binding Sites ◽

CTCF Binding ◽

Chromatin Domain ◽

Gene Promoters ◽

Tissue Specific ◽

Chromatin Loops ◽

Exact Function

CTCF has it all. The transcription factor binds to tens of thousands of genomic sites, some tissue-specific, others ultra-conserved. It can act as a transcriptional activator, repressor and insulator, and it can pause transcription. CTCF binds at chromatin domain boundaries, at enhancers and gene promoters, and inside gene bodies. It can attract many other transcription factors to chromatin, including tissue-specific transcriptional activators, repressors, cohesin and RNA polymerase II, and it forms chromatin loops. Yet, or perhaps therefore, CTCF's exact function at a given genomic site is unpredictable. It appears to be determined by the associated transcription factors, by the location of the binding site relative to the transcriptional start site of a gene, and by the site's engagement in chromatin loops with other CTCF-binding sites, enhancers or gene promoters. Here, we will discuss genome-wide features of CTCF binding events, as well as locus-specific functions of this remarkable transcription factor.

Positional specificity of different transcription factor classes within enhancers

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1804663115 ◽

2018 ◽

Vol 115 (30) ◽

pp. E7222-E7230 ◽

Cited By ~ 30

Author(s):

Sharon R. Grossman ◽

Jesse Engreitz ◽

John P. Ray ◽

Tung H. Nguyen ◽

Nir Hacohen ◽

...

Keyword(s):

Gene Expression ◽

Transcription Factor ◽

Transcription Factors ◽

Binding Sites ◽

Cell Types ◽

Regulatory Sequences ◽

Functional Characteristics ◽

Positional Specificity ◽

Chromatin Remodelers ◽

Positional Bias

Gene expression is controlled by sequence-specific transcription factors (TFs), which bind to regulatory sequences in DNA. TF binding occurs in nucleosome-depleted regions of DNA (NDRs), which generally encompass regions with lengths similar to those protected by nucleosomes. However, less is known about where within these regions specific TFs tend to be found. Here, we characterize the positional bias of inferred binding sites for 103 TFs within ∼500,000 NDRs across 47 cell types. We find that distinct classes of TFs display different binding preferences: Some tend to have binding sites toward the edges, some toward the center, and some at other positions within the NDR. These patterns are highly consistent across cell types, suggesting that they may reflect TF-specific intrinsic structural or functional characteristics. In particular, TF classes with binding sites at NDR edges are enriched for those known to interact with histones and chromatin remodelers, whereas TFs with central enrichment interact with other TFs and cofactors such as p300. Our results suggest distinct regiospecific binding patterns and functions of TF classes within enhancers.
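The idea of a positional bias within a nucleosome-depleted region can be made concrete with a small sketch. The abstract does not say how the authors quantified position, so everything here is an assumption for illustration: the function names are hypothetical, coordinates are treated as half-open intervals, and the "edge score" is simply the normalized distance of a site from the NDR center.

```python
def relative_position(ndr_start: int, ndr_end: int, site_mid: int) -> float:
    """Position of a binding-site midpoint within an NDR, scaled to [0, 1].

    0.0 is the NDR start, 0.5 the center, 1.0 the NDR end.
    """
    if ndr_end <= ndr_start:
        raise ValueError("NDR must have positive length")
    if not ndr_start <= site_mid <= ndr_end:
        raise ValueError("site midpoint lies outside the NDR")
    return (site_mid - ndr_start) / (ndr_end - ndr_start)


def edge_score(ndr_start: int, ndr_end: int, site_mid: int) -> float:
    """Normalized distance from the NDR center: 0.0 at center, 1.0 at either edge."""
    return abs(relative_position(ndr_start, ndr_end, site_mid) - 0.5) * 2


def mean_edge_score(sites: list[tuple[int, int, int]]) -> float:
    """Average edge score over (ndr_start, ndr_end, site_mid) records for one TF."""
    if not sites:
        raise ValueError("no sites given")
    return sum(edge_score(*s) for s in sites) / len(sites)


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    tf_sites = [(100, 300, 110), (500, 700, 690), (900, 1100, 1000)]
    print(mean_edge_score(tf_sites))
```

Under this sketch, a TF whose sites cluster near NDR edges would have a mean score close to 1, while a centrally enriched TF would score close to 0.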

Genome-Wide Profiling of Epigenetic and Transcription Factor Regulation In Human Macrophage Differentiation

Blood ◽

10.1182/blood.v116.21.3875.3875 ◽

2010 ◽

Vol 116 (21) ◽

pp. 3875-3875

Author(s):

Thu-Hang Pham ◽

Monika Lichtinger ◽

Chris Benner ◽

Sabine Pape ◽

Lucia Schwarzfischer ◽

...

Keyword(s):

Transcription Factor ◽

Transcription Factors ◽

DNA Binding ◽

mRNA Expression ◽

Binding Sites ◽

Cell Types ◽

Human Macrophage ◽

Macrophage Differentiation ◽

Human Macrophages ◽

Expression Levels

Abstract 3875. The differentiation of human macrophages is accompanied by distinctive phenotypical changes and generally proceeds in the absence of proliferation. The molecular events governing this process are still poorly understood. Using ChIP-Seq technology, we studied epigenetic changes as well as alterations in transcription factor occupancy during human monocyte differentiation and correlated these events with gene expression levels in hematopoietic cell types. We show that putative enhancer regions marked by histone H3 lysine 4 monomethylation (H3K4me1) at different developmental stages (human progenitor cells, peripheral blood monocytes and in vitro differentiated macrophages) are enriched in distinct sets of transcription factor motifs corresponding to lineage-determining factors. Cell stage-specific histone methylation at promoter-distal sites corresponds with increased mRNA expression levels of neighboring genes. We generated global DNA-binding maps in monocytes and macrophages for two transcription factors (PU.1 and C/EBPβ) with a well-established role in monocyte/macrophage differentiation. Comparison of human binding sites with corresponding mouse data revealed a surprisingly low level of conservation (∼10-15%) of PU.1- or C/EBPβ-bound sites between man and mouse, despite a highly conserved binding preference for both transcription factors. During monocytic differentiation, human macrophages primarily gained additional binding sites for both transcription factors (as well as promoter-distal H3K4me1). Interestingly, only neighboring genes with multiple binding events showed significantly increased, macrophage-specific mRNA expression as compared to monocytic as well as lymphocytic cell types.
Human macrophage-specific H3K4me1-marked regions as well as macrophage-specific PU.1- and C/EBP-bound sites were characterized by overlapping sets of novel sequence motifs, suggesting that the combinatorial interaction of corresponding DNA-binding factors with PU.1 and C/EBPβ may be required for the establishment of human macrophage-specific enhancers. These data provide novel insights into PU.1- and C/EBPβ-mediated gene regulation during human macrophage differentiation. Disclosures: No relevant conflicts of interest to declare.

Analysis of the FBXO7 promoter reveals overlapping Pax5 and c-Myb binding sites functioning in B cells

10.1101/792895 ◽

2019 ◽

Author(s):

Suzanne Randle ◽

Heike Laman

Keyword(s):

Transcription Factor ◽

Transcription Factors ◽

B Cells ◽

Blood Cell ◽

Binding Sites ◽

Cell Types ◽

Sequence Alignments ◽

Multiple Transcription Factor ◽

And Function ◽

Insight Into

Abstract: Fbxo7 is a key player in the differentiation and function of numerous blood cell types, and in neurons, oligodendrocytes and spermatocytes. In an effort to gain insight into the physiological and pathological settings where Fbxo7 is likely to play a key role, we sought to define the transcription factors which direct FBXO7 expression. Using sequence alignments across 28 species, we defined the human FBXO7 promoter and found that it contains two conserved regions enriched for multiple transcription factor binding sites. Many of these have roles in either neuronal or haematopoietic development. Using various FBXO7 promoter reporters, we found that ELF4, Pax5 and c-Myb have functional binding sites that activate transcription. Overlap of Pax5 and c-Myb binding sites suggests that these factors bind cooperatively to transactivate the FBXO7 promoter. Although endogenous Pax5 is bound to the FBXO7 promoter in B cells, c-Myb is also required for FBXO7 expression. Our data suggest that the interplay of multiple transcription factors regulates the FBXO7 promoter.

Lozenge is expressed in pluripotent precursor cells and patterns multiple cell types in the Drosophila eye through the control of cell-specific transcription factors

Development ◽

10.1242/dev.125.18.3681 ◽

1998 ◽

Vol 125 (18) ◽

pp. 3681-3687 ◽

Cited By ~ 2

Author(s):

G.V. Flores ◽

A. Daga ◽

H.R. Kalhor ◽

U. Banerjee

Keyword(s):

Transcription Factor ◽

Transcription Factors ◽

Cell Types ◽

Cell Fates ◽

Specific Expression ◽

Enhancer Element ◽

Eye Disc ◽

Drosophila Eye ◽

A Cell ◽

Multiple Cell

In the developing Drosophila eye, individual cell fates are specified when general signaling mechanisms are interpreted in the context of cell-specific transcription factors. Lozenge, a Runt/AML1/CBFA1-like transcription factor, determines the fates of a number of neuronal and non-neuronal cells by regulating the expression of multiple fate-determining transcription factors. The Lozenge protein is expressed in the nuclei of the cells that it patterns and also in their undifferentiated precursors. An enhancer element located within the second intron of the lozenge gene is responsible for its eye-specific expression. Lozenge is not itself a cell-specific transcription factor; rather, it prepatterns the eye disc by positioning cell-specific factors in their appropriate locations.

Persistent Interactions of Core Histone Tails with Nucleosomal DNA following Acetylation and Transcription Factor Binding

Molecular and Cellular Biology ◽

10.1128/mcb.18.11.6293 ◽

1998 ◽

Vol 18 (11) ◽

pp. 6293-6304 ◽

Cited By ~ 96

Author(s):

Vesco Mutskov ◽

Delphine Gerber ◽

Dimitri Angelov ◽

Juan Ausio ◽

Jerry Workman ◽

...

Keyword(s):

Transcription Factor ◽

Transcription Factors ◽

Molecular Biology ◽

Binding Sites ◽

Cross Linking ◽

UV Laser ◽

Histone Tail ◽

Histone Tails ◽

Nucleosomal DNA ◽

Core Histones

Abstract: In this study, we examined the effect of acetylation of the NH2 tails of core histones on their binding to nucleosomal DNA in the absence or presence of bound transcription factors. To do this, we used a novel UV laser-induced protein-DNA cross-linking technique, combined with immunochemical and molecular biology approaches. Nucleosomes containing one or five GAL4 binding sites were reconstituted with hypoacetylated or hyperacetylated core histones. Within these reconstituted particles, UV laser-induced histone-DNA cross-linking was found to occur only via the nonstructured histone tails and thus presented a unique tool for studying histone tail interactions with nucleosomal DNA. Importantly, these studies demonstrated that the NH2 tails were not released from nucleosomal DNA upon histone acetylation, although some weakening of their interactions was observed at elevated ionic strengths. Moreover, the binding of up to five GAL4-AH dimers to nucleosomes occupying the central 90 bp occurred without displacement of the histone NH2 tails from DNA. GAL4-AH binding perturbed the interaction of each histone tail with nucleosomal DNA to different degrees. However, in all cases, greater than 50% of the interactions between the histone tails and DNA was retained upon GAL4-AH binding, even if the tails were highly acetylated. These data illustrate an interaction of acetylated or nonacetylated histone tails with DNA that persists in the presence of simultaneously bound transcription factors.

Specific transcription factors stimulate simian virus 40 and polyomavirus origins of DNA replication

Molecular and Cellular Biology ◽

10.1128/mcb.12.6.2514-2524.1992 ◽

1992 ◽

Vol 12 (6) ◽

pp. 2514-2524 ◽

Cited By ~ 1

Author(s):

Z S Guo ◽

M L DePamphilis

Keyword(s):

Transcription Factor ◽

Transcription Factors ◽

DNA Replication ◽

Binding Sites ◽

Simian Virus 40 ◽

Simian Virus ◽

T Antigen ◽

Activation Domains ◽

Auxiliary Component ◽

Cell Extracts

The origins of DNA replication (ori) in simian virus 40 (SV40) and polyomavirus (Py) contain an auxiliary component (aux-2) composed of multiple transcription factor binding sites. To determine whether this component stimulated replication by binding specific transcription factors, aux-2 was replaced by synthetic oligonucleotides that bound a single transcription factor. Sp1 and T-antigen (T-ag) sites, which exist in the natural SV40 aux-2 sequence, provided approximately 75 and approximately 20%, respectively, of aux-2 activity when transfected into monkey cells. In cell extracts, only T-ag sites were active. AP1 binding sites could replace completely either SV40 or Py aux-2. Mutations that eliminated AP1 binding also eliminated AP1 stimulation of replication. Yeast GAL4 binding sites that strongly stimulated transcription in the presence of GAL4 proteins failed to stimulate SV40 DNA replication, although they did partially replace Py aux-2. Stimulation required the presence of proteins consisting of the GAL4 DNA binding domain fused to specific activation domains such as VP16 or c-Jun. These data demonstrate a clear role for transcription factors with specific activation domains in activating both SV40 and Py ori. However, no correlation was observed between the ability of specific proteins to stimulate promoter activity and their ability to stimulate origin activity. We propose that only transcription factors whose specific activation domains can interact with the T-ag initiation complex can stimulate SV40 and Py ori-core activity.

TFutils: Data structures for transcription factor bioinformatics

F1000Research ◽

10.12688/f1000research.17976.1 ◽

2019 ◽

Vol 8 ◽

pp. 152

Author(s):

Benjamin J. Stubbs ◽

Shweta Gopaulakrishnan ◽

Kimberly Glass ◽

Nathalie Pochet ◽

Celine Everaert ◽

...

Keyword(s):

Transcription Factor ◽

Transcription Factors ◽

Data Structures ◽

Binding Sites ◽

Genome Wide Association Study ◽

Genome Wide Association ◽

Bioconductor Package ◽

Genome Wide ◽

Study Results ◽

Integrative Analyses

DNA transcription is intrinsically complex. Bioinformatic work with transcription factors (TFs) is complicated by a multiplicity of data resources and annotations. The Bioconductor package TFutils includes data structures and functions to enhance the precision and utility of integrative analyses that have components involving TFs. TFutils provides catalogs of human TFs from three reference sources (CISBP, HOCOMOCO, and GO), a catalog of TF targets derived from MSigDb, and multiple approaches to enumerating TF binding sites. Aspects of integration of TF binding patterns and genome-wide association study results are explored in examples.

An Algorithm for Cellular Reprogramming

10.1101/162974 ◽

2017 ◽

Cited By ~ 1

Author(s):

Scott Ronquist ◽

Geoff Patterson ◽

Markus Brown ◽

Stephen Lindsly ◽

Haiming Chen ◽

...

Keyword(s):

Cell Cycle ◽

Transcription Factor ◽

Transcription Factors ◽

Cell Biology ◽

Human Fibroblasts ◽

Cell Types ◽

Approximate Model ◽

Cellular Reprogramming ◽

Biological Processes ◽

Moderate Complexity

Abstract: The day we understand the time evolution of subcellular elements at a level of detail comparable to physical systems governed by Newton’s laws of motion seems far away. Even so, quantitative approaches to cellular dynamics add to our understanding of cell biology, providing data-guided frameworks that allow us to develop better predictions about, and methods for, control over specific biological processes and system-wide cell behavior. In this paper, we describe an approach to optimizing the use of transcription factors (TFs) in the context of cellular reprogramming. We construct an approximate model for the natural evolution of a cell cycle synchronized population of human fibroblasts, based on data obtained by sampling the expression of 22,083 genes at several time points along the cell cycle. In order to arrive at a model of moderate complexity, we cluster gene expression based on the division of the genome into topologically associating domains (TADs) and then model the dynamics of the TAD expression levels. Based on this dynamical model and known bioinformatics, such as transcription factor binding sites (TFBS) and functions, we develop a methodology for identifying the top transcription factor candidates for a specific cellular reprogramming task. The approach used is based on a device commonly used in optimal control. Our data-guided methodology identifies a number of transcription factors previously validated for reprogramming and/or natural differentiation. Our findings highlight the immense potential of dynamical models, mathematics, and data-guided methodologies for improving strategies for control over biological processes.
Significance Statement: Reprogramming the human genome toward any desirable state is within reach; application of select transcription factors drives cell types toward different lineages in many settings. We introduce the concept of data-guided control in building a universal algorithm for directly reprogramming any human cell type into any other type. Our algorithm is based on time series genome transcription and architecture data and known regulatory activities of transcription factors, with natural dimension reduction using genome architectural features. Our algorithm predicts known reprogramming factors, top candidates for new settings, and ideal timing for application of transcription factors. This framework can be used to develop strategies for tissue regeneration, cancer cell reprogramming, and control of dynamical systems beyond cell biology.
