Transcription factors recognize DNA shape without nucleotide recognition

Mapping Intimacies ◽

10.1101/143677 ◽

2017 ◽

Cited By ~ 4

Author(s):

Md. Abul Hassan Samee ◽

Benoit G. Bruneau ◽

Katherine S. Pollard

Keyword(s):

Transcription Factors ◽

Binding Sites ◽

De Novo ◽

Sequence Information ◽

Sequence Motif ◽

Sequence Motifs ◽

Shape Features ◽

Sequence Recognition ◽

Helical Twist ◽

Dna Shape

Abstract: We hypothesized that transcription factors (TFs) recognize DNA shape without nucleotide sequence recognition. Motivating an independent role for shape, many TF binding sites lack a sequence-motif, DNA shape adds specificity to sequence-motifs, and different sequences can encode similar shapes. We therefore asked if binding sites of a TF are enriched for specific patterns of DNA shape-features, e.g., helical twist. We developed ShapeMF, which discovers these shape-motifs de novo without taking sequence information into account. We find that most TFs assayed in ENCODE have shape-motifs and bind regulatory regions recognizing shape-motifs in the absence of sequence-motifs. When shape- and sequence-recognition co-occur, the two types of motifs can be overlapping, flanking, or separated by consistent spacing. Shape-motifs are prevalent in regions co-bound by multiple TFs. Finally, TFs with identical sequence motifs have different shape-motifs, explaining their binding at distinct locations. These results establish shape-motifs as drivers of TF-DNA recognition complementary to sequence-motifs.
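The enrichment question posed in this abstract can be illustrated with a toy calculation. The sketch below is not ShapeMF and does not reproduce its algorithm. It assumes a shape feature (here labeled helical twist) has already been computed per position by some external tool, defines a hypothetical "pattern" as a run of consecutive values inside a fixed range, and asks whether that pattern is over-represented in bound sequences using a hypergeometric tail probability. All profile values, the range, and the window width are invented for illustration.

```python
from math import comb

def has_pattern(profile, lo, hi, width):
    """True if `width` consecutive profile values all fall in [lo, hi]."""
    run = 0
    for value in profile:
        run = run + 1 if lo <= value <= hi else 0
        if run >= width:
            return True
    return False

def hypergeom_tail(k, K, n, N):
    """P(X >= k) when drawing n items from N, of which K carry the pattern."""
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return total / comb(N, n)

# Made-up per-position shape values (illustrative only, not real data).
bound = [
    [34.0, 36.5, 37.0, 37.5, 33.0],
    [36.1, 36.2, 37.9, 35.0, 34.0],
    [33.0, 34.0, 36.0, 37.0, 38.0],
    [34.0, 36.5, 33.0, 37.0, 37.5],
]
background = [
    [34.0, 34.5, 35.0, 33.0, 34.2],
    [36.5, 35.0, 37.0, 35.5, 36.0],
    [33.0, 33.5, 34.0, 36.0, 37.0],
    [35.0, 35.5, 34.0, 33.0, 35.9],
]

# Hypothetical pattern: three consecutive positions with values in [36, 38].
LO, HI, WIDTH = 36.0, 38.0, 3

k = sum(has_pattern(p, LO, HI, WIDTH) for p in bound)           # bound with pattern
K = k + sum(has_pattern(p, LO, HI, WIDTH) for p in background)  # all with pattern
n = len(bound)                                                  # bound sequences
N = len(bound) + len(background)                                # all sequences

p_value = hypergeom_tail(k, K, n, N)
print(f"{k}/{n} bound sequences carry the pattern; p = {p_value:.4f}")
```

A de novo method would have to search over candidate patterns rather than test one fixed pattern, so this only shows the final enrichment step under the stated assumptions.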

A De Novo Shape Motif Discovery Algorithm Reveals Preferences of Transcription Factors for DNA Shape Beyond Sequence Motifs

Cell Systems ◽

10.1016/j.cels.2018.12.001 ◽

2019 ◽

Vol 8 (1) ◽

pp. 27-42.e6 ◽

Cited By ~ 19

Author(s):

Md. Abul Hassan Samee ◽

Benoit G. Bruneau ◽

Katherine S. Pollard

Keyword(s):

Transcription Factors ◽

Motif Discovery ◽

De Novo ◽

Sequence Motifs ◽

Dna Shape ◽

Motif Discovery Algorithm

Co-SELECT reveals sequence non-specific contribution of DNA shape to transcription factor binding in vitro

Nucleic Acids Research ◽

10.1093/nar/gkz540 ◽

2019 ◽

Vol 47 (13) ◽

pp. 6632-6641 ◽

Cited By ~ 6

Author(s):

Soumitra Pal ◽

Jan Hoinka ◽

Teresa M Przytycka

Keyword(s):

Dna Binding ◽

Specific Binding ◽

Sequence Similarity ◽

Binding Motif ◽

Sequence Motif ◽

Sequence Motifs ◽

Shape Features ◽

Dna Shape

Abstract: Understanding the principles of DNA binding by transcription factors (TFs) is of primary importance for studying gene regulation. Recently, several lines of evidence suggested that both DNA sequence and shape contribute to TF binding. However, the following compelling question is yet to be considered: in the absence of any sequence similarity to the binding motif, can DNA shape still increase binding probability? To address this challenge, we developed Co-SELECT, a computational approach to analyze the results of in vitro HT-SELEX experiments for TF–DNA binding. Specifically, Co-SELECT leverages the presence of motif-free sequences in late HT-SELEX rounds and their enrichment in weak binders to detect evidence for the role of DNA shape features in TF binding. Our approach revealed that, even in the absence of the sequence motif, TFs have a propensity to bind to DNA molecules whose shape is consistent with motif-specific binding. This provides the first direct evidence that shape features that accompany the preferred sequence motifs also bestow an advantage for weak, sequence non-specific binding.

Contingency in the convergent evolution of a regulatory network: Dosage compensation in Drosophila

10.1101/488569 ◽

2018 ◽

Author(s):

Doris Bachtrog ◽

Chris Ellison

Keyword(s):

Transposable Element ◽

Dosage Compensation ◽

Sex Chromosomes ◽

Binding Sites ◽

Evolutionary Biology ◽

De Novo ◽

Binding Motif ◽

Sequence Motif ◽

X Chromosomes ◽

Msl Complex

The repeatability or predictability of evolution is a central question in evolutionary biology, and most often addressed in experimental evolution studies. Here, we infer how genetically heterogeneous natural systems acquire the same molecular changes, to address how genomic background affects adaptation in natural populations. In particular, we take advantage of independently formed neo-sex chromosomes in Drosophila species that have evolved dosage compensation by co-opting the dosage compensation (MSL) complex, to study the mutational paths that have led to the acquisition of 100s of novel binding sites for the MSL complex in different species. This complex recognizes a conserved 21-bp GA-rich sequence motif that is enriched on the X chromosome, and newly formed X chromosomes recruit the MSL complex by de novo acquisition of this binding motif. We identify recently formed sex chromosomes in the Drosophila repleta and robusta species groups by genome sequencing, and generate genomic occupancy maps of the MSL complex to infer the location of novel binding sites. We find that diverse mutational paths were utilized in each species to evolve 100s of de novo binding motifs along the neo-X, including expansions of microsatellites and transposable element insertions. However, the propensity to utilize a particular mutational path differs between independently formed X chromosomes, and appears to be contingent on genomic properties of that species, such as simple repeat or transposable element density. This establishes the “genomic environment” as an important determinant in predicting the outcome of evolutionary adaptations.

TFBSshape: a motif database for DNA shape features of transcription factor binding sites

Nucleic Acids Research ◽

10.1093/nar/gkt1087 ◽

2013 ◽

Vol 42 (D1) ◽

pp. D148-D155 ◽

Cited By ~ 82

Author(s):

Lin Yang ◽

Tianyin Zhou ◽

Iris Dror ◽

Anthony Mathelier ◽

Wyeth W. Wasserman ◽

...

Keyword(s):

Transcription Factor ◽

Binding Sites ◽

Transcription Factor Binding Sites ◽

Transcription Factor Binding ◽

Shape Features ◽

Factor Binding ◽

Dna Shape ◽

Motif Database

StoatyDive: Evaluation and Classification of Peak Profiles for Sequencing Data

10.1101/799114 ◽

2019 ◽

Cited By ~ 1

Author(s):

Florian Heyl ◽

Rolf Backofen

Keyword(s):

Quality Control ◽

Binding Sites ◽

High Throughput Sequencing ◽

Sequence Motif ◽

Sequence Motifs ◽

Sequencing Data ◽

Stem Loop ◽

Link Type ◽

Quality Control Tool ◽

Downstream Analysis

The prediction of binding sites (peak calling) is a common task in the data analysis of methods such as crosslinking or chromatin immunoprecipitation in combination with high-throughput sequencing (CLIP-Seq, ChIP-Seq). The predicted binding sites are often further analyzed, for example to predict sequence motifs or structure patterns. However, the peaks obtained can vary in their profile shapes because of the peak caller used, different binding domains of the protein, protocol biases, or other factors. Thus, a tool is missing that evaluates and classifies the predicted peaks based on their shapes. We hereby present StoatyDive, a tool that can be used to filter for specific peak profile shapes of sequencing data such as CLIP and ChIP. StoatyDive therefore fine-tunes downstream analysis steps such as structure or sequence motif predictions and acts as a quality control. With StoatyDive we were able to classify distinct peak profile shapes from CLIP-seq data of the histone stem-loop-binding protein (SLBP). We show the potential of StoatyDive as a quality control tool and as a filter to pick different shapes based on biological or methodical questions. StoatyDive is open source and freely available under GPL-3 at https://github.com/BackofenLab/StoatyDive and at bioconda https://anaconda.org/bioconda/stoatydive.

Co-SELECT reveals sequence non-specific contribution of DNA shape to transcription factor binding in vitro

10.1101/413922 ◽

2018 ◽

Cited By ~ 1

Author(s):

Soumitra Pal ◽

Jan Hoinka ◽

Teresa M. Przytycka

Keyword(s):

Dna Binding ◽

Sequence Similarity ◽

Binding Motif ◽

Sequence Motif ◽

Shape Features ◽

Specific Shape ◽

Dna Shape ◽

Specific Contribution

Abstract: Understanding the principles of DNA binding by transcription factors (TFs) is of primary importance for studying gene regulation. Recently, several lines of evidence suggested that both DNA sequence and shape contribute to TF binding. However, the question of whether DNA shape can still increase the probability of binding in the absence of any sequence similarity to the binding motif was yet to be addressed. To address this challenge, we developed Co-SELECT, a computational approach to analyze the results of in vitro HT-SELEX experiments for TF-DNA binding. Specifically, the presence of motif-free sequences in late HT-SELEX rounds and their enrichment in weak binders allowed us to detect evidence for the role of DNA shape features in TF binding. Our approach revealed that, even in the absence of the sequence motif, TFs have a propensity to weakly bind to DNA molecules enriched in specific shape features. Surprisingly, we also found that some properties of DNA shape contribute to promiscuous binding of all tested TF families. Strikingly, such promiscuously bound shapes correspond to the most frequent shape formed by the DNA. We propose that this promiscuous binding facilitates diffusion of TFs along the DNA molecule before each TF is locked in its binding site.

TFBSshape: an expanded motif database for DNA shape features of transcription factor binding sites

Nucleic Acids Research ◽

10.1093/nar/gkz970 ◽

2019 ◽

Cited By ~ 4

Author(s):

Tsu-Pei Chiu ◽

Beibei Xin ◽

Nicholas Markarian ◽

Yingfei Wang ◽

Remo Rohs

Keyword(s):

Transcription Factor ◽

Binding Sites ◽

Transcription Factor Binding Sites ◽

Shape Features ◽

Factor Binding ◽

Methylated Dna ◽

Dna Shape ◽

Motif Database

Abstract: TFBSshape (https://tfbsshape.usc.edu) is a motif database for analyzing structural profiles of transcription factor binding sites (TFBSs). The main rationale for this database is to be able to derive mechanistic insights in protein–DNA readout modes from sequencing data without available structures. We extended the quantity and dimensionality of TFBSshape, from mostly in vitro to in vivo binding and from unmethylated to methylated DNA. This new release of TFBSshape improves its functionality and launches a responsive and user-friendly web interface for easy access to the data. The current expansion includes new entries from the most recent collections of transcription factors (TFs) from the JASPAR and UniPROBE databases, methylated TFBSs derived from in vitro high-throughput EpiSELEX-seq binding assays and in vivo methylated TFBSs from the MeDReaders database. TFBSshape content has increased to 2428 structural profiles for 1900 TFs from 39 different species. The structural profiles for each TFBS entry now include 13 shape features and minor groove electrostatic potential for standard DNA and four shape features for methylated DNA. We improved the flexibility and accuracy for the shape-based alignment of TFBSs and designed new tools to compare methylated and unmethylated structural profiles of TFs and methods to derive DNA shape-preserving nucleotide mutations in TFBSs.

Genomic Regions Flanking E-Box Binding Sites Influence DNA Binding Specificity of bHLH Transcription Factors through DNA Shape

Cell Reports ◽

10.1016/j.celrep.2013.03.014 ◽

2013 ◽

Vol 3 (4) ◽

pp. 1093-1104 ◽

Cited By ~ 180

Author(s):

Raluca Gordân ◽

Ning Shen ◽

Iris Dror ◽

Tianyin Zhou ◽

John Horton ◽

...

Keyword(s):

Transcription Factors ◽

Dna Binding ◽

Binding Sites ◽

Binding Specificity ◽

Dna Binding Specificity ◽

E Box ◽

Genomic Regions ◽

Dna Shape ◽

Bhlh Transcription Factors

Atlas of Transcription Factor Binding Sites from ENCODE DNase Hypersensitivity Data Across 27 Tissue Types

10.1101/252023 ◽

2018 ◽

Cited By ~ 4

Author(s):

Cory C. Funk ◽

Alex M. Casella ◽

Segun Jung ◽

Matthew A. Richards ◽

Alex Rodriguez ◽

...

Keyword(s):

Transcription Factors ◽

Human Genome ◽

Binding Sites ◽

Regulatory Networks ◽

Specific Binding ◽

Association Studies ◽

Genome Wide Association Studies ◽

Sequence Motifs ◽

Link Type ◽

Genome Wide

Abstract: There is intense interest in mapping the tissue-specific binding sites of transcription factors in the human genome to reconstruct gene regulatory networks and predict functions for non-coding genetic variation. DNase-seq footprinting provides a means to predict genome-wide binding sites for hundreds of transcription factors (TFs) simultaneously. However, despite the public availability of DNase-seq data for hundreds of samples, there is neither a unified analytical workflow nor a publicly accessible database providing the locations of footprints across all available samples. Here, we implemented a workflow for uniform processing of footprints using two state-of-the-art footprinting algorithms: Wellington and HINT. Our workflow scans the footprints generated by these algorithms for 1,530 sequence motifs to predict binding sites for 1,515 human transcription factors. We applied our workflow to detect footprints in 192 DNase-seq experiments from ENCODE spanning 27 human tissues. This collection of footprints describes an expansive landscape of potential TF occupancy. At thresholds optimized through machine learning, we report high-quality footprints covering 9.8% of the human genome. These footprints were enriched for true positive TF binding sites as defined by ChIP-seq peaks, as well as for genetic variants associated with changes in gene expression. Integrating our footprint atlas with summary statistics from genome-wide association studies revealed that risk for neuropsychiatric traits was enriched specifically at highly-scoring footprints in human brain, while risk for immune traits was enriched specifically at highly-scoring footprints in human lymphoblasts. Our cloud-based workflow is available at github.com/globusgenomics/genomics-footprint and a database with all footprints and TF binding site predictions is publicly available at http://data.nemoarchive.org/other/grant/sament/sament/footprint_atlas.
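The step of scanning footprints for sequence motifs amounts to an interval intersection. The sketch below is a minimal illustration of that idea, not the authors' workflow. It assumes footprints and motif matches are already available as genomic intervals with 0-based, half-open coordinates, and it keeps a motif match only if it lies entirely inside a footprint. The `Interval` type, the function name, the containment rule, and all coordinates are hypothetical.

```python
from typing import NamedTuple

class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: str

def motif_hits_in_footprints(footprints, motif_hits):
    """Return (motif name, footprint name) for hits fully inside a footprint."""
    by_chrom = {}
    for fp in footprints:
        by_chrom.setdefault(fp.chrom, []).append(fp)

    kept = []
    for hit in motif_hits:
        for fp in by_chrom.get(hit.chrom, []):
            if fp.start <= hit.start and hit.end <= fp.end:
                kept.append((hit.name, fp.name))
                break  # report each motif hit at most once
    return kept

# Invented example intervals (illustrative only).
footprints = [
    Interval("chr1", 100, 130, "fp1"),
    Interval("chr2", 50, 80, "fp2"),
]
motif_hits = [
    Interval("chr1", 105, 117, "TF_A"),  # inside fp1
    Interval("chr1", 125, 137, "TF_B"),  # extends past the end of fp1
    Interval("chr2", 50, 62, "TF_A"),    # inside fp2
    Interval("chr3", 10, 22, "TF_C"),    # no footprint on chr3
]

predicted = motif_hits_in_footprints(footprints, motif_hits)
print(predicted)
```

The nested loop is quadratic per chromosome, which is fine for a demonstration. Genome-scale data would call for sorted intervals or an interval tree.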

A New Method Combining DNA Shape Features to Improve the Prediction Accuracy of Transcription Factor Binding Sites

Intelligent Computing Theories and Application - Lecture Notes in Computer Science ◽

10.1007/978-3-030-60802-6_8 ◽

2020 ◽

pp. 79-89

Author(s):

Siguo Wang ◽

Zhen Shen ◽

Ying He ◽

Qinhu Zhang ◽

Changan Yuan ◽

...

Keyword(s):

Transcription Factor ◽

Binding Sites ◽

Prediction Accuracy ◽

Transcription Factor Binding Sites ◽

Transcription Factor Binding ◽

New Method ◽

Shape Features ◽

Factor Binding ◽

Dna Shape
