Phylogenetic footprinting of transcription factor binding sites in proteobacterial genomes

2001, Vol 29 (3), pp. 774-782
Author(s): L. A. McCue

2007, Vol 05 (01), pp. 105-116
Author(s): MARKUS T. FRIBERG

We present an algorithm for predicting transcription factor binding sites from ChIP-chip and phylogenetic footprinting data. Because it does not depend on multiple sequence alignments, the algorithm is robust to low promoter sequence similarity and to motif rearrangements, which in turn allows it to incorporate information from more distant species. Score significance is estimated from representative random data sets. The algorithm is fully automatic and requires no human intervention. On a recent S. cerevisiae data set, it achieves higher accuracy than the previous best algorithms. Two of its general features, the adaptive ChIP-chip threshold and the modular positional bias score, increase motif prediction accuracy and could be implemented in other algorithms as well. In addition, because our algorithm works partly orthogonally to other algorithms, combining several algorithms can increase prediction accuracy even further. Specifically, our method finds 6 motifs not found by the second-best algorithm.
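The idea of estimating score significance from random data sets can be illustrated with a minimal sketch. This is not the authors' scoring scheme: the score used here (the number of sequences containing an exact motif match), the shuffling-based null model, and the names `count_motif_hits`, `shuffled`, and `empirical_p_value` are all assumptions made for illustration, as are the toy promoter sequences.

```python
import random


def count_motif_hits(sequences, motif):
    """Toy score: number of sequences containing the motif at least once."""
    return sum(1 for seq in sequences if motif in seq)


def shuffled(seq, rng):
    """Return seq with its letters permuted, preserving base composition."""
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)


def empirical_p_value(sequences, motif, n_random=1000, seed=0):
    """Fraction of random data sets that score at least as high as the real one."""
    rng = random.Random(seed)
    observed = count_motif_hits(sequences, motif)
    at_least = 0
    for _ in range(n_random):
        random_set = [shuffled(s, rng) for s in sequences]
        if count_motif_hits(random_set, motif) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (n_random + 1)


# Hypothetical promoter fragments that all contain the motif TGACTCA.
promoters = [
    "ACGTTGACTCAGGCATCGAT",
    "GGCATGACTCATTACGCAGT",
    "TTGACTCACGGATCCGATAG",
    "CAGTCCGATTGACTCAGGTA",
]
p_value = empirical_p_value(promoters, "TGACTCA", n_random=200, seed=1)
print(f"empirical p-value: {p_value:.4f}")
```

Shuffling each sequence keeps its base composition, so the random sets are comparable to the real one in that respect; a more realistic null model would likely also preserve higher-order sequence statistics.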


2021, Vol 11 (11), pp. 5123
Author(s): Maiada M. Mahmoud, Nahla A. Belal, Aliaa Youssif

Transcription factors (TFs) are proteins that control the transcription of a gene from DNA to messenger RNA (mRNA). A TF binds to a specific DNA sequence called a binding site. Transcription factor binding sites have not yet been completely identified, a challenge that can be approached computationally and framed as a classification problem in machine learning. In this paper, the binding sites of the transcription factor SP1 on human chromosome 1 are predicted using different classification techniques, and a model based on voting is proposed. The highest Area Under the Curve (AUC) achieved is 0.97 using K-Nearest Neighbors (KNN), compared with 0.95 using the proposed voting technique. However, the proposed voting technique is more efficient with noisy data. This study highlights the applicability of voting to binding site prediction and the strong performance of KNN on this type of data.
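As a rough illustration of how a voting model and the AUC metric fit together, the sketch below averages the scores of two classifiers (soft voting) and computes AUC by pairwise comparison. It does not reproduce the paper's features, classifiers, or data: the functions `auc` and `soft_vote` and all scores are hypothetical, and the toy scores were chosen so that the two classifiers err on different examples.

```python
def auc(labels, scores):
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive example receives the higher score (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def soft_vote(score_lists):
    """Average the per-example scores produced by several classifiers."""
    return [sum(column) / len(column) for column in zip(*score_lists)]


# Hypothetical data: 1 = binding site, 0 = non-binding site.
labels = [1, 1, 1, 0, 0, 0]
clf_a_scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]  # misranks the third positive
clf_b_scores = [0.7, 0.4, 0.8, 0.3, 0.6, 0.2]  # misranks the second positive
voted_scores = soft_vote([clf_a_scores, clf_b_scores])

print("AUC A:", auc(labels, clf_a_scores))
print("AUC B:", auc(labels, clf_b_scores))
print("AUC voted:", auc(labels, voted_scores))
```

In this toy case the averaged scores rank every positive above every negative because the individual errors do not overlap. That is not guaranteed in general; the abstract itself reports a slightly lower AUC for voting (0.95) than for KNN alone (0.97).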

