VOMBAT: prediction of transcription factor binding sites using variable order Bayesian trees

2006, Vol 34 (Web Server), pp. W529-W533
Author(s): J. Grau, I. Ben-Gal, S. Posch, I. Grosse

2018
Author(s): Zhijian Li, Marcel H. Schulz, Martin Zenke, Ivan G. Costa

Abstract: Transposase-Accessible Chromatin (ATAC) followed by sequencing (ATAC-seq) is a simple and fast protocol for detecting open chromatin. However, computational footprinting in ATAC-seq, i.e. the search for regions depleted of cleavage events due to transcription factor binding, has been poorly explored so far. We propose HINT-ATAC, a footprinting method that addresses ATAC-seq-specific protocol artifacts. HINT-ATAC uses a probabilistic framework based on variable-order Markov models to learn the complex sequence cleavage preferences of the transposase enzyme. Moreover, we observed strand-specific cleavage patterns around the binding sites of transcription factors, which are determined by local nucleosome architecture. HINT-ATAC uses local nucleosome architecture to significantly outperform competing footprinting methods in predicting transcription factor binding sites, as evaluated by ChIP-seq. HINT-ATAC is open-source software available online at www.regulatory-genomics.org/hint
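To make the idea of a variable-order Markov model concrete, the following is a minimal sketch and not HINT-ATAC's actual implementation. It counts which base follows every context of length 0 up to `max_order`, then predicts from the longest context that was observed often enough. The function names, the `min_count` back-off threshold, and the pseudocount are assumptions made for illustration; published variable-order models typically prune the context tree with a statistical criterion instead.

```python
ALPHABET = "ACGT"

def train_counts(sequences, max_order):
    """Count next-base occurrences for every context of length 0..max_order."""
    counts = {}
    for seq in sequences:
        for i, base in enumerate(seq):
            # Use every suffix of the preceding bases, up to max_order long.
            for k in range(min(i, max_order) + 1):
                context = seq[i - k:i]
                table = counts.setdefault(context, {})
                table[base] = table.get(base, 0) + 1
    return counts

def next_base_prob(counts, history, base, max_order, min_count=5, pseudo=1.0):
    """Estimate P(base | history) from the longest sufficiently observed context.

    min_count is a hypothetical back-off threshold: contexts seen fewer
    times are skipped in favour of a shorter suffix of the history.
    """
    for k in range(min(len(history), max_order), -1, -1):
        context = history[len(history) - k:]
        table = counts.get(context, {})
        total = sum(table.values())
        # The empty context (k == 0) is always accepted as the last resort.
        if total >= min_count or k == 0:
            return (table.get(base, 0) + pseudo) / (total + pseudo * len(ALPHABET))

# Toy training data (hypothetical): a strongly periodic sequence.
training = ["ACGTACGTACGTACGT"] * 3
counts = train_counts(training, max_order=2)
p_seen = next_base_prob(counts, "AC", "G", max_order=2)     # uses the order-2 context "AC"
p_backoff = next_base_prob(counts, "TT", "A", max_order=2)  # "TT" is unseen, falls back to "T"
```

The effective order therefore varies by position: well-supported contexts use two preceding bases, while rare ones fall back to one or none.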


2005, Vol 21 (11), pp. 2657-2666
Author(s): I. Ben-Gal, A. Shani, A. Gohr, J. Grau, S. Arviv, ...

2021, Vol 11 (11), pp. 5123
Author(s): Maiada M. Mahmoud, Nahla A. Belal, Aliaa Youssif

Transcription factors (TFs) are proteins that control the transcription of a gene from DNA to messenger RNA (mRNA). TFs bind to specific DNA sequences called binding sites. Transcription factor binding sites have not yet been completely identified, a challenge that can be approached computationally by framing it as a classification problem in machine learning. In this paper, the prediction of binding sites of the transcription factor SP1 on human chromosome 1 is presented using different classification techniques, and a voting-based model is proposed. The highest Area Under the Curve (AUC) achieved is 0.97 using K-Nearest Neighbors (KNN), compared with 0.95 for the proposed voting technique. However, the proposed voting technique is more efficient with noisy data. This study highlights the applicability and significance of voting for the prediction of binding sites, as well as the strong performance of KNN on this type of data.
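The combination of KNN and voting described in this abstract can be illustrated with a small sketch. This is not the authors' pipeline: the toy sequences, the Hamming-distance KNN, and the GC-content rule are all assumptions chosen for illustration, and the example uses simple majority (hard) voting because the abstract does not specify the voting scheme.

```python
from collections import Counter

def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, query, k=3):
    """Label a query by majority among its k nearest training sequences."""
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def gc_rule_predict(query, threshold=0.6):
    """Hypothetical weak classifier: call GC-rich sequences binding sites."""
    gc_fraction = sum(base in "GC" for base in query) / len(query)
    return 1 if gc_fraction >= threshold else 0

def majority_vote(predictions):
    """Hard voting: return the most common predicted label."""
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(train, query):
    """Combine three voters; an odd count avoids ties for binary labels."""
    votes = [
        knn_predict(train, query, k=1),
        knn_predict(train, query, k=3),
        gc_rule_predict(query),
    ]
    return majority_vote(votes)

# Toy labelled data (hypothetical): 1 = binding site, 0 = background.
train = [
    ("GGGCGG", 1), ("GGGCGT", 1), ("CGGCGG", 1),
    ("ATATAT", 0), ("TTAATA", 0), ("AATTAA", 0),
]
site_call = ensemble_predict(train, "GGGCGA")
background_call = ensemble_predict(train, "ATTTAA")
```

A single voter can be wrong on a noisy example without changing the ensemble's decision, which is the intuition behind the robustness the abstract attributes to voting.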

