Using evolutionary and structural information to predict DNA‐binding sites on DNA‐binding proteins

2006 ◽  
Vol 64 (1) ◽  
pp. 19-27 ◽  
Author(s):  
Igor B. Kuznetsov ◽  
Zhenkun Gou ◽  
Run Li ◽  
Seungwoo Hwang
Genes ◽  
2018 ◽  
Vol 9 (8) ◽  
pp. 394 ◽  
Author(s):  
Xiu-Juan Liu ◽  
Xiu-Jun Gong ◽  
Hua Yu ◽  
Jia-Hui Xu

Various machine learning-based approaches using sequence information alone have been proposed for identifying DNA-binding proteins, which are crucial to many cellular processes, such as DNA replication, DNA repair and DNA modification. In these methods, building a meaningful feature representation of the sequences and choosing an appropriate classifier are the most critical tasks. Disclosing the significance and contribution of different feature spaces and classifiers to the final prediction is of the utmost importance, not only for prediction performance but also for providing practical clues for the design of biological experiments. In this study, we propose a model stacking framework that orchestrates multi-view features and classifiers (MSFBinder) to investigate how to integrate and evaluate loosely coupled models for predicting DNA-binding proteins. The framework integrates multi-view features including Local_DPP, 188D, Position-Specific Scoring Matrix (PSSM)_DWT and auto-cross-covariance of secondary structures (AC_Struc), which were extracted based on evolutionary information, sequence composition, physicochemical properties and predicted structural information, respectively. These features are fed into various loosely coupled classifiers such as SVM and random forest. A logistic regression model is then applied to evaluate the contributions of these individual classifiers and to make the final prediction. When evaluated on the training dataset PDB1075, the proposed method achieves an accuracy of 83.53%. On the independent dataset PDB186, the method achieves an accuracy of 81.72%, which outperforms many existing methods. These results suggest that the framework is able to orchestrate various prediction models flexibly with good performance.
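The stacking idea described in this abstract can be illustrated with a minimal sketch. It is not the MSFBinder implementation: the two "views" are synthetic one-dimensional features standing in for descriptors such as Local_DPP or PSSM_DWT, the base learners are small logistic regressions rather than SVM or random forest, and all function names are hypothetical. What it does show is the two-level structure: each base model produces out-of-fold probabilities, and a logistic regression meta-model learns how much weight each base model deserves.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(rows, labels, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def predict_proba(model, row):
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, row)) + bias)

def out_of_fold(view, labels, k=5):
    """For every sample, a probability from a model that did not train on it."""
    n, oof = len(view), [0.0] * len(view)
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit_logistic([view[i] for i in train], [labels[i] for i in train])
        for i in range(fold, n, k):
            oof[i] = predict_proba(model, view[i])
    return oof

def fit_stack(views, labels):
    # Level 1: out-of-fold probabilities, one column per base model.
    meta_rows = list(zip(*(out_of_fold(v, labels) for v in views)))
    meta = fit_logistic(meta_rows, labels)          # level 2: meta-model
    base = [fit_logistic(v, labels) for v in views] # refit base models on all data
    return base, meta

def stack_predict(stack, sample_views):
    base, meta = stack
    probs = [predict_proba(m, x) for m, x in zip(base, sample_views)]
    return predict_proba(meta, probs)

# Synthetic data (assumption): view_a carries signal, view_b is pure noise.
rng = random.Random(0)
labels = [i % 2 for i in range(60)]
view_a = [[rng.gauss(2.0 * y - 1.0, 0.7)] for y in labels]
view_b = [[rng.gauss(0.0, 1.0)] for _ in labels]

stack = fit_stack([view_a, view_b], labels)
meta_weights, meta_bias = stack[1]
print("meta weights per base model:", meta_weights)
print("P(binding) for a clearly positive sample:", stack_predict(stack, [[2.0], [0.0]]))
```

In this toy setting the meta-model weights give a rough indication of how much each base model contributes, which is the role the abstract assigns to the logistic regression layer.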


2007 ◽  
Vol 36 (1) ◽  
pp. e8-e8 ◽  
Author(s):  
Jue Zeng ◽  
Jizhou Yan ◽  
Ting Wang ◽  
Deborah Mosbrook-Davis ◽  
Kyle T. Dolan ◽  
...  

2021 ◽  
Author(s):  
Qianmu Yuan ◽  
Sheng Chen ◽  
Jiahua Rao ◽  
Shuangjia Zheng ◽  
Huiying Zhao ◽  
...  

Abstract. Motivation: Protein-DNA interactions play crucial roles in biological systems, and identifying protein-DNA binding sites is the first step toward a mechanistic understanding of various biological activities (such as transcription and repair) and toward designing novel drugs. Accurately identifying DNA-binding residues from protein sequence alone remains a challenging task. Currently, most existing sequence-based methods only consider contextual features of the sequential neighbors, which limits their ability to capture spatial information. Results: Based on the recent breakthrough in protein structure prediction by AlphaFold2, we propose an accurate predictor, GraphSite, for identifying DNA-binding residues based on the structural models predicted by AlphaFold2. Here, we convert the binding site prediction problem into a graph node classification task and employ a transformer-based variant model to take the protein structural information into account. By leveraging predicted protein structures and a graph transformer, GraphSite substantially improves over the latest sequence-based and structure-based methods. The algorithm was further confirmed on the independent test set of 196 proteins, where GraphSite surpasses the state-of-the-art structure-based method by 12.3% in AUPR and 9.3% in MCC.
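To make the "graph node classification" framing concrete, here is a minimal sketch under stated assumptions. It is not GraphSite: the coordinates are toy values rather than AlphaFold2 output, the 4.0 Å cutoff and the per-residue scores are invented for illustration, and a single neighbor-averaging step stands in for the graph transformer. The function names `contact_graph` and `smooth_scores` are hypothetical. The point is only that residues become nodes, spatial proximity defines edges, and each node receives its own label.

```python
import math

def contact_graph(coords, cutoff):
    """Adjacency lists linking residues whose coordinates lie within `cutoff`."""
    neighbors = [[] for _ in coords]
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors

def smooth_scores(scores, neighbors):
    """Average each residue's score with the scores of its spatial neighbors."""
    return [
        (scores[i] + sum(scores[j] for j in nbrs)) / (1 + len(nbrs))
        for i, nbrs in enumerate(neighbors)
    ]

# Toy coordinates (assumption): residues 0 and 3 are distant in sequence
# but adjacent in space; residue 4 is isolated.
coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (3.8, 3.8, 0.0),
    (0.0, 3.8, 0.0),
    (30.0, 30.0, 30.0),
]
neighbors = contact_graph(coords, cutoff=4.0)

# Hypothetical per-residue scores, e.g. from a sequence-only model.
scores = [0.9, 0.2, 0.1, 0.8, 0.1]
smoothed = smooth_scores(scores, neighbors)

# Node classification: one binding / non-binding label per residue.
predicted = [s >= 0.5 for s in smoothed]
print(neighbors)
print(predicted)
```

A sequence-window method would never connect residues 0 and 3 here, whereas the spatial graph does, which is the limitation of sequential context that the abstract points to.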


2017 ◽  
Vol 28 (3) ◽  
pp. 364-369 ◽  
Author(s):  
Jason Brickner

Eukaryotic genomes are spatially organized within the nucleus by chromosome folding, interchromosomal contacts, and interaction with nuclear structures. This spatial organization is observed in diverse organisms and both reflects and contributes to gene expression and differentiation. This leads to the notion that the arrangement of the genome within the nucleus has been shaped and conserved through evolutionary processes and likely plays an adaptive function. Both DNA-binding proteins and changes in chromatin structure influence the positioning of genes and larger domains within the nucleus. This suggests that the spatial organization of the genome can be genetically encoded by binding sites for DNA-binding proteins and can also involve changes in chromatin structure, potentially through nongenetic mechanisms. Here I briefly discuss the results that support these ideas and their implications for how genomes encode spatial organization.


2016 ◽  
Vol 113 (14) ◽  
pp. 3826-3831 ◽  
Author(s):  
Payal Ray ◽  
Sandip De ◽  
Apratim Mitra ◽  
Karel Bezstarosti ◽  
Jeroen A. A. Demmers ◽  
...  

Polycomb group (PcG) proteins are responsible for maintaining the silenced transcriptional state of many developmentally regulated genes. PcG proteins are organized into multiprotein complexes that are recruited to DNA via cis-acting elements known as “Polycomb response elements” (PREs). In Drosophila, PREs consist of binding sites for many different DNA-binding proteins, some known and others unknown. Identification of these DNA-binding proteins is crucial to understanding the mechanism of PcG recruitment to PREs. We report here the identification of Combgap (Cg), a sequence-specific DNA-binding protein that is involved in recruitment of PcG proteins. Cg can bind directly to PREs via GTGT motifs and colocalizes with the PcG proteins Pleiohomeotic (Pho) and Polyhomeotic (Ph) at the majority of PREs in the genome. In addition, Cg colocalizes with Ph at a number of targets independent of Pho. Loss of Cg leads to decreased recruitment of Ph at only a subset of sites; some of these sites are binding sites for other Polycomb repressive complex 1 (PRC1) components, others are not. Our data suggest that Cg can recruit Ph in the absence of PRC1 and illustrate the diversity and redundancy of PcG protein recruitment mechanisms.


2014 ◽  
Vol 289 (3) ◽  
pp. 489-499 ◽  
Author(s):  
Bi-Qing Li ◽  
Kai-Yan Feng ◽  
Juan Ding ◽  
Yu-Dong Cai
