Identification of DNA-Binding Proteins Using Support Vector Machine with Sequence Information

Computational and Mathematical Methods in Medicine ◽

10.1155/2013/524502 ◽

2013 ◽

Vol 2013 ◽

pp. 1-8 ◽

Cited By ~ 6

Author(s):

Xin Ma ◽

Jiansheng Wu ◽

Xiaoyun Xue

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Binding Proteins ◽

Query Protein ◽

DNA Binding Proteins ◽

Evolutionary Information ◽

Support Vector ◽

Sequence Information ◽

Novel Approach ◽

Matthews Correlation Coefficient

DNA-binding proteins are fundamentally important for understanding cellular processes, so their identification has important practical applications in fields such as drug design. We propose a novel approach for predicting DNA-binding proteins using only sequence information. The prediction model is built with the support vector machine-sequential minimal optimization (SVM-SMO) algorithm in conjunction with a hybrid feature. The hybrid feature incorporates an evolutionary information feature, a physicochemical property feature, and two novel attributes. These two attributes use the DNA-binding residues and nonbinding residues in a query protein to obtain its DNA-binding propensity and nonbinding propensity. The results demonstrate that our SVM-SMO model achieves a Matthews correlation coefficient (MCC) of 0.67 and an overall accuracy of 89.6%, with 88.4% sensitivity and 90.8% specificity. Performance comparisons across features indicate that the two novel attributes contribute to the performance improvement. In addition, our SVM-SMO model outperforms state-of-the-art methods on an independent test dataset.
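For readers unfamiliar with the metrics quoted in this abstract, the following is a minimal sketch of how accuracy, sensitivity, specificity, and MCC are computed from a binary confusion matrix. The function name `binary_metrics` and the counts in the example are hypothetical and are not taken from the paper; the sketch also assumes both classes are present, so the sensitivity and specificity denominators are nonzero.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, specificity, and MCC from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # fraction of true DNA-binding proteins recovered
    specificity = tn / (tn + fp)  # fraction of non-binding proteins correctly rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Hypothetical counts for illustration only (not the paper's data)
metrics = binary_metrics(tp=90, fn=10, tn=80, fp=20)
print(metrics)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside sensitivity and specificity.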

Identification of DNA-Binding Proteins by Multiple Kernel Support Vector Machine and Sequence Information

Current Proteomics ◽

10.2174/1570164616666190417100509 ◽

2020 ◽

Vol 17 (4) ◽

pp. 302-310

Author(s):

Yijie Ding ◽

Feng Chen ◽

Xiaoyi Guo ◽

Jijun Tang ◽

Hongjie Wu

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Computational Method ◽

Support Vector ◽

Sequence Information ◽

Data Sets ◽

Multiple Kernel ◽

Kernel Support Vector Machine

Background: DNA-binding proteins play an important role in multiple biomolecular functions. However, traditional experimental methods for identifying DNA-binding proteins remain time-consuming and extremely expensive. Objective: In the past several years, various computational methods have been developed to detect DNA-binding proteins. However, most of them do not integrate multiple sources of information. Methods: In this study, we propose a novel two-step computational method that predicts DNA-binding proteins using a Multiple Kernel Support Vector Machine (MK-SVM) and sequence information. First, we extract several features and construct multiple kernels. Then, the kernels are linearly combined by Multiple Kernel Learning (MKL). Finally, an SVM model is built on the combined kernel to predict DNA-binding proteins. Results: The proposed method is tested on two benchmark data sets. Its performance is comparable to that of existing methods, and better on some data sets. Conclusion: We conclude that MK-SVM is more suitable than a standard SVM as the classifier for DNA-binding protein identification.
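The kernel-combination step described in this abstract can be illustrated with a rough sketch: one Gram matrix is built per feature view, and the matrices are summed with non-negative weights. Everything concrete here is an assumption made for illustration: the RBF kernel, the two toy feature views, the helper names, and the fixed weights (MKL would learn the weights from data, which this sketch does not attempt).

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel between two feature vectors (kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gram_matrix(samples, kernel):
    """Pairwise kernel values for all samples in one feature view."""
    return [[kernel(a, b) for b in samples] for a in samples]

def combine_kernels(grams, weights):
    """Weighted sum of Gram matrices; non-negative weights keep the result a valid kernel."""
    if len(grams) != len(weights):
        raise ValueError("need exactly one weight per Gram matrix")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    n = len(grams[0])
    return [
        [sum(w * g[i][j] for g, w in zip(grams, weights)) for j in range(n)]
        for i in range(n)
    ]

# Two hypothetical feature views of the same three proteins
view_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
view_b = [[0.0], [1.0], [2.0]]

grams = [gram_matrix(view_a, rbf_kernel), gram_matrix(view_b, rbf_kernel)]
# Fixed weights stand in for the weights that MKL would learn
combined = combine_kernels(grams, weights=[0.4, 0.6])
```

A combined matrix of this kind could then be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.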

A Model Stacking Framework for Identifying DNA Binding Proteins by Orchestrating Multi-View Features and Classifiers

Genes ◽

10.3390/genes9080394 ◽

2018 ◽

Vol 9 (8) ◽

pp. 394 ◽

Cited By ~ 9

Author(s):

Xiu-Juan Liu ◽

Xiu-Jun Gong ◽

Hua Yu ◽

Jia-Hui Xu

Keyword(s):

DNA Binding ◽

Binding Proteins ◽

Structural Information ◽

DNA Binding Proteins ◽

Feature Representation ◽

Training Dataset ◽

Evolutionary Information ◽

Sequence Information ◽

Coupled Models ◽

Loosely Coupled

Various machine learning-based approaches using sequence information alone have been proposed for identifying DNA-binding proteins, which are crucial to many cellular processes such as DNA replication, DNA repair and DNA modification. In these methods, building a meaningful feature representation of the sequences and choosing an appropriate classifier are nontrivial tasks. Disclosing the significance and contribution of different feature spaces and classifiers to the final prediction is of the utmost importance, not only for prediction performance but also for providing practical clues for the design of biological experiments. In this study, we propose a model stacking framework that orchestrates multi-view features and classifiers (MSFBinder) to investigate how to integrate and evaluate loosely coupled models for predicting DNA-binding proteins. The framework integrates multi-view features including Local_DPP, 188D, Position-Specific Scoring Matrix (PSSM)_DWT and auto-cross-covariance of secondary structures (AC_Struc), which are extracted based on evolutionary information, sequence composition, physicochemical properties and predicted structural information, respectively. These features are fed into various loosely coupled classifiers such as SVM and random forest. A logistic regression model is then applied to evaluate the contributions of these individual classifiers and to make the final prediction. When evaluated on the training dataset PDB1075, the proposed method achieves an accuracy of 83.53%. On the independent dataset PDB186, it achieves an accuracy of 81.72%, which outperforms many existing methods. These results suggest that the framework can flexibly orchestrate various prediction models with good performance.
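The final stacking step described in this abstract, a logistic regression over the outputs of individual classifiers, can be sketched as follows. This is a minimal illustration and not the MSFBinder implementation: the base-classifier scores are hypothetical numbers rather than outputs of trained SVM or random forest models, and the logistic regression is fitted with plain batch gradient descent using assumed hyperparameters.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.5, epochs=5000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this sample under the current parameters
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for k, v in enumerate(x):
                grad_w[k] += err * v
            grad_b += err
        # Step along the negative mean gradient of the log-loss
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def predict_proba(x, weights, bias):
    """Probability of the positive (DNA-binding) class from stacked base scores."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

# Hypothetical base-classifier probabilities (columns: model A, model B)
base_scores = [[0.9, 0.8], [0.8, 0.6], [0.85, 0.9],
               [0.2, 0.3], [0.1, 0.4], [0.15, 0.2]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_logistic(base_scores, labels)
print(weights, bias)
```

The learned weights give a rough indication of how much each base classifier contributes to the final decision, which is the role the abstract assigns to the logistic regression layer.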

RF-SVM: Identification of DNA-binding proteins based on comprehensive feature representation methods and support vector machine

Proteins Structure Function and Bioinformatics ◽

10.1002/prot.26229 ◽

2021 ◽

Author(s):

Yanping Zhang ◽

Jianwei Ni ◽

Ya Gao

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Feature Representation ◽

Support Vector

SUPPORT VECTOR MACHINE CLASSIFICATION OF PHYSICAL AND BIOLOGICAL DATASETS

International Journal of Modern Physics C ◽

10.1142/s0129183103004759 ◽

2003 ◽

Vol 14 (05) ◽

pp. 575-585 ◽

Cited By ~ 39

Author(s):

CONG-ZHONG CAI ◽

WAN-LU WANG ◽

YU-ZONG CHEN

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Protein Interactions ◽

Binding Proteins ◽

Nearest Neighbor ◽

DNA Binding Proteins ◽

Support Vector ◽

Testing Accuracy ◽

Better Than

The support vector machine (SVM) is used in the classification of sonar signals and DNA-binding proteins. Our study on the classification of sonar signals shows that SVM produces a better result than other classification methods, which is consistent with the findings of other studies. The testing accuracy of classification is 95.19%, compared with 90.4% from a multilayered neural network and 82.7% from a nearest neighbor classifier. From our results on the classification of DNA-binding proteins, one finds that SVM gives a testing accuracy of 82.32%, which is slightly better than that obtained from an earlier study of SVM classification of protein–protein interactions. Hence, our study indicates the usefulness of SVM in the identification of DNA-binding proteins. Further improvements in the SVM algorithm and parameters are suggested.

grDNA-Prot: The Prediction of DNA-Binding Proteins Based on Physicochemical Properties of Amino Acids and Support Vector Machine

Hans Journal of Computational Biology ◽

10.12677/hjcb.2021.111001 ◽

2021 ◽

Vol 11 (01) ◽

pp. 1-11

Author(s):

Yanping Zhang

Keyword(s):

Amino Acids ◽

Support Vector Machine ◽

DNA Binding ◽

Physicochemical Properties ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Support Vector

Extracting Sequence Features to Predict DNA-Binding Proteins Using Support Vector Machine

2013 International Conference on Computational and Information Sciences ◽

10.1109/iccis.2013.48 ◽

2013 ◽

Cited By ~ 2

Author(s):

Xin Ma ◽

Lefu Hu

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Support Vector ◽

Sequence Features

Identifying DNA-binding proteins by combining support vector machine and PSSM distance transformation

BMC Systems Biology ◽

10.1186/1752-0509-9-s1-s10 ◽

2015 ◽

Vol 9 (Suppl 1) ◽

pp. S10 ◽

Cited By ~ 42

Author(s):

Ruifeng Xu ◽

Jiyun Zhou ◽

Hongpeng Wang ◽

Yulan He ◽

Xiaolong Wang ◽

...

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Support Vector ◽

Distance Transformation

Use Chou’s 5-Step Rule to Predict DNA-Binding Proteins with Evolutionary Information

BioMed Research International ◽

10.1155/2020/6984045 ◽

2020 ◽

Vol 2020 ◽

pp. 1-9

Author(s):

Weizhong Lu ◽

Zhengwei Song ◽

Yijie Ding ◽

Hongjie Wu ◽

Yan Cao ◽

...

Keyword(s):

Machine Learning ◽

DNA Binding ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Evolutionary Information ◽

Support Vector ◽

Machine Method ◽

Machine Learning Model ◽

Evolutionary Features ◽

One Machine

Knowledge of DNA-binding proteins helps in better understanding the functions of proteins in cellular biological processes. Research on predicting DNA-binding proteins can promote research on drug proteins and computer-aided drug design. In recent years, machine learning-based methods have commonly been used to predict such proteins. Although current methods achieve good predictive performance, further research is still needed to improve it. In this study, the prediction of DNA-binding proteins is studied from the perspective of evolutionary information and the support vector machine method. A machine learning model for predicting DNA-binding proteins based on evolutionary features is put forward, following Chou’s 5-step rule. The results show good predictive performance on the benchmark dataset PDB1075 and the independent dataset PDB186, with accuracies of 86.05% and 75.30%, respectively. Thus, the proposed method is comparable to other methods and may perform better in some cases.

newDNA-Prot: Prediction of DNA-binding proteins by employing support vector machine and a comprehensive sequence representation

Computational Biology and Chemistry ◽

10.1016/j.compbiolchem.2014.09.002 ◽

2014 ◽

Vol 52 ◽

pp. 51-59 ◽

Cited By ~ 14

Author(s):

Yanping Zhang ◽

Jun Xu ◽

Wei Zheng ◽

Chen Zhang ◽

Xingye Qiu ◽

...

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Support Vector ◽

Sequence Representation

gDNA-Prot: Predict DNA-binding proteins by employing support vector machine and a novel numerical characterization of protein sequence

Journal of Theoretical Biology ◽

10.1016/j.jtbi.2016.06.002 ◽

2016 ◽

Vol 406 ◽

pp. 8-16 ◽

Cited By ~ 2

Author(s):

Yan-ping Zhang ◽

Wuyunqiqige ◽

Wei Zheng ◽

Shuyi Liu ◽

Chunguang Zhao

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Protein Sequence ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Support Vector ◽

Numerical Characterization
