Prediction of protein-protein interactions from amino acid sequences using extreme learning machine combined with auto covariance descriptor

Abstract

Background: Protein–protein interactions (PPIs) are involved in many cellular processes and play a key role inside cells. Predicting PPIs is an important step towards understanding many bioinformatics functions and applications, such as predicting protein functions, gene-disease associations and disease-drug associations. Because high-throughput methods are expensive and time-consuming, developing efficient and accurate computational methods for predicting PPIs remains a challenging task.

Results: In this study, a novel computational approach named WELM-SURF was developed to predict PPIs. The method uses a Position-Specific Scoring Matrix (PSSM) to capture protein evolutionary information and Speeded-Up Robust Features (SURF) to extract key features from the PSSM of each protein sequence. A Weighted Extreme Learning Machine (WELM), which trains quickly and classifies efficiently by optimizing the loss function of the weight matrix, was used as the classifier. In cross-validation, WELM-SURF obtained average accuracies of 97.36% and 95.12% on the yeast and human datasets, respectively. Its prediction ability was also compared with that of ELM-SURF, SVM-SURF and other existing approaches, and the comparison further indicates that WELM-SURF outperforms these methods.

Conclusion: The experimental results show that WELM-SURF is useful for predicting PPIs and can also be applied to other bioinformatics studies of proteins.
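The abstract does not spell out how the weighted extreme learning machine is trained. As a rough illustration only, the sketch below assumes the commonly used formulation: a random, untrained hidden layer, inverse-class-frequency sample weights, and a closed-form regularized least-squares solution for the output weights. The names `train_welm`, `predict_welm`, `n_hidden` and `C` are hypothetical and not taken from the paper, and the PSSM and SURF feature extraction steps are not covered.

```python
import numpy as np

def _hidden(X, W_in, b):
    # Sigmoid activations of the random hidden layer.
    return 1.0 / (1.0 + np.exp(-(X @ W_in + b)))

def train_welm(X, y, n_hidden=50, C=1.0, seed=0):
    """Train a weighted ELM on features X and integer labels y in {0, 1}.

    Assumed formulation: beta = (I/C + H^T W H)^-1 H^T W t,
    where W is a diagonal matrix of per-sample weights.
    """
    rng = np.random.default_rng(seed)
    # Input weights and biases are drawn at random and never updated.
    W_in = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = _hidden(X, W_in, b)

    # Assumed weighting scheme: each sample is weighted by 1 / (size of its class),
    # so the minority class is not swamped by the majority class.
    counts = np.bincount(y, minlength=2)
    w = 1.0 / counts[y]

    # Targets coded as -1 / +1.
    t = 2.0 * y - 1.0

    # Closed-form weighted ridge solution for the output weights.
    HtW = H.T * w  # scales column i of H^T by the weight of sample i
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ t)
    return W_in, b, beta

def predict_welm(model, X):
    W_in, b, beta = model
    return (_hidden(X, W_in, b) @ beta > 0).astype(int)
```

Because only the output weights are solved for, training reduces to one linear solve, which is consistent with the short training time the abstract attributes to WELM.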

Detection of Protein-Protein Interactions from Amino Acid Sequences Using a Rotation Forest Model with a Novel PR-LPQ Descriptor

Lecture Notes in Computer Science - Advanced Intelligent Computing Theories and Applications ◽

10.1007/978-3-319-22053-6_75 ◽

2015 ◽

pp. 713-720 ◽

Cited By ~ 13

Author(s):

Leon Wong ◽

Zhu-Hong You ◽

Shuai Li ◽

Yu-An Huang ◽

Gang Liu

Keyword(s):

Amino Acid ◽

Protein Interactions ◽

Amino Acid Sequences ◽

Protein Protein Interactions ◽

Rotation Forest ◽

Forest Model

Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis

BMC Bioinformatics ◽

10.1186/1471-2105-14-s8-s10 ◽

2013 ◽

Vol 14 (S8) ◽

Cited By ~ 140

Author(s):

Zhu-Hong You ◽

Ying-Ke Lei ◽

Lin Zhu ◽

Junfeng Xia ◽

Bing Wang

Keyword(s):

Principal Component Analysis ◽

Amino Acid ◽

Protein Interactions ◽

Principal Component ◽

Component Analysis ◽

Amino Acid Sequences ◽

Extreme Learning Machines ◽

Protein Protein Interactions ◽

Learning Machines

Protein–Protein Interactions Prediction via Multimodal Deep Polynomial Network and Regularized Extreme Learning Machine

IEEE Journal of Biomedical and Health Informatics ◽

10.1109/jbhi.2018.2845866 ◽

2019 ◽

Vol 23 (3) ◽

pp. 1290-1303 ◽

Cited By ~ 9

Author(s):

Haijun Lei ◽

Yuting Wen ◽

Zhuhong You ◽

Ahmed Elazab ◽

Ee-Leng Tan ◽

...

Keyword(s):

Extreme Learning Machine ◽

Protein Interactions ◽

Protein Protein Interactions ◽

Learning Machine

Using Weighted Extreme Learning Machine Combined with Scale-invariant Feature Transform to Predict Protein-Protein Interactions from Protein Evolutionary Information

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2020.2965919 ◽

2020 ◽

pp. 1-1 ◽

Cited By ~ 1

Author(s):

Jianqiang Li ◽

Xiaofeng Shi ◽

Zhu-Hong You ◽

Hai-Cheng Yi ◽

Zhuangzhuang Chen ◽

...

Keyword(s):

Extreme Learning Machine ◽

Protein Interactions ◽

Evolutionary Information ◽

Protein Protein Interactions ◽

Scale Invariant ◽

Invariant Feature ◽

Weighted Extreme Learning Machine ◽

Feature Transform ◽

Learning Machine ◽

Scale Invariant Feature

Using Weighted Extreme Learning Machine Combined with Scale-Invariant Feature Transform to Predict Protein-Protein Interactions from Protein Evolutionary Information

Intelligent Computing Theories and Application - Lecture Notes in Computer Science ◽

10.1007/978-3-319-95930-6_49 ◽

2018 ◽

pp. 527-532 ◽

Cited By ~ 3

Author(s):

Jianqiang Li ◽

Xiaofeng Shi ◽

Zhuhong You ◽

Zhuangzhuang Chen ◽

Qiuzhen Lin ◽

...

Keyword(s):

Extreme Learning Machine ◽

Protein Interactions ◽

Evolutionary Information ◽

Protein Protein Interactions ◽

Scale Invariant ◽

Invariant Feature ◽

Weighted Extreme Learning Machine ◽

Feature Transform ◽

Learning Machine ◽

Scale Invariant Feature

Using Deep Neural Networks to Improve the Performance of Protein–Protein Interactions Prediction

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001420520126 ◽

2020 ◽

Vol 34 (13) ◽

pp. 2052012 ◽

Cited By ~ 1

Author(s):

Yuan-Miao Gui ◽

Ru-Jing Wang ◽

Xue Wang ◽

Yuan-Yuan Wei

Keyword(s):

Neural Networks ◽

Protein Interactions ◽

Deep Neural Networks ◽

Molecular Mechanisms ◽

Extraction Methods ◽

Amino Acid Sequences ◽

Prediction Methods ◽

Protein Protein Interactions ◽

Local Descriptor ◽

Auto Covariance

Protein–protein interactions (PPIs) help to elucidate the molecular mechanisms of life activities and play a role in advancing disease treatment and new drug development. With the advent of the proteomics era, a number of PPI prediction methods have emerged, but their performance still needs to be improved. To that end, we used dropout to reduce over-fitting in deep neural networks (DNNs) and combined it with three feature extraction methods, conjoint triad (CT), auto covariance (AC) and local descriptor (LD), to build DNN models based on amino acid sequences. The accuracy of the CT, AC and LD models increased from 97.11% to 98.12%, 96.84% to 98.17%, and 95.30% to 95.60%, respectively. Their loss values decreased from 27.47% to 14.96%, 65.91% to 17.82% and 36.23% to 15.34%, respectively. These experimental results show that dropout can improve the performance of the DNN models, and they can provide a resource for future studies on PPI prediction. The experimental code is available at https://github.com/smalltalkman/hppi-tensorflow.
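The abstract names auto covariance (AC) as one of the feature extraction methods but does not define it. The sketch below assumes one common definition: each residue is first mapped to a row of numeric physicochemical property values, and for each lag the descriptor is the average product of mean-centered values that many positions apart. The function name `auto_covariance`, its arguments and the normalization by `L - lag` are assumptions for illustration, not details from the paper or its repository.

```python
import numpy as np

def auto_covariance(props, max_lag):
    """Compute an auto covariance descriptor.

    props   : array of shape (L, P), one row per residue and one column
              per numeric property (the encoding itself is assumed given).
    max_lag : largest distance between residue pairs to consider.

    Returns a vector of length max_lag * P, ordered lag by lag.
    """
    props = np.asarray(props, dtype=float)
    L, P = props.shape
    if max_lag >= L:
        raise ValueError("max_lag must be smaller than the sequence length")

    # Center each property column on its mean over the sequence.
    centered = props - props.mean(axis=0)

    feats = []
    for lag in range(1, max_lag + 1):
        # Average product of centered values `lag` positions apart, per property.
        ac = (centered[:-lag] * centered[lag:]).sum(axis=0) / (L - lag)
        feats.append(ac)
    return np.concatenate(feats)
```

The output length depends only on `max_lag` and the number of properties, so sequences of different lengths map to vectors of the same size, which is what allows them to be fed to a fixed-input model such as a DNN.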

Maximum margin classifier working in a set of strings

Proceedings of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rspa.2015.0551 ◽

2016 ◽

Vol 472 (2187) ◽

pp. 20150551 ◽

Cited By ~ 1

Author(s):

Hitoshi Koyano ◽

Morihiro Hayashida ◽

Tatsuya Akutsu

Keyword(s):

Probability Theory ◽

Protein Interactions ◽

Consensus Sequence ◽

Classification Problem ◽

Amino Acid Sequences ◽

Support Vector ◽

Generalization Error ◽

Protein Protein Interactions ◽

String Kernels ◽

Learning Machine

Numbers and numerical vectors account for a large portion of data. However, recently, the amount of string data generated has increased dramatically. Consequently, classifying string data is a common problem in many fields. The most widely used approach to this problem is to convert strings into numerical vectors using string kernels and subsequently apply a support vector machine that works in a numerical vector space. However, this non-one-to-one conversion involves a loss of information and makes it impossible to evaluate, using probability theory, the generalization error of a learning machine, considering that the given data to train and test the machine are strings generated according to probability laws. In this study, we approach this classification problem by constructing a classifier that works in a set of strings. To evaluate the generalization error of such a classifier theoretically, probability theory for strings is required. Therefore, we first extend a limit theorem for a consensus sequence of strings demonstrated by one of the authors and co-workers in a previous study. Using the obtained result, we then demonstrate that our learning machine classifies strings in an asymptotically optimal manner. Furthermore, we demonstrate the usefulness of our machine in practical data analysis by applying it to predicting protein–protein interactions using amino acid sequences and classifying RNAs by the secondary structure using nucleotide sequences.
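The abstract does not name a particular string kernel. As an illustration of the conventional approach it argues against, the sketch below uses the k-mer spectrum kernel, one well-known string kernel, chosen here as an assumption. It maps each string to a vector of substring counts and takes the inner product. The function names `kmer_counts` and `spectrum_kernel` are hypothetical.

```python
from collections import Counter

def kmer_counts(s, k):
    """Count every length-k substring of s (its k-mer spectrum)."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

def spectrum_kernel(a, b, k=2):
    """Inner product of the k-mer count vectors of strings a and b."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Only k-mers present in both strings contribute to the sum.
    return sum(n * cb[kmer] for kmer, n in ca.items())
```

Under this mapping, the distinct strings "AABA" and "ABAA" have identical 2-mer counts, which is a small instance of the non-one-to-one conversion and loss of information that the abstract describes.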
