Ens-PPI: A Novel Ensemble Classifier for Predicting the Interactions of Proteins Using Autocovariance Transformation from PSSM

2016 ◽  
Vol 2016 ◽  
pp. 1-8 ◽  
Author(s):  
Zhen-Guo Gao ◽  
Lei Wang ◽  
Shi-Xiong Xia ◽  
Zhu-Hong You ◽  
Xin Yan ◽  
...  

Protein-Protein Interactions (PPIs) play vital roles in most biological activities. Although the development of high-throughput biological technologies has generated considerable PPI data for various organisms, many problems are still far from being solved. A number of computational methods based on machine learning have been developed to facilitate the identification of novel PPIs. In this study, a novel predictor was designed using the Rotation Forest (RF) algorithm combined with Autocovariance (AC) features extracted from the Position-Specific Scoring Matrix (PSSM). More specifically, the PSSMs are generated from protein amino acid sequences. Then, an effective sequence-based feature representation, Autocovariance, is employed to extract features from the PSSMs. Finally, the RF model is used as a classifier to distinguish between interacting and noninteracting protein pairs. The proposed method achieves promising prediction performance on the PPIs of the Yeast, H. pylori, and independent datasets. The good results show that the proposed model is suitable for PPI prediction and could also provide a useful supplementary tool for solving other bioinformatics problems.
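The autocovariance step described above can be illustrated with a minimal sketch. It assumes one commonly used definition of AC, in which each PSSM column is centered on its mean and the products of scores separated by a given lag are averaged along the sequence. The abstract does not state the exact normalization or the lag range the authors used, so the function name `autocovariance_features` and the parameter `max_lag` are hypothetical choices made for illustration only.

```python
def autocovariance_features(pssm, max_lag):
    """Flatten the lag-wise autocovariances of each PSSM column into one vector.

    pssm:    list of rows, one per residue; each row holds one score per
             amino-acid type (typically 20 columns).
    max_lag: largest sequence separation to consider (assumed parameter).
    """
    length = len(pssm)
    if length <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    n_cols = len(pssm[0])
    # Mean score of each column over the whole sequence.
    means = [sum(row[j] for row in pssm) / length for j in range(n_cols)]
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            # Average product of centered scores that are `lag` residues apart.
            total = sum(
                (pssm[i][j] - means[j]) * (pssm[i + lag][j] - means[j])
                for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features


# Toy 4-residue, 2-column matrix (a real PSSM has 20 columns).
toy_pssm = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
toy_features = autocovariance_features(toy_pssm, max_lag=2)
```

Under this definition the output length is `max_lag` times the number of columns and does not depend on sequence length, which is what allows proteins of different lengths to be passed to a single classifier.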

Author(s):  
Fatma-Elzahraa Eid ◽  
Haitham Elmarakeby ◽  
Yujia Alina Chan ◽  
Nadine Fornelos Martins ◽  
Mahmoud ElHefnawi ◽  
...  

Representational biases that are common in biological data can inflate prediction performance and confound our understanding of how and what machine learning (ML) models learn from large complicated datasets. However, auditing for these biases is not a common practice in ML in the life sciences. Here, we devise a systematic auditing framework and harness it to audit three different ML applications of significant therapeutic interest: prediction frameworks of protein-protein interactions, drug-target bioactivity, and MHC-peptide binding. Through this, we identify unrecognized biases that hinder the ML process and result in low model generalizability. Ultimately, we show that, when there is insufficient signal in the training data, ML models are likely to learn primarily from representational biases.


2021 ◽  
Vol 17 ◽  
pp. 117693432110500
Author(s):  
Jie Pan ◽  
Li-Ping Li ◽  
Chang-Qing Yu ◽  
Zhu-Hong You ◽  
Yong-Jian Guan ◽  
...  

Protein-protein interactions (PPIs) in plants are essential for understanding the regulation of biological processes. Although high-throughput technologies have been widely used to identify PPIs, they are usually laborious, expensive, and suffer from high false-positive rates. Therefore, it is imperative to develop novel computational approaches as a supplementary tool to detect PPIs in plants. In this work, we present a method, namely DST-RoF, to identify PPIs in plants by combining an ensemble learning classifier, Rotation Forest (RoF), with the discrete sine transformation (DST). Specifically, each plant protein sequence is first converted into a Position-Specific Scoring Matrix (PSSM). Then, the discrete sine transformation is employed to extract effective features that capture the evolutionary information of proteins. Finally, these features are fed into the RoF classifier for training and prediction. When applied to the plant datasets Arabidopsis, Rice, and Maize, DST-RoF yielded high prediction accuracies of 82.95%, 88.82%, and 93.70%, respectively. To further evaluate the prediction ability of our approach, we compared it with 4 state-of-the-art classifiers and 3 different feature extraction methods. Comprehensive experimental results indicated that our method is feasible and robust for predicting potential interacting protein pairs in plants.
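As a rough illustration of how a discrete sine transform could turn a PSSM into a fixed-length feature vector, the sketch below applies an unnormalized type-I DST to each PSSM column and keeps the first few coefficients. The abstract does not say which DST variant, normalization, or coefficient selection DST-RoF uses, so the choice of type I, the zero-padding, and the names `dst_type1`, `dst_features`, and `n_coeffs` are all assumptions.

```python
import math


def dst_type1(signal):
    """Unnormalized type-I discrete sine transform of a real sequence."""
    n = len(signal)
    return [
        sum(
            x * math.sin(math.pi * (i + 1) * (k + 1) / (n + 1))
            for i, x in enumerate(signal)
        )
        for k in range(n)
    ]


def dst_features(pssm, n_coeffs):
    """Keep the first n_coeffs DST coefficients of every PSSM column.

    pssm:     list of rows, one per residue, one score per amino-acid type.
    n_coeffs: number of leading coefficients kept per column (assumed).
    """
    n_cols = len(pssm[0])
    features = []
    for j in range(n_cols):
        column = [row[j] for row in pssm]
        coeffs = dst_type1(column)[:n_coeffs]
        # Zero-pad so that short sequences still give a fixed-length vector.
        coeffs += [0.0] * (n_coeffs - len(coeffs))
        features.extend(coeffs)
    return features


# Toy 2-residue, 2-column matrix (a real PSSM has 20 columns).
toy_features = dst_features([[1.0, 0.0], [1.0, 0.0]], n_coeffs=3)
```

Keeping only the leading coefficients is one simple way to obtain vectors of equal length from proteins of different lengths; other selection schemes are equally plausible.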


2019 ◽  
Vol 20 (3) ◽  
pp. 177-184 ◽  
Author(s):  
Nantao Zheng ◽  
Kairou Wang ◽  
Weihua Zhan ◽  
Lei Deng

Background: Targeting critical viral-host Protein-Protein Interactions (PPIs) has enormous application prospects for therapeutics. Using experimental methods to evaluate all possible virus-host PPIs is labor-intensive and time-consuming. Recent growth in computational identification of virus-host PPIs provides new opportunities for gaining biological insights, including applications in disease control. We provide an overview of recent computational approaches for studying virus-host PPIs. Methods: In this review, a variety of computational methods for virus-host PPI prediction have been surveyed. These methods are categorized based on the features they utilize and the machine learning algorithms they use, including classical and novel methods. Results: We describe the pivotal and representative features extracted from relevant sources of biological data, mainly including sequence signatures, known domain interactions, protein motifs, and protein structure information. We focus on state-of-the-art machine learning algorithms that are used to build binary prediction models for the classification of virus-host protein pairs and discuss their abilities, weaknesses, and future directions. Conclusion: The findings of this review confirm the importance of computational methods for finding potential protein-protein interactions between virus and host. Although there has been significant progress in the prediction of virus-host PPIs in recent years, there is still a lot of room for improvement.


2006 ◽  
Vol 173 (4) ◽  
pp. 533-544 ◽  
Author(s):  
Chad D. Knights ◽  
Jason Catania ◽  
Simone Di Giovanni ◽  
Selen Muratoglu ◽  
Ricardo Perez ◽  
...  

The activity of the p53 gene product is regulated by a plethora of posttranslational modifications. An open question is whether such posttranslational changes act redundantly or dependently upon one another. We show that a functional interference between specific acetylated and phosphorylated residues of p53 influences cell fate. Acetylation of lysine 320 (K320) prevents phosphorylation of crucial serines in the NH2-terminal region of p53; only allows activation of genes containing high-affinity p53 binding sites, such as p21/WAF; and promotes cell survival after DNA damage. In contrast, acetylation of K373 leads to hyperphosphorylation of p53 NH2-terminal residues and enhances the interaction with promoters for which p53 possesses low DNA binding affinity, such as those contained in proapoptotic genes, leading to cell death. Further, acetylation of each of these two lysine clusters differentially regulates the interaction of p53 with coactivators and corepressors and produces distinct gene-expression profiles. By analogy with the “histone code” hypothesis, we propose that the multiple biological activities of p53 are orchestrated and deciphered by different “p53 cassettes,” each containing combination patterns of posttranslational modifications and protein–protein interactions.


2017 ◽  
Vol 2017 ◽  
pp. 1-11 ◽  
Author(s):  
Minghui Wang ◽  
Tao Wang ◽  
Binghua Wang ◽  
Yu Liu ◽  
Ao Li

Protein phosphorylation is catalyzed by kinases, which regulate many aspects that control death, movement, and cell growth. Identification of the phosphorylation site-specific kinase-substrate relationships (ssKSRs) is important for understanding cellular dynamics and provides a fundamental basis for further disease-related research and drug design. Although several computational methods have been developed, most of them mainly use the local sequence of phosphorylation sites and protein-protein interactions (PPIs) to construct the prediction model. Because phosphorylation is a very complicated process and is usually involved in various biological mechanisms, this information is not sufficient for accurate prediction. In this study, we propose a new and powerful computational approach named KSRPred for ssKSR prediction, which introduces novel phosphorylation site-kinase network (pSKN) profiles that can efficiently incorporate the relationships between various protein kinases and phosphorylation sites. The experimental results show that the pSKN profiles can efficiently improve the prediction performance in combination with local sequence and PPI information. Furthermore, we compare our method with existing ssKSR prediction tools, and the results demonstrate that KSRPred significantly improves prediction performance over them.


2016 ◽  
Vol 14 (03) ◽  
pp. 1650011 ◽  
Author(s):  
Wajid Arshad Abbasi ◽  
Fayyaz Ul Amir Afsar Minhas

The study of interactions between host and pathogen proteins is important for understanding the underlying mechanisms of infectious diseases and for developing novel therapeutic solutions. Wet-lab techniques for detecting protein–protein interactions (PPIs) can benefit from computational predictions. Machine learning is one of the computational approaches that can assist biologists by predicting promising PPIs. A number of machine-learning-based methods for predicting host–pathogen interactions (HPI) have been proposed in the literature. The techniques used for assessing the accuracy of such predictors are of critical importance in this domain. In this paper, we question the effectiveness of K-fold cross-validation for estimating the generalization ability of HPI prediction for proteins with no known interactions. K-fold cross-validation does not model this scenario, and we demonstrate a sizable difference between its performance and the performance of an alternative evaluation scheme called leave one pathogen protein out (LOPO) cross-validation. LOPO is more effective in modeling the real-world use of HPI predictors, specifically for cases in which no information about the interacting partners of a pathogen protein is available during training. We also point out that currently used metrics such as areas under the precision-recall or receiver operating characteristic curves are not intuitive to biologists and propose simpler and more directly interpretable metrics for this purpose.
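The LOPO scheme described above can be sketched as a grouped split in which every fold holds out all pairs that involve one pathogen protein. This is a minimal illustration, not the authors' code: the function name `lopo_splits`, the `(host, pathogen)` tuple layout, and the toy identifiers are assumptions.

```python
from collections import defaultdict


def lopo_splits(pairs):
    """Yield (held_out_pathogen, train_idx, test_idx) for LOPO cross-validation.

    pairs: list of (host_protein, pathogen_protein) tuples; labels and
           features are assumed to be stored elsewhere in the same order.
    """
    by_pathogen = defaultdict(list)
    for idx, (_host, pathogen) in enumerate(pairs):
        by_pathogen[pathogen].append(idx)
    for pathogen, test_idx in by_pathogen.items():
        held_out = set(test_idx)
        # Every pair involving the held-out pathogen protein is excluded
        # from training, so the model never sees its interaction partners.
        train_idx = [i for i in range(len(pairs)) if i not in held_out]
        yield pathogen, train_idx, test_idx


# Hypothetical identifiers, for illustration only.
example_pairs = [("H1", "P1"), ("H2", "P1"), ("H1", "P2"), ("H3", "P3")]
example_folds = list(lopo_splits(example_pairs))
```

In plain K-fold cross-validation, pairs that share a pathogen protein can fall on both sides of a split. The grouped split rules that out by construction. With scikit-learn, `LeaveOneGroupOut` with the pathogen protein as the group label produces the same partition.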


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Anchit Bijalwan

Botnet forensic analysis helps in understanding the nature of attacks and the modus operandi used by the attackers. Botnet attacks are difficult to trace because of their rapid pace, epidemic nature, and smaller size. Machine learning works as a panacea for botnet attack related issues. It not only facilitates detection but also helps in preventing bot attacks. The proposed inquisition model aims to improve the quality of results through comprehensive botnet detection and forensic analysis. This approach has been applied with eight different combinations of ensemble classifier techniques to detect botnet evidence. The study also compares the ensemble-based classifiers with a single classifier using different parameters. The results show that the proposed model can improve accuracy over a single classifier.

