Protein Remote Homology Detection Based on an Ensemble Learning Approach

BioMed Research International ◽

10.1155/2016/5813645 ◽

2016 ◽

Vol 2016 ◽

pp. 1-11 ◽

Cited By ~ 3

Author(s):

Junjie Chen ◽

Bingquan Liu ◽

Dong Huang

Keyword(s):

State Of The Art ◽

Predictive Performance ◽

Ensemble Classifier ◽

Homology Detection ◽

Weighted Voting ◽

Remote Homology ◽

Sequence Composition ◽

Feature Spaces ◽

Voting Strategy ◽

Remote Homology Detection

Protein remote homology detection is one of the central problems in bioinformatics. Although several computational methods have been proposed, the problem is still far from solved. In this paper, an ensemble classifier for protein remote homology detection, called SVM-Ensemble, is proposed with a weighted voting strategy. SVM-Ensemble combines three basic classifiers based on different feature spaces: Kmer, ACC, and SC-PseAAC. These features capture the characteristics of proteins from different perspectives, incorporating both sequence composition and sequence-order information along the protein sequences. Experimental results on a widely used benchmark dataset show that SVM-Ensemble clearly improves predictive performance for protein remote homology detection. Moreover, it achieved the best performance, outperforming other state-of-the-art methods.
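The two ideas named in this abstract, a k-mer feature space and weighted voting over base classifiers, can be sketched roughly as follows. This is an illustration only: the function names, example scores, and weights are hypothetical, and the paper's actual SVM classifiers, ACC and SC-PseAAC features, and weighting scheme are not reproduced here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_composition(sequence, k=2):
    """Return normalized k-mer frequencies (a vector of length 20**k)."""
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows containing non-standard residues
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def weighted_vote(scores, weights):
    """Combine per-classifier decision scores into one weighted average."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Hypothetical decision scores from three base classifiers
# (standing in for the Kmer, ACC, and SC-PseAAC classifiers).
scores = [0.8, -0.2, 0.5]
weights = [0.5, 0.2, 0.3]  # assumed weights, not taken from the paper
combined = weighted_vote(scores, weights)
label = 1 if combined > 0 else -1  # positive score: predicted homolog
```

In this sketch a base classifier with a larger weight pulls the combined score further toward its own decision, which is the general intent of a weighted voting strategy.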


Protein Remote Homology Detection by Combining Pseudo Dimer Composition with an Ensemble Learning Method

Current Proteomics ◽

10.2174/157016461302160514002939 ◽

2016 ◽

Vol 13 (2) ◽

pp. 86-91 ◽

Cited By ~ 7

Author(s):

Bin Liu ◽

Junjie Chen ◽

Shanyi Wang

Keyword(s):

Ensemble Learning ◽

Learning Method ◽

Homology Detection ◽

Remote Homology ◽

Remote Homology Detection


A discriminative method for protein remote homology detection based on N-Gram

Genetics and Molecular Research ◽

10.4238/2015.january.15.9 ◽

2015 ◽

Vol 14 (1) ◽

pp. 69-78 ◽

Cited By ~ 2

Author(s):

S. Xie ◽

P. Li ◽

Y. Jiang ◽

Y. Zhao

Keyword(s):

Homology Detection ◽

Remote Homology ◽

Discriminative Method ◽

N Gram ◽

Remote Homology Detection


Using Chou’s Pseudo Amino Acid Composition for Protein Remote Homology Detection

Engineering ◽

10.4236/eng.2013.510b032 ◽

2013 ◽

Vol 05 (10) ◽

pp. 149-153 ◽

Cited By ~ 1

Author(s):

Bin Liu ◽

Xiaolong Wang

Keyword(s):

Amino Acid ◽

Amino Acid Composition ◽

Acid Composition ◽

Homology Detection ◽

Pseudo Amino Acid Composition ◽

Remote Homology ◽

Remote Homology Detection


SPSO: Synthetic Protein Sequence Oversampling for Imbalanced Protein Data and Remote Homology Detection

Biological and Medical Data Analysis - Lecture Notes in Computer Science ◽

10.1007/11946465_10 ◽

2006 ◽

pp. 104-115 ◽

Cited By ~ 2

Author(s):

Majid Beigi ◽

Andreas Zell

Keyword(s):

Protein Sequence ◽

Homology Detection ◽

Remote Homology ◽

Synthetic Protein ◽

Remote Homology Detection


ProtDet-CCH: Protein Remote Homology Detection by Combining Long Short-Term Memory and Ranking Methods

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2018.2789880 ◽

2019 ◽

Vol 16 (4) ◽

pp. 1203-1210 ◽

Cited By ~ 16

Author(s):

Bin Liu ◽

Shumin Li

Keyword(s):

Short Term Memory ◽

Homology Detection ◽

Ranking Methods ◽

Short Term ◽

Term Memory ◽

Remote Homology ◽

Long Short Term Memory ◽

Remote Homology Detection


Analysis of Sequence Divergence in Metabolic Proteins of Plasmodium falciparum: Implications for Remote Homology Detection

Frontiers in Protein and Peptide Sciences ◽

10.2174/9781608058624114010014 ◽

2014 ◽

pp. 226-272

Keyword(s):

Plasmodium Falciparum ◽

Sequence Divergence ◽

Homology Detection ◽

Remote Homology ◽

Remote Homology Detection


Remote homology detection using GA and NSGA-II on physicochemical properties

International Journal of Computer Applications in Technology ◽

10.1504/ijcat.2020.10034808 ◽

2020 ◽

Vol 64 (4) ◽

pp. 393

Author(s):

Mukti Routray ◽

Niranjan Kumar Ray

Keyword(s):

Physicochemical Properties ◽

Homology Detection ◽

NSGA-II ◽

Remote Homology ◽

Remote Homology Detection


Influence of Genomic and Other Biological Data Sets in the Understanding of Protein Structures, Functions and Interactions

International Journal of Knowledge Discovery in Bioinformatics ◽

10.4018/jkdb.2011010102 ◽

2011 ◽

Vol 2 (1) ◽

pp. 24-44

Author(s):

N. Srinivasan ◽

G. Agarwal ◽

R. M. Bhaskara ◽

R. Gadkari ◽

O. Krishnadev ◽

...

Keyword(s):

Protein Structures ◽

Biological Properties ◽

Biological Data ◽

Biological Information ◽

Biological Databases ◽

Homology Detection ◽

Putative Gene ◽

Remote Homology ◽

Strategic Integration ◽

Remote Homology Detection

In the post-genomic era, biological databases are growing at a tremendous rate. Despite the rapid accumulation of biological information, the functions and other biological properties of many putative gene products of various organisms remain either unknown or obscure. This paper examines how strategic integration of large biological databases and combinations of various kinds of biological information help address some of the fundamental questions on protein structure, function and interactions. New developments in function recognition by remote homology detection and strategic use of sequence databases aid recognition of the functions of newly discovered proteins. Knowledge of 3-D structures and combined use of sequences and 3-D structures of homologous protein domains greatly expand the ability of remote homology detection. The authors also demonstrate how combined consideration of the functions of individual domains of multi-domain proteins helps in recognizing gross biological attributes. This paper also discusses a few cases in which combining disparate biological datasets or disparate biological information yields new insights about protein-protein interactions across a host and a pathogen. Finally, the authors discuss how combinations of low-resolution structural data, obtained using cryoEM studies of gigantic multi-component assemblies, and atomic-level 3-D structures of the components are effective in inferring finer features of the assembly.


Discriminative Remote Homology Detection Using Maximal Unique Sequence Matches

Lecture Notes in Computer Science - Advances in Intelligent Data Analysis VI ◽

10.1007/11552253_26 ◽

2005 ◽

pp. 283-292 ◽

Cited By ~ 1

Author(s):

Hasan Oğul ◽

Ü. Erkan Mumcuoğlu

Keyword(s):

Unique Sequence ◽

Homology Detection ◽

Remote Homology ◽

Remote Homology Detection


Partial Classifier Chains with Feature Selection by Exploiting Label Correlation in Multi-Label Classification

Entropy ◽

10.3390/e22101143 ◽

2020 ◽

Vol 22 (10) ◽

pp. 1143

Author(s):

Zhenwu Wang ◽

Tielin Wang ◽

Benting Wan ◽

Mengjie Han

Keyword(s):

Feature Selection ◽

State Of The Art ◽

Predictive Performance ◽

Chain Structure ◽

Classification Performance ◽

Learning Problem ◽

Feature Spaces ◽

Label Correlations ◽

Classifier Chains ◽

Label Correlation

Multi-label classification (MLC) is a supervised learning problem where an object is naturally associated with multiple concepts because it can be described from various dimensions. How to exploit the resulting label correlations is the key issue in MLC problems. The classifier chain (CC) is a well-known MLC approach that can learn complex coupling relationships between labels. CC suffers from two obvious drawbacks: (1) label ordering is decided at random although it usually has a strong effect on predictive performance; (2) all the labels are inserted into the chain, although some of them may carry irrelevant information that discriminates against the others. In this work, we propose a partial classifier chain method with feature selection (PCC-FS) that exploits the label correlation between label and feature spaces and thus solves the two previously mentioned problems simultaneously. In the PCC-FS algorithm, feature selection is performed by learning the covariance between the feature set and the label set, thus eliminating the irrelevant features that can diminish classification performance. Couplings in the label set are extracted, and the coupled labels of each label are inserted simultaneously into the chain structure to execute the training and prediction activities. The experimental results on five metrics demonstrate that, in comparison to eight state-of-the-art MLC algorithms, the proposed method significantly improves on existing multi-label classification methods.
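For readers unfamiliar with the baseline this abstract builds on, the plain classifier chain (CC) idea can be sketched roughly as below. This is not PCC-FS: there is no feature selection or partial chain, the label order is fixed by the caller, and the `NearestCentroid` base learner, class names, and toy data are all hypothetical choices made only to keep the example self-contained.

```python
class NearestCentroid:
    """Tiny base learner: predict the class whose mean vector is closest."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))
        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


class ClassifierChain:
    """Plain classifier chain: each model also sees the earlier labels."""

    def __init__(self, order):
        self.order = order  # label indices in chain order
        self.models = []

    def fit(self, X, Y):
        self.models = []
        augmented = [list(x) for x in X]
        for j in self.order:
            target = [row[j] for row in Y]
            self.models.append(NearestCentroid().fit(augmented, target))
            # Append the true label as an extra feature for later links.
            augmented = [x + [t] for x, t in zip(augmented, target)]
        return self

    def predict_one(self, x):
        features = list(x)
        predicted = {}
        for j, model in zip(self.order, self.models):
            predicted[j] = model.predict_one(features)
            features.append(predicted[j])  # feed the prediction down the chain
        return [predicted[j] for j in sorted(predicted)]


# Toy data (assumed): two features, two binary labels.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]
chain = ClassifierChain(order=[0, 1]).fit(X, Y)
prediction = chain.predict_one([0.1, 0.9])
```

The `order` argument makes the first drawback named in the abstract visible: a different label ordering changes which earlier predictions each later model can use.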
