A novel graphical representation and similarity analysis of protein sequences based on physicochemical properties

2018 ◽  
Vol 510 ◽  
pp. 477-485 ◽  
Author(s):  
Mehri Mahmoodi-Reihani ◽  
Fatemeh Abbasitabar ◽  
Vahid Zare-Shahabadi
2010 ◽  
Vol 31 (11) ◽  
pp. 2136-2142 ◽  
Author(s):  
Ping-An He ◽  
Yan-Ping Zhang ◽  
Yu-Hua Yao ◽  
Yi-Fa Tang ◽  
Xu-Ying Nan

Author(s):  
Jiahe Huang ◽  
Qi Dai ◽  
Yuhua Yao ◽  
Ping-An He

Aim and Objective: The similarities comparison of biological sequences is the important task in bioinformatics. The methods of the similarities comparison for biological sequences are divided into two classes: sequence alignment method and alignment-free method. The graphical representation of biological sequences is a kind of alignment-free methods, which constitutes a tool for analyzing and visualizing the biological sequences. In this article, a generalized iterative map of protein sequences was suggested to analyze the similarities of biological sequences. Materials and Methods: Based on the normalized physicochemical indexes of 20 amino acids, each amino acid can be mapped into a point in 5D space. A generalized iterative function system was introduced to outline a generalized iterative map of protein sequences, which can not only reflect various physicochemical properties of amino acids but also incorporate with different compression ratios of component of generalized iterative map. Several properties were proved to illustrate the advantage of generalized iterative map. The mathematical description of generalized iterative map was suggested to compare the similarities and dissimilarities of protein sequences. Based on this method, similarities/dissimilarities were compared among ND5 proteins sequences, as well as ND6 protein sequences of ten different species. Results: By correlation analysis, the ClustalW results were compared with our similarity/dissimilarity results and other graphical representation results to show the utility of our approach. The comparison results show that our approach has better correlations with ClustalW for all species than other approaches and illustrate the effectiveness of our approach. Conclusion: Two examples show that our method not only has good performances and effects in the similarity/dissimilarity analysis of protein sequences but also does not require complex computation.


2020 ◽  
Vol 15 (2) ◽  
pp. 121-134 ◽  
Author(s):  
Eunmi Kwon ◽  
Myeongji Cho ◽  
Hayeon Kim ◽  
Hyeon S. Son

Background: The host tropism determinants of influenza virus, which cause changes in the host range and increase the likelihood of interaction with specific hosts, are critical for understanding the infection and propagation of the virus in diverse host species. Methods: Six types of protein sequences of influenza viral strains isolated from three classes of hosts (avian, human, and swine) were obtained. Random forest, naïve Bayes classification, and knearest neighbor algorithms were used for host classification. The Java language was used for sequence analysis programming and identifying host-specific position markers. Results: A machine learning technique was explored to derive the physicochemical properties of amino acids used in host classification and prediction. HA protein was found to play the most important role in determining host tropism of the influenza virus, and the random forest method yielded the highest accuracy in host prediction. Conserved amino acids that exhibited host-specific differences were also selected and verified, and they were found to be useful position markers for host classification. Finally, ANOVA analysis and post-hoc testing revealed that the physicochemical properties of amino acids, comprising protein sequences combined with position markers, differed significantly among hosts. Conclusion: The host tropism determinants and position markers described in this study can be used in related research to classify, identify, and predict the hosts of influenza viruses that are currently susceptible or likely to be infected in the future.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Ajay Kumar Saw ◽  
Binod Chandra Tripathy ◽  
Soumyadeep Nandi

Sign in / Sign up

Export Citation Format

Share Document