SHOOT: phylogenetic gene search and ortholog inference

LegumeDB: Development of Legume Medicinal Plant Database and Comparative Molecular Evolutionary Analysis of matK Proteins of Legumes and Mangroves

Current Nutrition & Food Science ◽

10.2174/1573401314666180223143523 ◽

2019 ◽

Vol 15 (4) ◽

pp. 353-362

Author(s):

Sambhaji B. Thakar ◽

Maruti J. Dhanavade ◽

Kailas D. Sonawane

Keyword(s):

Phylogenetic Analysis ◽

Medicinal Plants ◽

Homology Modeling ◽

Sequence Alignment ◽

Vigna Unguiculata ◽

Multiple Sequence Alignment ◽

Legume Species ◽

Mangrove Species ◽

Multiple Sequence ◽

Thespesia Populnea

Background: Legume plants are known for their rich medicinal and nutritional value. A large amount of medicinal information on various legume plants is dispersed across the literature as text. Objective: To design and construct a legume medicinal plants database that integrates the respective classes of legumes and includes knowledge of their medicinal applications along with their protein/enzyme sequences. Methods: The Legume Medicinal Plants Database (LegumeDB) was designed and developed using Microsoft SQL (Structured Query Language) Server 2017. The DBMS served as the back end, ASP.Net was used to lay out front-end operations, and VB.Net was used as the programming language. Multiple sequence alignment, phylogenetic analysis and homology modeling techniques were also used. Results: The database includes information on 50 legume medicinal species, which may help researchers explore this information. Further, maturase K (matK) protein sequences of legumes and mangroves were retrieved from NCBI for multiple sequence alignment and phylogenetic analysis to understand the evolutionary lineage between legumes and mangroves. Homology modeling was used to model the three-dimensional structure of matK from a legume species, Vigna unguiculata, using matK of the mangrove species Thespesia populnea as a template. The matK sequence analysis identified residues conserved among legume and mangrove species. Conclusion: Phylogenetic analysis revealed that the legume species Vigna unguiculata and the mangrove species Thespesia populnea are closely related, indicating their similarity and origin from a common ancestor. Thus, these studies may help to understand the evolutionary relationship between legumes and mangroves. LegumeDB availability: http://legumedatabase.co.in

Multiple Sequence Alignment Optimization Using Meta-Heuristic Techniques

Data Analytics in Medicine ◽

10.4018/978-1-7998-1204-3.ch031 ◽

2020 ◽

pp. 565-579 ◽

Cited By ~ 1

Author(s):

Mohamed Issa ◽

Aboul Ella Hassanien

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Phylogenetic Trees ◽

Pairwise Alignment ◽

Accurate Method ◽

Alignment Algorithm ◽

Bacterial Foraging Optimization ◽

Multiple Sequence ◽

Speed Up ◽

DNA Fragment Assembly

Sequence alignment is a vital process in many biological applications such as phylogenetic tree construction, DNA fragment assembly and structure/function prediction. There are two kinds of alignment: pairwise alignment, which aligns two sequences, and multiple sequence alignment (MSA), which aligns more than two. The most accurate method of alignment is based on dynamic programming (DP), whose running time grows exponentially with the number of aligned sequences and rapidly with their length. Stochastic or meta-heuristic techniques speed up alignment but achieve near-optimal accuracy rather than the optimum reached by DP. Hence, this chapter reviews recent developments in MSA using meta-heuristic algorithms. In addition, two recent techniques are examined in more depth: the first is fragmented protein sequence alignment using two-layer particle swarm optimization (FTLPSO); the second is multiple sequence alignment using a multi-objective bacterial foraging optimization algorithm (MO-BFO).
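
The dynamic programming approach mentioned in this abstract can be illustrated for the pairwise case with a minimal Needleman-Wunsch sketch. It is not taken from the chapter: the function name and the scoring values (match, mismatch, linear gap penalty) are assumptions chosen only for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment by dynamic programming.

    The scoring parameters are illustrative assumptions, not values
    from the chapter. Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for a[i-1] against b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The table has (len(a)+1) x (len(b)+1) cells for two sequences; extending the same recurrence to N sequences needs an N-dimensional table, which is why exact DP becomes impractical for MSA and heuristics are used instead.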

Multiple Sequence Alignment and Phylogenetic Trees

Encyclopedia of Biological Chemistry ◽

10.1016/b0-12-443710-9/00413-0 ◽

2004 ◽

pp. 770-774

Author(s):

Russell F. Doolittle

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Phylogenetic Trees ◽

Multiple Sequence

Multiple sequence alignment and reconstructing phylogenetic trees with Hadoop

2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm.2016.7822735 ◽

2016 ◽

Author(s):

Quan Zou

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Phylogenetic Trees ◽

Multiple Sequence

ProbPFP: a multiple sequence alignment algorithm combining hidden Markov model optimized by particle swarm optimization with partition function

BMC Bioinformatics ◽

10.1186/s12859-019-3132-7 ◽

2019 ◽

Vol 20 (S18) ◽

Cited By ~ 1

Author(s):

Qing Zhan ◽

Nan Wang ◽

Shuilin Jin ◽

Renjie Tan ◽

Qinghua Jiang ◽

...

Keyword(s):

Partition Function ◽

Markov Model ◽

Hidden Markov Model ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Phylogenetic Trees ◽

Hidden Markov ◽

Particle Swarm ◽

Alignment Algorithm ◽

Multiple Sequence

Abstract Background: Multiple sequence alignment relies on the substitution scores of pairwise alignments. To compute adaptive scores for alignment, researchers usually use a hidden Markov model (HMM) or probabilistic consistency methods such as the partition function. Recent studies show that optimizing the parameters of the hidden Markov model, as well as integrating the hidden Markov model with the partition function, can raise alignment accuracy. However, the combination of the partition function with an optimized HMM, which could further improve alignment accuracy, was overlooked by these studies. Results: A novel MSA algorithm called ProbPFP is presented in this paper. It integrates an HMM optimized by particle swarm optimization (PSO) with the partition function. PSO was applied to optimize the HMM's parameters. The posterior probability obtained from the HMM was then combined with the one obtained from the partition function to calculate an integrated substitution score for alignment. To evaluate the effectiveness of ProbPFP, we compared it with 13 outstanding or classic MSA methods. The alignments obtained by ProbPFP achieved the highest mean TC and SP scores on two benchmark datasets, SABmark and OXBench, and the second-highest mean TC and SP scores on the benchmark dataset BAliBASE. ProbPFP was also compared with 4 other outstanding methods by reconstructing phylogenetic trees for six protein families extracted from the TreeFam database, based on the alignments obtained by these 5 methods. The results indicate that the trees reconstructed from ProbPFP alignments are closer to the reference trees than those from the other methods. Conclusions: We propose a new multiple sequence alignment method combining an optimized HMM and the partition function. Its performance shows that this method can substantially improve alignment accuracy.
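
The SP (sum-of-pairs) and TC (total-column) scores reported in this abstract compare a test alignment with a reference alignment. The sketch below is a simplified illustration under stated assumptions: it scores every column of the reference, whereas benchmark suites may restrict scoring to selected regions, and the function names are hypothetical rather than part of ProbPFP or any benchmark tool.

```python
from itertools import combinations


def _indexed_columns(alignment):
    """Convert gapped rows into columns of ungapped residue indices.

    Each column is a tuple with one entry per sequence: the residue's
    index in the ungapped sequence, or None where the row has a gap.
    """
    counters = [0] * len(alignment)
    columns = []
    for col in range(len(alignment[0])):
        entry = []
        for s, row in enumerate(alignment):
            if row[col] == "-":
                entry.append(None)
            else:
                entry.append(counters[s])
                counters[s] += 1
        columns.append(tuple(entry))
    return columns


def _residue_pairs(alignment):
    """Set of residue pairs that share a column in the alignment."""
    pairs = set()
    for column in _indexed_columns(alignment):
        present = [(s, idx) for s, idx in enumerate(column) if idx is not None]
        pairs.update(combinations(present, 2))
    return pairs


def sp_score(test, reference):
    """Fraction of reference residue pairs also aligned in the test."""
    ref = _residue_pairs(reference)
    return len(ref & _residue_pairs(test)) / len(ref)


def tc_score(test, reference):
    """Fraction of reference columns reproduced exactly in the test."""
    ref = set(_indexed_columns(reference))
    return len(ref & set(_indexed_columns(test))) / len(ref)


# Toy alignments (made-up data); rows must list the sequences in the same order
reference = ["ACGT", "A-GT", "ACG-"]
test = ["ACGT", "-AGT", "ACG-"]
```

Here the test alignment misplaces one residue of the second sequence, so it recovers 6 of the 8 reference residue pairs and 2 of the 4 reference columns.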

PENGELOMPOKKAN SEPULUH VARIETAS TEMBAKAU (Nicotiana tabacum) BERDASARKAN KERAGAMAN RUNUTAN BASA PARSIAL GEN PMT(PUTRESCINE N-METHYLTRANSFERASE) Clustering of ten tobacco (Nicotiana tabacum) varieties based on the partial PMT (putrescine N-methyltranfera

Jurnal Penelitian Tanaman Industri ◽

10.21082/littri.v23n1.2017.36-44 ◽

2017 ◽

Vol 23 (1) ◽

pp. 36

Author(s):

Sesanti Basuki ◽

Sudarsono Sudarsono

Keyword(s):

Nicotiana Tabacum ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Gene Sequence ◽

Gene Sequences ◽

Multiple Sequence ◽

Multiple Sequence Alignment Analysis ◽

Nicotine Level ◽

Alignment Analysis ◽

The Relationship

Abstract: The PMT gene encodes putrescine N-methyltransferase (PMT), an enzyme in the nicotine biosynthesis pathway of tobacco (Nicotiana tabacum). Ten tobacco varieties with different nicotine levels were used in this study. The aims were: (1) to analyze the diversity of partial PMT gene sequences among the ten varieties, and (2) to evaluate the relationships among the ten varieties based on that sequence diversity. Sequence diversity was analyzed by multiple sequence alignment of the partial PMT gene sequences of the ten varieties with the Ntpmt_Sindoro1 sequence (JQ438825) deposited in the NCBI GenBank database. The alignment was used to compute a distance matrix, from which the phylogenetic relationships among the sequences were inferred using pairwise and multiple sequence alignment analysis. The analysis showed that the partial PMT gene fragments obtained from the ten varieties varied in size and number. It also revealed that the partial PMT gene sequences derive from the same ancestor and are related to nicotine biosynthesis in tobacco. Phylogenetic analysis separated the partial PMT gene sequences into two distinct branches (bootstrap value = 100): introduced tobacco (low to medium nicotine content) and local tobacco (medium to high nicotine content). The two groups separated according to nicotine level and to base changes at specific sites of the partial PMT gene sequences. Information about these mutations could be used to study the relationship between base changes in the PMT gene fragment and the variation in total nicotine content that arose among tobacco varieties during evolution. Key words: clustering analysis, PMT gene, nicotine, Nicotiana tabacum
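
The step of turning an alignment into a distance matrix can be sketched as follows. The abstract does not state which distance measure was used, so the uncorrected p-distance (proportion of differing sites, skipping gapped positions) is an assumption here, and the sequences and variety names are made-up toy data.

```python
def p_distance(x, y):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap are skipped. The
    p-distance is an illustrative assumption; the study may have
    used a different distance model.
    """
    compared = differing = 0
    for cx, cy in zip(x, y):
        if cx == "-" or cy == "-":
            continue
        compared += 1
        differing += cx != cy
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differing / compared


def distance_matrix(aligned):
    """Symmetric matrix of pairwise p-distances.

    aligned: dict mapping a sequence name to its gapped, aligned sequence.
    Returns (names, matrix) with rows and columns in the order of names.
    """
    names = list(aligned)
    matrix = [[0.0] * len(names) for _ in names]
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            d = p_distance(aligned[names[i]], aligned[names[j]])
            matrix[i][j] = matrix[j][i] = d
    return names, matrix


# Hypothetical toy alignment, not data from the study
toy = {"var_A": "ACGTACGT", "var_B": "ACGTACGA", "var_C": "AC-TTCGA"}
names, matrix = distance_matrix(toy)
```

A matrix of this kind is the usual input to distance-based tree building methods such as neighbor joining or UPGMA.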

Gotree/Goalign : Toolkit and Go API to facilitate the development of phylogenetic workflows

10.1101/2021.06.09.447704 ◽

2021 ◽

Author(s):

Frederic Lemoine ◽

Olivier Gascuel

Keyword(s):

Phylogenetic Analysis ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Phylogenetic Analyses ◽

Bootstrap Support ◽

Complex Task ◽

Important Data ◽

Multiple Sequence ◽

Tree Comparison ◽

User Friendly

Besides computer-intensive steps, phylogenetic analysis workflows are usually composed of many small, recurring, but important data manipulation steps. Among these are file reformatting, sequence renaming, tree re-rooting, tree comparison, bootstrap support computation, etc. These are often performed by custom scripts or by several heterogeneous tools, which can be error-prone, hard to maintain, and produce results that are difficult to reproduce. For all these reasons, the development and reuse of phylogenetic workflows is often a complex task. We identified many operations that are part of most phylogenetic analyses, and implemented them in a toolkit called Gotree/Goalign. The Gotree/Goalign toolkit implements more than 120 user-friendly commands and an API dedicated to multiple sequence alignment and phylogenetic tree manipulations. It is developed in Go, which makes executables efficient, easily installable, integrable in workflow environments, and parallelizable when possible. This toolkit is freely available on most platforms (Linux, MacOS and Windows) and most architectures (amd64, i386). Sources and binaries are available on GitHub at https://github.com/evolbioinfo/{gotree|goalign} , Bioconda, and DockerHub.

Multiple Sequence Alignment Optimization Using Meta-Heuristic Techniques

Handbook of Research on Machine Learning Innovations and Trends - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-5225-2229-4.ch018 ◽

2017 ◽

pp. 409-423 ◽

Cited By ~ 2

Author(s):

Mohamed Issa ◽

Aboul Ella Hassanien

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Phylogenetic Trees ◽

Pairwise Alignment ◽

Accurate Method ◽

Alignment Algorithm ◽

Bacterial Foraging Optimization ◽

Multiple Sequence ◽

Speed Up ◽

DNA Fragment Assembly

Sequence alignment is a vital process in many biological applications such as phylogenetic tree construction, DNA fragment assembly and structure/function prediction. There are two kinds of alignment: pairwise alignment, which aligns two sequences, and multiple sequence alignment (MSA), which aligns more than two. The most accurate method of alignment is based on dynamic programming (DP), whose running time grows exponentially with the number of aligned sequences and rapidly with their length. Stochastic or meta-heuristic techniques speed up alignment but achieve near-optimal accuracy rather than the optimum reached by DP. Hence, this chapter reviews recent developments in MSA using meta-heuristic algorithms. In addition, two recent techniques are examined in more depth: the first is fragmented protein sequence alignment using two-layer particle swarm optimization (FTLPSO); the second is multiple sequence alignment using a multi-objective bacterial foraging optimization algorithm (MO-BFO).

An Alignment-free Method for Phylogeny Estimation using Maximum Likelihood

10.1101/2019.12.13.875526 ◽

2019 ◽

Author(s):

Tasfia Zahin ◽

Md. Hasin Abrar ◽

Mizanur Rahman ◽

Tahrina Tasnim ◽

Md. Shamsuzzoha Bayzid ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Computational Complexity ◽

Maximum Likelihood ◽

Phylogenetic Tree ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Multiple Sequence ◽

Alignment Free ◽

Phylogeny Estimation ◽

Statistical Approaches

Abstract: Phylogenetic analysis, i.e. construction of an accurate phylogenetic tree from genomic sequences of a set of species, is one of the main challenges in bioinformatics. The popular approaches require either aligning each pair of sequences to calculate pairwise distances or aligning all the sequences to construct a multiple sequence alignment. The computational complexity and the difficulty of obtaining accurate alignments have led to the development of alignment-free methods to estimate phylogenies. However, the alignment-free approaches focus on computing distances between species and do not utilize statistical approaches for phylogeny estimation. Herein, we present a simple alignment-free method for phylogeny construction based on contiguous sub-sequences of length k, termed k-mers. The presence or absence of these k-mers is used to construct a phylogeny using a maximum likelihood approach. The results suggest our method is competitive with other alignment-free approaches, while outperforming them in some cases.
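
The k-mer presence/absence encoding described in this abstract can be sketched as below. This is an illustrative reading of the idea, not the authors' implementation: the function names, the value of k and the toy sequences are assumptions, and the maximum likelihood step itself is not shown.

```python
def kmer_set(sequence, k):
    """All contiguous sub-sequences of length k that occur in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def presence_absence_matrix(sequences, k):
    """Encode each sequence as a binary string over the observed k-mers.

    sequences: dict mapping a species name to its (unaligned) sequence.
    Returns (kmers, rows): kmers is the sorted list of every k-mer seen
    in any sequence; rows maps each species to a string with '1' where
    it contains the corresponding k-mer and '0' where it does not.
    """
    sets = {name: kmer_set(seq, k) for name, seq in sequences.items()}
    kmers = sorted(set().union(*sets.values()))
    rows = {
        name: "".join("1" if kmer in present else "0" for kmer in kmers)
        for name, present in sets.items()
    }
    return kmers, rows


# Hypothetical toy sequences, not data from the preprint
toy = {"sp1": "ACGTA", "sp2": "ACGGA"}
kmers, rows = presence_absence_matrix(toy, k=3)
```

Each binary string plays the role of one row of a character matrix; a matrix like this could then be passed to a maximum likelihood tree inference program that accepts binary characters.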

VP2 Gene-Based Molecular Evolutionary Patterns of Major Circulating Bluetongue Virus Serotypes Isolated during 2014–2018 from Telangana and Andhra Pradesh States of India

Intervirology ◽

10.1159/000512131 ◽

2020 ◽

pp. 1-8

Author(s):

Ravali Thota ◽

Vishweshwar Kumar Ganji ◽

Sharanya Machanagari ◽

Narasimha Reddy Yella ◽

Bhagyalakshmi Buddala ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Bluetongue Virus ◽

Andhra Pradesh ◽

Geographical Location ◽

Effective Control ◽

Effective Vaccine ◽

Multiple Sequence ◽

VP2 Gene

Introduction: Bluetongue disease is an economically important viral disease of livestock caused by bluetongue virus (BTV), which has multiple serotypes. BTV belongs to the genus Orbivirus of the family Reoviridae and subfamily Sedoreovirinae. The BTV genome is a 10-segmented dsRNA that codes for 7 structural and 4 nonstructural proteins, of which VP2 was reported to be serotype-specific and a major antigenic determinant. Objective: It is important to know the circulating serotypes in a particular geographical location for effective control of the disease. The present study unravels the molecular evolution of the circulating BTV serotypes during 2014–2018 in the Telangana and Andhra Pradesh states of India. Methods: Multiple sequence alignment with available BTV serotypes in GenBank and phylogenetic analysis were performed for the partial VP2 sequences of major circulating BTV serotypes during the study period. Results: The multiple sequence alignment of circulating serotypes with their respective reference isolates revealed variations in antigenic VP2. The phylogenetic analysis revealed that the major circulating serotypes were grouped into Eastern topotypes (BTV-1, BTV-2, BTV-4, and BTV-16) and Western topotypes (BTV-5, BTV-12, and BTV-24). Conclusion: Our study strengthens the need for development of an effective vaccine that can induce an immune response to a range of serotypes within and between topotypes.
