Relative benefits of amino-acid, codon, degeneracy, DNA, and purine-pyrimidine character coding for phylogenetic analyses of exons

Mark P. Simmons

doi:10.1111/jse.12233

Phylogenetic Analysis: Basic Concepts and Its Use as a Tool for Virology and Molecular Epidemiology

Acta Scientiae Veterinariae ◽

10.22456/1679-9216.81158 ◽

2018 ◽

Vol 44 (1) ◽

pp. 20

Author(s):

Eloiza Teles Caldart ◽

Helena Mata ◽

Cláudio Wageck Canal ◽

Ana Paula Ravazzolo

Keyword(s):

Phylogenetic Analysis ◽

Amino Acid ◽

Molecular Epidemiology ◽

Phylogenetic Analyses ◽

Phylogenetic Reconstruction ◽

Evolutionary Process ◽

Amino Acid Sequences ◽

Evolutionary Models ◽

Reconstruction Methods ◽

Basic Concepts

Background: Phylogenetic analyses are an essential part of the exploratory assessment of nucleic acid and amino acid sequences. Particularly in virology, they are able to delineate the evolution and epidemiology of disease etiologic agents and/or the evolutionary path of their hosts. The objective of this review is to help researchers who want to use phylogenetic analyses as a tool in virology and molecular epidemiology studies, presenting the most commonly used methodologies and describing the importance of the different techniques, their peculiar vocabulary and some examples of their use in virology.

Review: This article starts by presenting basic concepts of molecular epidemiology and molecular evolution, emphasizing their relevance in the context of viral infectious diseases. It presents a section on the vocabulary relevant to the subject, bringing readers to the minimum level of knowledge needed throughout this literature review. Within its main subject, the text explains what a molecular phylogenetic analysis is, starting from a multiple alignment of nucleotide or amino acid sequences. Different software packages used to perform multiple alignments may apply different algorithms. To build a phylogeny based on amino acid or nucleotide sequences, it is necessary to produce a data matrix based on a model for nucleotide or amino acid replacement, also called an evolutionary model. There are a number of evolutionary models available, varying in complexity according to the number of parameters (transition, transversion, GC content, nucleotide position in the codon, among others). Some papers presented herein provide techniques that can be used to choose evolutionary models. After the model is chosen, the next step is to opt for a phylogenetic reconstruction method that best fits the available data and the selected model. Here we present the most common reconstruction methods currently used, describing their principles, advantages and disadvantages. Distance methods, for example, are simpler and faster; however, they do not provide reliable estimates when the sequences are highly divergent. The accuracy of analyses with model-based methods (neighbour joining, maximum likelihood and Bayesian inference) strongly depends on how well the actual data fit the chosen evolutionary model. Finally, we also explore topology confidence tests, especially the most used one, the bootstrap. To assist the reader, this review presents figures to explain specific situations discussed in the text and numerous examples of previously published scientific articles in virology that demonstrate the importance of the techniques discussed herein, as well as their judicious use.

Conclusion: The DNA sequence is not only a record of phylogeny and divergence times, but also retains signs of how the evolutionary process has shaped its history and of the time elapsed in the evolutionary process of the population. Analyses of genomic sequences by molecular phylogeny have demonstrated a broad spectrum of applications. It is important to note that, for the different available data and different purposes of phylogenies, reconstruction methods and evolutionary models should be wisely chosen. This review provides a theoretical basis for the choice of the evolutionary models and phylogenetic reconstruction methods best suited to each situation. In addition, it presents examples of diverse applications of molecular phylogeny in virology.
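
As a rough illustration of two ideas named in this abstract (a model-corrected distance and the bootstrap), the following Python sketch computes the Jukes-Cantor (JC69) distance between two aligned nucleotide sequences and draws one bootstrap pseudo-replicate by resampling alignment columns with replacement. JC69 is chosen here only because it is the simplest substitution model; the review does not single it out, and the function names and the toy alignment are hypothetical.

```python
import math
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, using only columns where both are A/C/G/T."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor distance: d = -(3/4) * ln(1 - 4p/3), with p the p-distance."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined at saturation (highly divergent sequences).
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def bootstrap_columns(alignment, rng):
    """Return one pseudo-replicate: alignment columns resampled with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    toy = {"strain_1": "ACGTACGTAC", "strain_2": "ACGTACGAAC"}
    print(jc69_distance(toy["strain_1"], toy["strain_2"]))
    print(bootstrap_columns(toy, random.Random(0)))
```

The saturation check reflects the abstract's caution about highly divergent sequences: as the observed proportion of differences approaches 0.75, the corrected distance grows without bound and becomes unreliable. In a full bootstrap analysis a tree would be rebuilt from each pseudo-replicate and the frequency of each clade across replicates reported as its support.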

Single nucleotide polymorphism scanning and expression analysis of ACSL1 from different duck breeds

Canadian Journal of Animal Science ◽

10.1139/cjas-2020-0131 ◽

2021 ◽

Author(s):

Qianqian Song ◽

Zhixiu Wang ◽

Hongliang Zhang ◽

Xiangxiang Li ◽

Yang Zhang ◽

...

Keyword(s):

Amino Acid ◽

Amino Acid Sequence ◽

Protein Expression ◽

Meat Quality ◽

Liver Tissue ◽

Phylogenetic Analyses ◽

Abdominal Fat ◽

Breast Muscle ◽

Pekin Duck ◽

Lipid Deposition

Accumulating studies have indicated that the long-chain acyl-CoA synthetase 1 (ACSL1) gene is related to fat deposition and meat quality in mammals. However, few studies have investigated the relationship between ACSL1 and lipid deposition in ducks. To examine this, we analysed the physicochemical properties, homology alignment and phylogeny of the ACSL1 amino acid sequence using bioinformatics tools. The analysis indicated that the ACSL1 amino acid sequence varies among animals, and that the duck ACSL1 protein is most closely related to that of chicken. Two SNP sites were identified by sequencing, at 1749 and 1905 bp of the ACSL1 coding region. Quantitative real-time PCR and western blotting were used to measure mRNA and protein levels in abdominal fat, breast muscle and liver tissue of Pekin duck (BD) and Cherry Valley duck (CD). mRNA and protein expression were significantly higher in BD than in CD in abdominal fat and liver tissue (P < 0.05). In breast muscle, the mRNA level of ACSL1 was also significantly higher in BD than in CD (P < 0.05), and protein expression in BD tended to be higher than that in CD. These results suggest that ACSL1 may contribute to lipid deposition and meat quality in ducks.

Nightshade curly top virus: A possible new virus of the genus Topocuvirus infecting Solanum nigrum in China

Plant Disease ◽

10.1094/pdis-03-20-0572-re ◽

2020 ◽

Author(s):

Kai Sun ◽

Yan Liang ◽

Xueting Zhong ◽

Xuenan Hu ◽

Pengjun Zhang ◽

...

Keyword(s):

Amino Acid ◽

Phylogenetic Analyses ◽

Solanum Nigrum ◽

Open Reading Frames ◽

Small Interfering RNAs ◽

Protein Amino Acid ◽

Sequence Identity ◽

Pathogenic Viruses ◽

Leaf Deformation ◽

Symptomatic Sample

Virus-like symptoms, including leaf deformation and curling, were observed on nightshade (Solanum nigrum) in Zhejiang province, China. To identify possible pathogenic viruses or viroids, a symptomatic sample was subjected to deep sequencing of small interfering RNAs. Assembly of the resulting sequences led to the identification of a novel geminivirus, provisionally designated nightshade curly top virus (NCTV). The complete genomic DNA sequence is 2,867 nucleotides long and encodes seven open reading frames. NCTV shares 77.1% overall nucleotide sequence identity, 86.3% coat protein amino acid sequence identity, and 78.9% replication-associated protein amino acid sequence identity with the topocuvirus tomato pseudo-curly top virus (TPCTV). Polymerase chain reaction screening of nightshade field isolates indicated that NCTV is widely distributed in Zhejiang. Agrobacterium-mediated inoculation revealed that NCTV is highly infectious to Nicotiana benthamiana, Solanum nigrum, Solanum lycopersicum, and Solanum tuberosum. Based on pairwise comparisons and phylogenetic analyses, NCTV is proposed as a provisional member of the genus Topocuvirus.

Genetic and Antigenic Evolution of European Swine Influenza A Viruses of HA-1C (Avian-Like) and HA-1B (Human-Like) Lineages in France from 2000 to 2018

Viruses ◽

10.3390/v12111304 ◽

2020 ◽

Vol 12 (11) ◽

pp. 1304

Author(s):

Amélie Chastagner ◽

Séverine Hervé ◽

Stéphane Quéguiner ◽

Edouard Hirchaud ◽

Pierrick Lucas ◽

...

Keyword(s):

Amino Acid ◽

Influenza A ◽

Phylogenetic Analyses ◽

Swine Influenza ◽

Amino Acid Sequences ◽

Antigenic Drift ◽

Influenza A Viruses ◽

Double Deletion ◽

Antigenic Evolution ◽

Amino Acid Mutations

This study evaluated the genetic and antigenic evolution of swine influenza A viruses (swIAV) of the two main enzootic H1 lineages, i.e., HA-1C (H1av) and -1B (H1hu), circulating in France between 2000 and 2018. SwIAV RNAs extracted from 1220 swine nasal swabs were hemagglutinin/neuraminidase (HA/NA) subtyped by RT-qPCRs, and 293 virus isolates were sequenced. In addition, 146 H1avNy and 105 H1huNy strains were submitted to hemagglutination inhibition tests. H1avN1 (66.5%) and H1huN2 (25.4%) subtypes were predominant. Most H1 strains belonged to HA-1C.2.1 or -1B.1.2.3 clades, but HA-1C.2, -1C.2.2, -1C.2.3, -1B.1.1, and -1B.1.2.1 clades were also detected sporadically. Within the HA-1B.1.2.3 clade, a group of strains named “Δ146-147” harbored several amino acid mutations and a double deletion in HA that led to a marked antigenic drift. Phylogenetic analyses revealed that internal segments belonged mainly to the “Eurasian avian-like lineage”, with two distinct genogroups for the M segment. In total, 17 distinct genotypes were identified within the study period. Reassortments of H1av/H1hu strains with H1N1pdm virus were rarely detected up to 2018. Analysis of amino acid sequences predicted variability in the length of PB1-F2 and PA-X proteins and identified the appearance of several mutations in PB1, PB1-F2, PA, NP and NS1 proteins that could be linked to virulence, while markers for antiviral resistance were identified in N1 and N2. Altogether, the diversity and evolution of swIAV underline the importance of disrupting the spread of swIAV within and between pig herds, as well as IAV inter-species transmission.

Anopheles stephensi Dual Oxidase Silencing Activates the Thioester-Containing Protein 1 Pathway to Suppress Plasmodium Development

Journal of Innate Immunity ◽

10.1159/000497417 ◽

2019 ◽

Vol 11 (6) ◽

pp. 496-505 ◽

Cited By ~ 2

Author(s):

Parik Kakani ◽

Mithilesh Kajla ◽

Tania Pal Choudhury ◽

Lalita Gupta ◽

Sanjeev Kumar

Keyword(s):

Amino Acid ◽

Calcium Binding ◽

Anopheles Stephensi ◽

Developmental Stages ◽

Transmembrane Protein ◽

Phylogenetic Analyses ◽

Amino Acid Identity ◽

Infected Mosquito ◽

Parasite Development ◽

Dual Oxidase

We characterized the dual oxidase (Duox) gene, which regulates the generation of reactive oxygen species, in the major Indian malaria vector Anopheles stephensi. The AsDuox gene encodes a 1,475-amino-acid transmembrane protein that contains an N-terminal noncytoplasmic heme peroxidase domain, a calcium-binding domain, seven transmembrane domains, and a C-terminal cytoplasmic NADPH domain. Phylogenetic analyses revealed that the A. stephensi Duox protein is highly conserved and shares 97–100% amino acid identity with other anopheline Duoxes. AsDuox is expressed in all the developmental stages of A. stephensi, with relatively higher expression in the pupal stages. The Duox gene is induced in Plasmodium-infected mosquito midguts, and RNA interference-mediated silencing of this gene suppressed parasite development through activation of the thioester-containing protein 1 pathway. We propose that this highly conserved anopheline Duox, being a Plasmodium agonist, is an excellent target to control malaria parasite development inside the insect host.

First detection and characterisation of porcine hemagglutinating encephalomyelitis virus in the Czech Republic

Veterinární Medicína ◽

10.17221/95/2018-vetmed ◽

2019 ◽

Vol 64 (2) ◽

pp. 60-66

Author(s):

R Moutelikova ◽

J Prodelalova

Keyword(s):

Czech Republic ◽

Amino Acid ◽

Amino Acid Sequence ◽

Phylogenetic Analyses ◽

Economic Losses ◽

The Czech Republic ◽

Nucleocapsid Gene ◽

Nasal Swabs ◽

Encephalomyelitis Virus ◽

Porcine Hemagglutinating Encephalomyelitis Virus

Porcine hemagglutinating encephalomyelitis virus (PHEV) is a highly neurovirulent coronavirus that invades the central nervous system in piglets. The incidence of PHEV among pigs in many countries is rising, and the economic losses to the pig industry may be significant. Serological studies suggest that PHEV is spread worldwide. However, no surveillance has been carried out in the Czech Republic. In this study, eight pig farms were screened for the presence of members of the Coronaviridae family using reverse transcription PCR. A collection of 123 faecal samples and 151 nasal swabs from domestic pigs was analysed. In PHEV-positive samples, almost the complete coding sequence of the nucleocapsid gene was amplified and the acquired sequences were compared to those of geographically dispersed PHEV strains; phylogenetic analyses were also performed. PHEV was present in 7.9% of nasal swabs taken from different age categories of pigs. No other swine coronaviruses were detected. The amino acid sequence of the Czech PHEV strains showed 95.8–98.1% similarity to other PHEV reference strains in GenBank. PHEV strains collected from animals on the same farm were identical; however, strains from different farms exhibited only 96.7–98.7% amino acid sequence identity. Our study demonstrates the presence of PHEV in pigs in the Czech Republic. The Czech PHEV strains were evolutionarily closest to the Belgian strain VW572.
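
Percent sequence identity figures such as those reported above are computed from pairwise alignments. A minimal Python sketch of one common convention follows: matches are counted over columns where neither sequence has a gap. This is an assumption for illustration, not the procedure stated in the abstract; tools differ in how they treat gaps and in the denominator they use, and the function name and example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identical residues over aligned columns without a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


if __name__ == "__main__":
    # Hypothetical aligned protein fragments, for illustration only.
    print(percent_identity("MKT-LV", "MKSALV"))
```

Because the result depends on the gap convention and on the alignment itself, identity values from different studies are only comparable when the same procedure was used.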

PPK1 and PPK2 — which polyphosphate kinase is older?

Biologia ◽

10.2478/s11756-013-0324-x ◽

2014 ◽

Vol 69 (3) ◽

Cited By ~ 8

Author(s):

Lucia Achbergerová ◽

Jozef Nahálka

Keyword(s):

Escherichia Coli ◽

Pseudomonas Aeruginosa ◽

Amino Acid ◽

Phylogenetic Analyses ◽

Linear Molecule ◽

Polyphosphate Kinase ◽

Hypothetical Proteins ◽

Bacterial Genomes ◽

Metabolic Balance ◽

Domains Of Life

Polyphosphate kinases (PPKs) catalyse the polymerisation and degradation of polyphosphate chains. As a result of this process, PPK produces or consumes energy in the form of ATP. Polyphosphate is a linear molecule that contains tens to hundreds of phosphate residues connected by macroergic bonds, and it appears to have been an easily obtainable and rich source of energy from prebiotic times to the present. Notably, polyphosphate is present in the cells of all three domains of life, but PPKs are widely distributed only in Bacteria, as Archaea and Eucarya use various unrelated or “nonhomologous” proteins for energy and metabolic balance. The present study focuses on PPK1 and PPK2 homologues, which have been described to some extent in Bacteria, and the aim was to determine which homologue group, PPK1 or PPK2, is older. Phylogenetic analyses of 109 sequence homologues of Escherichia coli PPK1 and 109 sequence homologues of Pseudomonas aeruginosa PPK2 from 109 bacterial genomes imply that polyphosphate consumption (PPK2) evolved first and that phosphate polymerisation (PPK1) evolved later. Independently, a theory of the trends in amino acid loss and gain also confirms that PPK2 is older than PPK1. Based on the results of this study, we propose that 68 hypothetical proteins be marked as PPK2 homologues and 3 hypothetical proteins as PPK1 homologues.

The salivary transcriptome of Limnobdella mexicana (Annelida: Clitellata: Praobdellidae) and orthology determination of major leech anticoagulants

Parasitology ◽

10.1017/s0031182019000593 ◽

2019 ◽

Vol 146 (10) ◽

pp. 1338-1346

Author(s):

Rafael Iwama ◽

Alejandro Oceguera-Figueroa ◽

Gonzalo Giribet ◽

Sebastian Kvist

Keyword(s):

Amino Acid ◽

Positive Relationship ◽

Phylogenetic Analyses ◽

Factor Xa ◽

Amino Acid Conservation ◽

Inhibit Factor

Bloodfeeding requires several adaptations that allow the parasite to feed efficiently. Leeches and other hematophagous animals have developed different mechanisms to inhibit hemostasis, one of the main barriers imposed by their hosts. Limnobdella mexicana is a member of the leech family Praobdellidae, a family of host generalists known for their preference for attaching to the mucosal membranes of mammals, such as those in nasopharyngeal cavities, bladders and ocular orbits. Previous studies have hypothesized a positive relationship between diversity of anticoagulants and diversity of hosts in bloodfeeding leeches. However, orthology determination of putative anticoagulants and the lack of standardization of sequencing effort and method hinder comparisons between publicly available transcriptomes generated in different laboratories. In the present study, we examine the first transcriptome of a praobdellid leech and identify 15 putative anticoagulants using a phylogeny-based inference approach, amino-acid conservation, Pfam domains and BLAST searches. Our phylogenetic analyses suggest that the ancestral leech was able to inhibit factor Xa and that some hirudins that have been reported in previous studies on leech anticoagulants may not be orthologous with the archetypal hirudin.

A molecular method for a qualitative analysis of potentially coding sequences of DNA

Brazilian Journal of Biology ◽

10.1590/s1519-69842004000300003 ◽

2004 ◽

Vol 64 (3a) ◽

pp. 383-398

Author(s):

M. L. Christoffersen ◽

M. E. Araújo ◽

M. A. M. Moreira

Keyword(s):

Amino Acid ◽

Phylogenetic Analyses ◽

A Priori ◽

Morphological Data ◽

Computer Algorithms ◽

Protein Coding ◽

Tree Topologies ◽

Alternative Amino Acid ◽

Cladistic Analyses ◽

Transformation Series

Total sequence phylogenies have low information content. Common misconceptions are that character quality can be ignored and that relying on computer algorithms is enough. Despite the widespread preference for a posteriori methods of character evaluation, a priori methods are necessary to produce transformation series that are independent of tree topologies. We propose a stepwise qualitative method for analyzing protein sequences. Informative codons are selected, alternative amino acid transformation series are analyzed, and the most parsimonious transformations are hypothesized. We conduct four phylogenetic analyses of philodryanine snakes. The tree based on all nucleotides produces the least resolution. Trees based on the exclusion of third positions, on an asymmetric step matrix, and on our protocol produce similar results. Our method eliminates noise by hypothesizing explicit transformation series for each informative protein-coding amino acid. This approximates the qualitative methods used for morphological data, in which only characters successfully interpreted in a phylogenetic context are used in cladistic analyses. The method makes use of character information contained in the original sequence alignment and, therefore, has higher resolution in inferring a phylogenetic tree than some traditional methods (such as distance methods).
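
One of the analyses mentioned above excludes third codon positions. A minimal Python sketch of that filtering step is given below, assuming the coding sequence is in frame and starts at the first position of a codon; the function name and example sequence are hypothetical, and the authors' own stepwise protocol is not reproduced here.

```python
def drop_third_positions(cds):
    """Keep codon positions 1 and 2 of an in-frame coding sequence.

    Assumes index 0 is the first position of a codon.
    """
    return "".join(base for i, base in enumerate(cds) if i % 3 != 2)


if __name__ == "__main__":
    # Hypothetical in-frame fragment: ATG GCC TAA
    print(drop_third_positions("ATGGCCTAA"))
```

Applied to every sequence in an alignment, this removes the columns where synonymous changes concentrate, which is the usual motivation for excluding third positions.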

Genetic and antigenic diversity among noroviruses

Journal of General Virology ◽

10.1099/vir.0.81532-0 ◽

2006 ◽

Vol 87 (4) ◽

pp. 909-919 ◽

Cited By ~ 115

Author(s):

Grant S. Hansman ◽

Katsuro Natori ◽

Haruko Shirato-Horikoshi ◽

Satoko Ogawa ◽

Tomoichiro Oka ◽

...

Keyword(s):

Amino Acid ◽

Insect Cells ◽

Phylogenetic Analyses ◽

Cross Reactivity ◽

Amino Acid Sequences ◽

Amino Acid Residues ◽

Cell Culture Systems ◽

Antibody ELISA ◽

Polyclonal Antisera ◽

Antigenic Relationships

Human norovirus (NoV) strains cause a considerable number of outbreaks of gastroenteritis worldwide. Based on their capsid gene (VP1) sequence, human NoV strains can be grouped into two genogroups (GI and GII) and at least 14 GI and 17 GII genotypes (GI/1–14 and GII/1–17). Human NoV strains cannot be propagated in cell-culture systems, but expression of recombinant VP1 in insect cells results in the formation of virus-like particles (VLPs). In order to understand NoV antigenic relationships better, cross-reactivity among 26 different NoV VLPs was analysed. Phylogenetic analyses grouped these NoV strains into six GI and 12 GII genotypes. An antibody ELISA using polyclonal antisera raised against these VLPs was used to determine cross-reactivity. Antisera reacted strongly with homologous VLPs; however, a number of novel cross-reactivities among different genotypes were observed. For example, GI/11 antiserum showed a broad-range cross-reactivity, detecting two GI and 10 GII genotypes. Likewise, GII/1, GII/10 and GII/12 antisera showed a broad-range cross-reactivity, detecting several other distinct GII genotypes. Alignment of VP1 amino acid sequences suggested that these broad-range cross-reactivities were due to conserved amino acid residues located within the shell and/or P1-1 domains. However, unusual cross-reactivities among different GII/3 antisera were found, with the results indicating that both conserved amino acid residues and VP1 secondary structures influence antigenicity.
