Evolutionary History of Cotranscriptional Editing in the Paramyxoviral Phosphoprotein Gene

2021
Author(s): Jordan Douglas, Alexei J Drummond, Richard L Kingston

The phosphoprotein gene of the paramyxoviruses encodes multiple protein products. The P, V, and W proteins are generated by transcriptional slippage. This process results in the insertion of non-templated guanosine nucleosides into the mRNA at a conserved edit site. The P protein is an essential component of the viral RNA polymerase, and is encoded by a faithful copy of the gene in the majority of paramyxoviruses. However, in some cases the non-essential V protein is encoded by default and guanosines must be inserted into the mRNA in order to encode P. The number of guanosines inserted into the P gene can be described by a probability distribution which varies between viruses. In this article we review the nature of these distributions, which can be inferred from mRNA sequencing data, and reconstruct the evolutionary history of cotranscriptional editing in the paramyxovirus family. Our model suggests that, throughout the known history of the family, the system has switched from a P default to a V default mode four times; complete loss of the editing system has occurred twice; the canonical zinc finger domain of the V protein has been deleted or heavily mutated a further two times; and the W protein has independently evolved a novel function three times. Finally, we review the physical mechanisms of cotranscriptional editing via slippage of the viral RNA polymerase.
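
As a rough illustration of how such a distribution might be summarised from mRNA sequencing data, the sketch below collapses counts of transcripts carrying k inserted guanosines into the fraction of transcripts encoding each product. It is not the authors' method. The function name, the input format, and the frame-to-product mapping (each inserted G shifts the reading frame by one, with P, V, W cycling from whichever product is the default) are assumptions made for illustration only.

```python
from collections import Counter

# Assumed mapping from (number of inserted Gs mod 3) to protein product.
# "P" default: unedited mRNA encodes P; "V" default: unedited mRNA encodes V.
FRAME_LABELS = {
    "P": ("P", "V", "W"),
    "V": ("V", "W", "P"),
}


def product_distribution(insert_counts, default="P"):
    """Return the fraction of transcripts encoding each product.

    insert_counts maps k (number of inserted guanosines) to the number
    of sequenced transcripts observed with that k (hypothetical input).
    """
    labels = FRAME_LABELS[default]
    total = sum(insert_counts.values())
    if total == 0:
        raise ValueError("no transcripts counted")
    per_product = Counter()
    for k, n in insert_counts.items():
        # Only the insertion count modulo 3 affects the reading frame.
        per_product[labels[k % 3]] += n
    return {label: per_product[label] / total for label in labels}


if __name__ == "__main__":
    # Made-up counts, purely to show the call shape.
    print(product_distribution({0: 60, 1: 30, 2: 5, 3: 5}, default="P"))
```

Under a V default, the same counts would be read with `default="V"`, so that transcripts with two inserted guanosines are the ones tallied as P.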

2020
Author(s): Jordan Douglas, Alexei J. Drummond, Richard L. Kingston

The phosphoprotein gene of the paramyxoviruses encodes multiple protein products. The P, V, and W proteins are generated by transcriptional slippage. This process results in the insertion of non-templated guanosine nucleosides into the mRNA at a conserved edit site. The P protein is an essential component of the viral RNA polymerase, and is encoded by a direct copy of the gene in the majority of paramyxoviruses. However, in some cases the non-essential V protein is encoded by default and guanosines must be inserted into the mRNA in order to encode P. The number of guanosines inserted can be described by a probability distribution which varies between viruses. In this article we review the nature of these distributions, which can be inferred from mRNA sequencing data, and reconstruct the evolutionary history of cotranscriptional editing in the paramyxovirus family. Our model suggests that, throughout the known history of the family, the system has switched from a P default to a V default mode four times; complete loss of the editing system has occurred twice; the canonical zinc finger domain of the V protein has been deleted or heavily mutated a further two times; and the W protein has independently evolved a novel function three times. Finally, we review the physical mechanisms of cotranscriptional editing via slippage of the viral RNA polymerase.


Author(s): Olga Kozhar, Mee-Sook Kim, Jorge Ibarra Caballero, Ned Klopfenstein, Phil Cannon, ...

Emerging pathogens have been increasing exponentially over the last century. Knowing whether these organisms are native to ecosystems or have been recently introduced is often of great importance. Understanding the ecological and evolutionary processes promoting emergence can help to control their spread and forecast epidemics. Using restriction site-associated DNA sequencing data, we studied genetic relationships, pathways of spread, and evolutionary history of Phellinus noxius, an emerging root-rotting fungus of unknown origin, in eastern Asia, Australia, and the Pacific Islands. We analyzed patterns of genetic variation using Bayesian inference, maximum likelihood phylogeny, population splits and mixtures measuring correlations in allele frequencies and genetic drift, and finally applied coalescent-based theory using approximate Bayesian computation (ABC) with supervised machine learning. Population structure analyses revealed five genetic groups with signatures of complex recent and ancient migration histories. The most probable scenario of ancient pathogen spread is movement from west to east: from Malaysia to the Pacific Islands, with subsequent spread to Taiwan and Australia. Furthermore, ABC analyses indicate that P. noxius spread occurred thousands of generations ago, contradicting previous assumptions that it was recently introduced in multiple areas. Our results suggest that recent emergence of P. noxius in east Asia, Australia, and the Pacific Islands is likely driven by anthropogenic and natural disturbances, including deforestation, land-use change, severe weather events, and introduction of exotic plants. This study provides a novel example of using genome-wide allele frequency data to unravel dynamics of pathogen emergence under conditions of changing ecosystems.


Author(s): Dave Lutgen, Raphael Ritter, Remi-André Olsen, Holger Schielzeth, Joel Gruselius, ...

The feasibility of sequencing entire genomes of virtually any organism provides unprecedented insights into the evolutionary history of populations and species. Nevertheless, many population genomic inferences – including the quantification and dating of admixture, introgression and demographic events, and the inference of selective sweeps – are still limited by the lack of high-quality haplotype information. In this respect, the newest generation of sequencing technology now promises significant progress. To establish the feasibility of haplotype-resolved genome resequencing at population scale, we investigated properties of linked-read sequencing data of songbirds of the genus Oenanthe across a range of sequencing depths. Our results based on the comparison of downsampled (25x, 20x, 15x, 10x, 7x, and 5x) with high-coverage data (46-68x) of seven bird genomes suggest that phasing contiguities and accuracies adequate for most population genomic analyses can already be reached with moderate sequencing effort. At 15x coverage, phased haplotypes span about 90% of the genome assembly, with 50 and 90 percent of the phased sequence located in phase blocks longer than 1.25-4.6 Mb (N50) and 0.27-0.72 Mb (N90), respectively. Phasing accuracy reaches beyond 99% starting from 15x coverage. Higher coverages yielded higher contiguities (up to about 7 Mb/1 Mb (N50/N90) at 25x coverage), but only marginally improved phasing accuracy. Finally, phasing contiguity improved with input DNA molecule length; thus, higher-quality DNA may help keep sequencing costs at bay. In conclusion, even for organisms with gigabase-sized genomes like birds, linked-read sequencing at moderate depth opens an affordable avenue towards haplotype-resolved genome resequencing data at population scale.
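
The N50 and N90 contiguity figures quoted above follow the usual definition: the block length L such that blocks of length at least L contain 50 (or 90) percent of the total phased sequence. A minimal sketch of that calculation is given below. The function name and the example block lengths are illustrative assumptions, not taken from the study.

```python
def nxx(block_lengths, percent):
    """Return the Nxx statistic (e.g. percent=50 for N50).

    Blocks are sorted from longest to shortest and accumulated until
    they cover at least `percent` percent of the total length; the
    length of the block that crosses that threshold is returned.
    """
    if not block_lengths:
        raise ValueError("no phase blocks given")
    ordered = sorted(block_lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return ordered[-1]


if __name__ == "__main__":
    # Hypothetical phase-block lengths (arbitrary units), not real data.
    blocks = [80, 70, 50, 40, 30, 20, 10]
    print("N50:", nxx(blocks, 50))
    print("N90:", nxx(blocks, 90))
```

Because N90 requires covering more of the sequence, it can never exceed N50 for the same set of blocks, which matches the ordering of the ranges reported in the abstract.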


2018
Author(s): Danièle Filiault, Evangeline S. Ballerini, Terezie Mandáková, Gökçe Aköz, Nathan Derieg, ...

The columbine genus Aquilegia is a classic example of an adaptive radiation, involving a wide variety of pollinators and habitats. Here we present the genome assembly of A. coerulea ‘Goldsmith’, complemented by high-coverage sequencing data from 10 wild species covering the world-wide distribution. Our analyses reveal extensive allele sharing among species and demonstrate that introgression and selection played a role in the Aquilegia radiation. We also present the remarkable discovery that the evolutionary history of an entire chromosome differs from that of the rest of the genome – a phenomenon which we do not fully understand, but which highlights the need to consider chromosomes in an evolutionary context.


Author(s): Olga Kozhar, Mee-Sook Kim, Jorge Ibarra Caballero, Ned Klopfenstein, Phil Cannon, ...

Emerging plant pathogens have been increasing exponentially over the last century. To address this issue, it is critical to determine whether these pathogens are native to ecosystems or have been recently introduced. Understanding the ecological and evolutionary processes fostering emergence can help to manage their spread and predict epidemics/epiphytotics. Using restriction site-associated DNA sequencing data, we studied genetic relationships, pathways of spread, and evolutionary history of Phellinus noxius, an emerging root-rotting fungus of unknown origin, in eastern Asia, Australia, and the Pacific Islands. We analyzed patterns of genetic variation using Bayesian inference, maximum likelihood phylogeny, population splits and mixtures measuring correlations in allele frequencies and genetic drift, and finally applied coalescent-based theory using approximate Bayesian computation (ABC) with supervised machine learning. Population structure analyses revealed five genetic groups with signatures of complex recent and ancient migration histories. The most probable scenario of ancient pathogen spread is movement from a ghost population to Malaysia and the Pacific Islands, with subsequent spread to Taiwan and Australia. Furthermore, ABC analyses indicate that P. noxius spread occurred thousands of generations ago, contradicting previous assumptions that this pathogen was recently introduced to multiple geographic regions. Our results suggest that recent emergence of P. noxius in eastern Asia, Australia, and the Pacific Islands is likely driven by anthropogenic and natural disturbances, such as deforestation, land-use change, severe weather events, and/or introduction of exotic plants. This study provides a novel example of applying genome-wide allele frequency data to unravel dynamics of pathogen emergence under changing ecosystem conditions.


2017
Author(s): Ryan M. Moore, Amelia O. Harrison, Sean M. McAllister, Shawn W. Polson, K. Eric Wommack

Phylogenetic trees are an important analytical tool for evaluating community diversity and evolutionary history. In the case of microorganisms, the decreasing cost of sequencing has enabled researchers to generate ever-larger sequence datasets, which in turn have begun to fill gaps in the evolutionary history of microbial groups. However, phylogenetic analyses of these types of datasets create complex trees that can be challenging to interpret. Scientific inferences made by visual inspection of phylogenetic trees can be simplified and enhanced by customizing various parts of the tree. Yet, manual customization is time-consuming and error-prone, and programs designed to assist in batch tree customization often require programming experience or complicated file formats for annotation. Iroki, a user-friendly web interface for tree visualization, addresses these issues by providing automatic customization of large trees based on metadata contained in tab-separated text files. Iroki’s utility for exploring biological and ecological trends in sequencing data was demonstrated through a variety of microbial ecology applications in which trees with hundreds to thousands of leaf nodes were customized according to extensive collections of metadata. The Iroki web application and documentation are available at https://www.iroki.net or through the VIROME portal (http://virome.dbi.udel.edu). Iroki’s source code is released under the MIT license and is available at https://github.com/mooreryan/iroki.


PeerJ, 2020, Vol 8, pp. e8584
Author(s): Ryan M. Moore, Amelia O. Harrison, Sean M. McAllister, Shawn W. Polson, K. Eric Wommack

Phylogenetic trees are an important analytical tool for evaluating community diversity and evolutionary history. In the case of microorganisms, the decreasing cost of sequencing has enabled researchers to generate ever-larger sequence datasets, which in turn have begun to fill gaps in the evolutionary history of microbial groups. However, phylogenetic analyses of these types of datasets create complex trees that can be challenging to interpret. Scientific inferences made by visual inspection of phylogenetic trees can be simplified and enhanced by customizing various parts of the tree. Yet, manual customization is time-consuming and error-prone, and programs designed to assist in batch tree customization often require programming experience or complicated file formats for annotation. Iroki, a user-friendly web interface for tree visualization, addresses these issues by providing automatic customization of large trees based on metadata contained in tab-separated text files. Iroki’s utility for exploring biological and ecological trends in sequencing data was demonstrated through a variety of microbial ecology applications in which trees with hundreds to thousands of leaf nodes were customized according to extensive collections of metadata. The Iroki web application and documentation are available at https://www.iroki.net or through the VIROME portal (http://virome.dbi.udel.edu). Iroki’s source code is released under the MIT license and is available at https://github.com/mooreryan/iroki.


Author(s): Salem Malikić, Farid Rashidi Mehrabadi, Erfan Sadeqi Azer, Mohammad Haghir Ebrahimabadi, S. Cenk Sahinalp

Single-cell sequencing data has great potential in reconstructing the evolutionary history of tumors. Rapid advances in single-cell sequencing technology in the past decade were followed by the design of various computational methods for inferring trees of tumor evolution. Some of the earliest of these methods were based on a direct search in the space of trees. However, it can be shown that instead of this tree search strategy we can perform a search in the space of binary matrices and obtain the most likely tree directly from the most likely among the candidate binary matrices. The search in the space of binary matrices can be expressed as an instance of integer linear or constraint satisfaction programming and solved by some of the available solvers, which typically provide a guarantee of optimality of the reported solution. In this review, we first describe one convenient tree representation of tumor evolutionary history and present the tree scoring model that is most commonly used in the available methods. We then provide a proof showing that the most likely tree of tumor evolution can be obtained directly from the most likely matrix from the space of candidate binary matrices. Next, we provide an integer linear programming formulation to search for such a matrix and summarize the existing methods based on this formulation or its extensions. Lastly, we present one use case which illustrates how binary matrices can be used as a basis for developing a fast deep learning method for inferring some topological properties of the most likely tree of tumor evolution.
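
To make the matrix view concrete: a common assumption in this setting, not stated in the abstract itself, is that the candidate binary matrices (rows as cells, columns as mutations) are those that are conflict-free, meaning no pair of columns contains all three row patterns (1,1), (1,0) and (0,1). Such a matrix corresponds to a tree in which each mutation is gained once and never lost. The sketch below only checks that property; it does not perform the integer linear programming search described in the review, and the function name is hypothetical.

```python
from itertools import combinations


def is_conflict_free(matrix):
    """Check the three-gametes rule on a binary cell-by-mutation matrix.

    matrix is a list of equal-length rows of 0/1 values. Two mutations
    are compatible with a single tree (under the assumed no-loss,
    no-recurrence model) when the sets of cells carrying them are
    either disjoint or nested.
    """
    if not matrix:
        return True
    n_cols = len(matrix[0])
    # For each mutation, the set of cells (row indices) that carry it.
    carriers = [
        {i for i, row in enumerate(matrix) if row[j] == 1}
        for j in range(n_cols)
    ]
    for a, b in combinations(carriers, 2):
        # Overlapping but not nested: the pair cannot sit on one tree.
        if (a & b) and (a - b) and (b - a):
            return False
    return True


if __name__ == "__main__":
    tree_like = [[1, 0, 0], [1, 1, 0], [1, 1, 0], [0, 0, 1]]
    conflicting = [[1, 1], [1, 0], [0, 1]]
    print(is_conflict_free(tree_like))
    print(is_conflict_free(conflicting))
```

A search procedure in this spirit would look for the conflict-free matrix that best explains the noisy observed matrix and then read the tree off the nesting of the columns.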


2020, Vol 12 (3), pp. 66-76
Author(s): Wencheng Zong, Bo Gao, Mohamed Diaby, Dan Shen, Saisai Wang, ...

The discovery of new members of the Tc1/mariner superfamily of transposons is expected based on the increasing availability of genome sequencing data. Here, we identified a new DD35E family termed Traveler (TR). Phylogenetic analyses of its DDE domain and full-length transposase showed that, although TR formed a monophyletic clade, it exhibited the highest sequence identity and closest phylogenetic relationship with DD34E/Tc1. This family displayed a very restricted taxonomic distribution in the animal kingdom and was only detected in ray-finned fish, Anura, and Squamata, including 91 vertebrate species. The structural organization of TRs was highly conserved across different classes of animals. Most intact TR transposons had a length of ∼1.5 kb (range 1,072–2,191 bp) and harbored a single open reading frame encoding a transposase of ∼340 aa (range 304–350 aa) flanked by two short terminal inverted repeats (13–68 bp). Several conserved motifs, including two helix-turn-helix motifs, a GRPR motif, a nuclear localization sequence, and a DDE domain, were also identified in TR transposases. This study also demonstrated the presence of horizontal transfer events of TRs in vertebrates, whereas the average sequence identities and the evolutionary dynamics of TR elements across species and clusters, together with the intact TR copies found in multiple lineages of vertebrates, strongly indicated that the TR family invaded the vertebrate lineage very recently and that some of these elements may be currently active. These data will contribute to the understanding of the evolutionary history of Tc1/mariner transposons and that of their hosts.

