Supertrees based on the subtree prune-and-regraft distance

Author(s):  
Chris Whidden ◽  
Norbert Zeh ◽  
Robert G Beiko

Supertree methods reconcile a set of phylogenetic trees into a single structure that is often interpreted as a branching history of species. A key challenge is combining conflicting evolutionary histories that are due to artifacts of phylogenetic reconstruction and phenomena such as lateral gene transfer (LGT). Although they often work well in practice, existing supertree approaches use optimality criteria that do not reflect underlying processes, have known biases and may be unduly influenced by LGT. We present the first method to construct supertrees by using the subtree prune-and-regraft (SPR) distance as an optimality criterion. Although calculating the rooted SPR distance between a pair of trees is NP-hard, our new maximum agreement forest-based methods can reconcile trees with hundreds of taxa and > 50 transfers in fractions of a second, which enables repeated calculations during the course of an iterative search. Our approach can accommodate trees in which uncertain relationships have been collapsed to multifurcating nodes. Using a series of simulated benchmark datasets, we show that SPR supertrees are more similar to correct species histories under plausible rates of LGT than supertrees based on parsimony or Robinson-Foulds distance criteria. We successfully constructed an SPR supertree from a phylogenomic dataset of 40,631 gene trees that covered 244 genomes representing several major bacterial phyla. Our SPR-based approach also allowed direct inference of highways of gene transfer between bacterial classes and genera; a small number of these highways connect genera in different phyla and can highlight specific genes implicated in long-distance LGT.
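The idea of scoring a candidate supertree against a set of gene trees can be sketched in a few lines. The sketch below is an illustration only, not the authors' method: it uses the rooted Robinson-Foulds distance (the count of clades found in one tree but not the other) as a stand-in, because the SPR distance is NP-hard and the maximum agreement forest algorithms from the abstract are not reproduced here. The nested-tuple tree encoding and the names `restrict`, `clades`, `rf_distance` and `supertree_score` are assumptions made for this example, and the candidate is assumed to contain every taxon that appears in the gene trees.

```python
from typing import FrozenSet, Optional, Set, Tuple, Union

# Hypothetical encoding: a leaf is a taxon name, an internal node is a tuple of children.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of taxon names at the leaves of the tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def restrict(tree: Tree, keep: FrozenSet[str]) -> Optional[Tree]:
    """Prune the tree to the taxa in `keep`, suppressing nodes left with one child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return tuple(kids)


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the leaf set below every internal node."""
    found: Set[FrozenSet[str]] = set()

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        found.add(below)
        return below

    walk(tree)
    return found


def rf_distance(a: Tree, b: Tree) -> int:
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clades(a) ^ clades(b))


def supertree_score(candidate: Tree, gene_trees) -> int:
    """Sum of distances between each gene tree and the candidate pruned to its taxa."""
    return sum(rf_distance(restrict(candidate, leaves(g)), g) for g in gene_trees)


candidate = ((("A", "B"), "C"), ("D", "E"))
gene_trees = [
    (("A", "B"), "C"),         # agrees with the candidate
    ((("A", "C"), "B"), "D"),  # groups A with C instead of A with B
]
print(supertree_score(candidate, gene_trees))  # 2
```

A supertree search would compare this score across many candidate trees and keep the lowest. Replacing `rf_distance` with an SPR distance routine would give the criterion described in the abstract, at a much higher computational cost per comparison.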

2013 ◽  
Author(s):  
Chris Whidden ◽  
Norbert Zeh ◽  
Robert G Beiko


2009 ◽  
Vol 364 (1527) ◽  
pp. 2229-2239 ◽  
Author(s):  
Gregory P. Fournier ◽  
Jinling Huang ◽  
J. Peter Gogarten

Horizontal gene transfer (HGT) is often considered to be a source of error in phylogenetic reconstruction, causing individual gene trees within an organismal lineage to be incongruent, obfuscating the ‘true’ evolutionary history. However, when identified as such, HGTs between divergent organismal lineages are useful, phylogenetically informative characters that can provide insight into evolutionary history. Here, we discuss several distinct HGT events involving all three domains of life, illustrating the selective advantages that can be conveyed via HGT, and the utility of HGT in aiding phylogenetic reconstruction and in dating the relative sequence of speciation events. We also discuss the role of HGT from extinct lineages, and its impact on our understanding of the evolution of life on Earth. Organismal phylogeny needs to incorporate reticulations; a simple tree does not provide an accurate depiction of the processes that have shaped life's history.


F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 1805 ◽  
Author(s):  
Eugene V. Koonin

The wide spread of gene exchange and loss in the prokaryotic world has prompted the concept of ‘lateral genomics’ to the point of an outright denial of the relevance of phylogenetic trees for evolution. However, the pronounced coherence and congruence of the topologies of numerous gene trees, particularly those for (nearly) universal genes, translates into the notion of a statistical tree of life (STOL), which reflects a central trend of vertical evolution. The STOL can be employed as a framework for reconstruction of the evolutionary processes in the prokaryotic world. Quantitatively, however, horizontal gene transfer (HGT) dominates microbial evolution, with the rate of gene gain and loss being comparable to the rate of point mutations and much greater than the duplication rate. Theoretical models of evolution suggest that HGT is essential for the survival of microbial populations that otherwise deteriorate due to Muller’s ratchet. Apparently, at least some bacteria and archaea evolved dedicated vehicles for gene transfer that evolved from selfish elements such as plasmids and viruses. Recent phylogenomic analyses suggest that episodes of massive HGT were pivotal for the emergence of major groups of organisms such as multiple archaeal phyla as well as eukaryotes. Similar analyses appear to indicate that, in addition to donating hundreds of genes to the emerging eukaryotic lineage, mitochondrial endosymbiosis severely curtailed HGT. These results shed new light on the routes of evolutionary transitions, but caution is due given the inherent uncertainty of deep phylogenies.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Koji Takayama ◽  
Yoichi Tateishi ◽  
Tadashi Kajita

Rhizophora is a key genus for revealing the formation process of the pantropical distribution of mangroves. In this study, to understand the historical scenario by which Rhizophora achieved its pantropical distribution, we conducted phylogeographic analyses based on nucleotide sequences of chloroplast and nuclear DNA as well as microsatellites for samples collected worldwide. Phylogenetic trees suggested the monophyly of each of the AEP and IWP lineages, except for R. samoensis and R. × selala. The divergence time between the two lineages was 10.6 million years ago on a dated phylogeny, and biogeographic stochastic mapping analyses supported the separation of these lineages following a vicariant event. These data suggested that the closure of the Tethys Seaway and the reduction in mangrove distribution followed by Mid-Miocene cooling were key factors that caused the lineage diversification. Phylogeographic analyses also suggested the formation of the distinctive genetic structure in the AEP region across the American continents around the Pliocene. Furthermore, long-distance trans-Pacific dispersal occurred from the Pacific coast of the American continents to the South Pacific and formed F1 hybrids, resulting in gene exchange between the IWP and AEP lineages after 11 million years of isolation. Considering the phylogeny and phylogeography together with the divergence time, we update the comprehensive picture of the historical scenario behind the pantropical distribution of Rhizophora.


2021 ◽  
Vol 82 (1-2) ◽  
Author(s):  
Lena Collienne ◽  
Alex Gavryushkin

Many popular algorithms for searching the space of leaf-labelled (phylogenetic) trees are based on tree rearrangement operations. Under any such operation, the problem is reduced to searching a graph where vertices are trees and (undirected) edges are given by pairs of trees connected by one rearrangement operation (sometimes called a move). Most popular are the classical nearest neighbour interchange, subtree prune and regraft, and tree bisection and reconnection moves. The problem of computing distances, however, is $\mathbf{NP}$-hard in each of these graphs, making tree inference and comparison algorithms challenging to design in practice. Although ranked phylogenetic trees are one of the central objects of interest in applications such as cancer research, immunology, and epidemiology, the computational complexity of the shortest path problem for these trees remained unsolved for decades. In this paper, we settle this problem for the ranked nearest neighbour interchange operation by establishing that the complexity depends on the weight difference between the two types of tree rearrangements (rank moves and edge moves), and varies from quadratic, which is the lowest possible complexity for this problem, to $\mathbf{NP}$-hard, which is the highest. In particular, our result provides the first example of a phylogenetic tree rearrangement operation for which shortest paths, and hence the distance, can be computed efficiently. Specifically, our algorithm scales to trees with tens of thousands of leaves (and likely hundreds of thousands if implemented efficiently).
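To make the "graph of trees" picture concrete, the following sketch enumerates the neighbours of one tree under the classical nearest neighbour interchange (NNI) move. It is a minimal illustration under stated assumptions: trees are rooted, binary and unranked, encoded as nested 2-tuples of taxon names, and the helper names `canon` and `nni_neighbours` are invented for this example. It does not implement the ranked NNI operation (with rank moves and edge moves) studied in the paper.

```python
def canon(tree):
    """Canonical form: sort children recursively so equal trees compare equal."""
    if isinstance(tree, str):
        return tree
    return tuple(sorted((canon(child) for child in tree), key=repr))


def nni_neighbours(tree):
    """Return the set of rooted binary trees exactly one NNI move from `tree`."""

    def variants(node):
        # Yield every copy of `node` with exactly one NNI applied inside it.
        if isinstance(node, str):
            return
        left, right = node
        # NNI across the edge from `node` to an internal child:
        # swap the sibling with one of that child's two subtrees.
        for child, sibling in ((left, right), (right, left)):
            if not isinstance(child, str):
                x, y = child
                yield ((sibling, y), x)
                yield ((x, sibling), y)
        # Otherwise apply the move deeper in one of the two subtrees.
        for changed in variants(left):
            yield (changed, right)
        for changed in variants(right):
            yield (left, changed)

    return {canon(t) for t in variants(tree)} - {canon(tree)}


tree = ((("A", "B"), "C"), "D")
for neighbour in sorted(nni_neighbours(tree), key=repr):
    print(neighbour)
```

Each returned tree is one edge away from the input in the NNI graph, so a breadth-first search over `nni_neighbours` finds shortest paths on very small trees. The hardness results quoted above explain why that brute-force approach does not scale.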


2013 ◽  
Vol 51 ◽  
pp. 202-213 ◽  
Author(s):  
Vincent Pernet ◽  
Sandrine Joly ◽  
Deniz Dalkara ◽  
Noémie Jordi ◽  
Olivia Schwarz ◽  
...  

2021 ◽  
Author(s):  
Jonathan Filee ◽  
Hubert J. Becker ◽  
Lucille Mellottee ◽  
Zhihui LI ◽  
Jean-Christophe Lambry ◽  
...  

Little is known about the evolution and biosynthetic function of DNA precursor and the folate metabolism in the Asgard group of archaea. As Asgard occupy a key position in the archaeal and eukaryotic phylogenetic trees, we have exploited very recently emerged genome and metagenome sequence information to investigate these central metabolic pathways. Our genome-wide analyses revealed that the recently cultured Asgard archaeon Candidatus Prometheoarchaeum syntrophicum strain MK-D1 (Psyn) contains a complete folate-dependent network for the biosynthesis of DNA/RNA precursors, amino acids and syntrophic amino acid utilization. Altogether our experimental and computational data suggest that phylogenetic incongruences of functional folate-dependent enzymes from Asgard archaea reflect their persistent horizontal transmission from various bacterial groups, which has rewired the key metabolic reactions in an important and recently identified archaeal phylogenetic group. We also experimentally validated the functionality of the lateral gene transfer of Psyn thymidylate synthase ThyX. This enzyme uses bacterial-like folates efficiently and is inhibited by mycobacterial ThyX inhibitors. Our data raise the possibility that the thymidylate metabolism, required for de novo DNA synthesis, originated in bacteria and has been independently transferred to archaea and eukaryotes. In conclusion, our study has revealed that recent prevalent lateral gene transfer has markedly shaped the evolution of Asgard archaea by allowing them to adapt to specific ecological niches.


2014 ◽  
Vol 80 (17) ◽  
pp. 5503-5514 ◽  
Author(s):  
Christophe Habib ◽  
Armel Houel ◽  
Aurélie Lunazzi ◽  
Jean-François Bernardet ◽  
Anne Berit Olsen ◽  
...  

The genus Tenacibaculum, a member of the family Flavobacteriaceae, is an abundant component of marine bacterial ecosystems that also hosts several fish pathogens, some of which are of serious concern for marine aquaculture. Here, we applied multilocus sequence analysis (MLSA) to 114 representatives of most known species in the genus and of the worldwide diversity of the major fish pathogen Tenacibaculum maritimum. Recombination hampers precise phylogenetic reconstruction, but the data indicate intertwined environmental and pathogenic lineages, which suggests that pathogenicity evolved independently in several species. At lower phylogenetic levels recombination is also important, and the species T. maritimum constitutes a cohesive group of isolates. Importantly, the data reveal no trace of long-distance dissemination that could be linked to international fish movements. Instead, the high number of distinct genotypes suggests an endemic distribution of strains. The MLSA scheme and the data described in this study will help in monitoring Tenacibaculum infections in marine aquaculture; we show, for instance, that isolates from tenacibaculosis outbreaks in Norwegian salmon farms are related to T. dicentrarchi, a recently described species.


Development ◽  
1994 ◽  
Vol 1994 (Supplement) ◽  
pp. 15-25
Author(s):  
Hervé Philippe ◽  
Anne Chenuil ◽  
André Adoutte

Most of the major invertebrate phyla appear in the fossil record during a relatively short time interval, not exceeding 20 million years (Myr), 540-520 Myr ago. This rapid diversification is known as the 'Cambrian explosion'. In the present paper, we ask whether molecular phylogenetic reconstruction provides confirmation for such an evolutionary burst. The expectation is that the molecular phylogenetic trees should take the form of a large unresolved multifurcation of the various animal lineages. Complete 18S rRNA sequences of 69 extant representatives of 15 animal phyla were obtained from data banks. After eliminating a major source of artefact leading to lack of resolution in phylogenetic trees (mutational saturation of sequences), we indeed observe that the major lines of triploblast coelomates (arthropods, molluscs, echinoderms, chordates...) are very poorly resolved, i.e. the nodes defining the various clades are not supported by high bootstrap values. Using a previously developed procedure consisting of calculating bootstrap proportions of each node of the tree as a function of increasing amount of nucleotides (Lecointre, G., Philippe, H. Le, H. L. V. and Le Guyader, H. (1994) Mol. Phyl. Evol., in press) we obtain a more informative indication of the robustness of each node. In addition, this procedure allows us to estimate the number of additional nucleotides that would be required to resolve confidently the currently uncertain nodes; this number turns out to be extremely high and experimentally unfeasible. We then take this approach one step further: using parameters derived from the above analysis, assuming a molecular clock and using palaeontological dates for calibration, we establish a relationship between the number of sites contained in a given data set and the time interval that this data set can confidently resolve (with 95% bootstrap support). Under these assumptions, the presently available 18S rRNA database cannot confidently resolve cladogenetic events separated by less than about 40 Myr. Thus, at the present time, the potential resolution by the palaeontological approach is higher than that by the molecular one.

