Synthesis of phylogeny and taxonomy into a comprehensive tree of life

Mapping Intimacies ◽

10.1101/012260 ◽

2014 ◽

Cited By ~ 2

Author(s):

Cody Hinchliff ◽

Stephen A Smith ◽

James F Allman ◽

J Gordon Burleigh ◽

Ruchi Chaudhary ◽

...

Keyword(s):

Phylogenetic Trees ◽

Biological Diversity ◽

Phylogenetic Reconstruction ◽

Tree Of Life ◽

Grand Challenge ◽

Community Resources ◽

Starting Point ◽

Community Contribution ◽

Fundamental Research ◽

Digital Objects

Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny. We present a draft tree containing 2.3 million tips -- the Open Tree of Life. Realization of this tree required the assembly of two additional community resources: 1) a novel comprehensive global reference taxonomy; and 2) a database of published phylogenetic trees mapped to this taxonomy. Our open source framework facilitates community comment and contribution, enabling the tree to be continuously updated when new phylogenetic and taxonomic data become digitally available. While data coverage and phylogenetic conflict across the Open Tree of Life illuminate gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point for community contribution. This comprehensive tree will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change, agriculture, and genomics.

Synthesis of phylogeny and taxonomy into a comprehensive tree of life

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1423041112 ◽

2015 ◽

Vol 112 (41) ◽

pp. 12764-12769 ◽

Cited By ~ 340

Author(s):

Cody E. Hinchliff ◽

Stephen A. Smith ◽

James F. Allman ◽

J. Gordon Burleigh ◽

Ruchi Chaudhary ◽

...

Keyword(s):

Phylogenetic Trees ◽

Biological Diversity ◽

Phylogenetic Reconstruction ◽

Tree Of Life ◽

Grand Challenge ◽

Community Resources ◽

Starting Point ◽

Community Contribution ◽

Fundamental Research ◽

Digital Objects

Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life, we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny. We present a draft tree containing 2.3 million tips—the Open Tree of Life. Realization of this tree required the assembly of two additional community resources: (i) a comprehensive global reference taxonomy and (ii) a database of published phylogenetic trees mapped to this taxonomy. Our open source framework facilitates community comment and contribution, enabling the tree to be continuously updated when new phylogenetic and taxonomic data become digitally available. Although data coverage and phylogenetic conflict across the Open Tree of Life illuminate gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point for community contribution. This comprehensive tree will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change, agriculture, and genomics.

Physcraper: a Python package for continually updated phylogenetic trees using the Open Tree of Life

BMC Bioinformatics ◽

10.1186/s12859-021-04274-6 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Luna L. Sánchez-Reyes ◽

Martha Kandziora ◽

Emily Jane McTavish

Keyword(s):

DNA Sequences ◽

Phylogenetic Trees ◽

Phylogenetic Reconstruction ◽

Open Science ◽

Tree Of Life ◽

Matrix Assembly ◽

Life Project ◽

Character Matrix ◽

Molecular Dataset ◽

Phylogenetic Hypotheses

Background: Phylogenies are a key part of research in many areas of biology. Tools that automate parts of phylogenetic reconstruction, mainly molecular character matrix assembly, have been developed for the benefit of both phylogenetics specialists and non-specialists. However, interpreting results, comparing them with previously available phylogenetic hypotheses, and selecting one phylogeny for downstream analyses and discussion remain difficult for anyone who is not a specialist in phylogenetic methods or in a particular study group. Results: Physcraper is a command-line Python program that automates the update of published phylogenies by adding public DNA sequences to the underlying alignments of previously published phylogenies. It also provides a framework for straightforward comparison of published phylogenies with their updated versions, using tools from the Open Tree of Life project to link taxonomic information across databases. The program can be used by the non-specialist as a tool to generate phylogenetic hypotheses based on publicly available expert phylogenetic knowledge. Phylogeneticists and taxonomic group specialists will find it useful as a tool to facilitate molecular dataset gathering and comparison of alternative phylogenetic hypotheses (topologies). Conclusion: The Physcraper workflow showcases the benefits of doing open science for phylogenetics, encouraging researchers to strive for better scientific sharing practices. Physcraper can be used with any OS and is released under an open-source license. Detailed instructions for installation and usage are available at https://physcraper.readthedocs.io.

Phylogenetic Signal, Congruence, and Uncertainty across Bacteria and Archaea

Molecular Biology and Evolution ◽

10.1093/molbev/msab254 ◽

2021 ◽

Author(s):

Carolina A Martinez-Gutierrez ◽

Frank O Aylward

Keyword(s):

Phylogenetic Trees ◽

Phylogenetic Signal ◽

Phylogenetic Reconstruction ◽

Sister Group ◽

Tree Of Life ◽

Marker Genes ◽

Sequence Composition ◽

Tree Construction ◽

Taxonomic Groups ◽

The Impact

Reconstruction of the Tree of Life is a central goal in biology. Although numerous novel phyla of bacteria and archaea have recently been discovered, inconsistent phylogenetic relationships are routinely reported, and many inter-phylum and inter-domain evolutionary relationships remain unclear. Here, we benchmark different marker genes often used in constructing multidomain phylogenetic trees of bacteria and archaea and present a set of marker genes that perform best for multidomain trees constructed from concatenated alignments. We use recently developed Tree Certainty metrics to assess the confidence of our results and to obviate the complications of traditional bootstrap-based metrics. Given the vastly disparate number of genomes available for different phyla of bacteria and archaea, we also assessed the impact of taxon sampling on multidomain tree construction. Our results demonstrate that biases in the representation of different taxonomic groups can dramatically impact the topology of resulting trees. Inspection of our highest-quality tree supports the division of most bacteria into Terrabacteria and Gracilicutes, with Thermotogota and Synergistota branching earlier from these superphyla. This tree also supports the inclusion of the Patescibacteria within the Terrabacteria as a sister group to the Chloroflexota instead of as a basal-branching lineage. For the Archaea, our tree supports three monophyletic lineages (DPANN, Euryarchaeota, and TACK/Asgard), although we note the basal placement of the DPANN may still represent an artifact caused by biased sequence composition. Our findings provide a robust and standardized framework for multidomain phylogenetic reconstruction that can be used to evaluate inter-phylum relationships and assess uncertainty in conflicting topologies of the Tree of Life.
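
The abstract does not define its Tree Certainty metrics. As a rough illustration only, the sketch below uses the commonly cited two-bipartition form of internode certainty (IC), in which each internal branch is scored from the support for its most prevalent bipartition and for the strongest conflicting one, and tree certainty (TC) is the sum of IC over branches. Whether the paper uses this variant or a later one is an assumption; the function names and toy counts are hypothetical.

```python
import math

def internode_certainty(support_a, support_b):
    """IC for one internal branch from two conflicting bipartition counts.

    Assumed form: IC = log2(2) + sum(p * log2(p)) over the two
    normalized support values. Ranges from 0 (equal conflict) to 1.
    """
    total = support_a + support_b
    if total <= 0:
        raise ValueError("at least one bipartition must have support")
    ic = 1.0  # log2(2): the maximum entropy for two alternatives
    for count in (support_a, support_b):
        p = count / total
        if p > 0:  # the term p * log2(p) tends to 0 as p tends to 0
            ic += p * math.log2(p)
    return ic

def tree_certainty(branch_supports):
    """Sum of IC over all internal branches (pairs of support counts)."""
    return sum(internode_certainty(a, b) for a, b in branch_supports)

# Hypothetical counts: one uncontested branch, one evenly contested branch.
toy_branches = [(100, 0), (50, 50)]
toy_tc = tree_certainty(toy_branches)
```

Under this definition an uncontested branch contributes 1 and an evenly contested branch contributes 0, which is the sense in which such metrics capture conflict that a single bootstrap value does not.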

Beyond Phylogeny Reconstruction—Tree-Based Analyses in Paleontology: Foreword

Paleobiology ◽

10.1666/0094-8373(2001)027<0187:bprtba>2.0.co;2 ◽

2001 ◽

Vol 27 (2) ◽

pp. 187-187

Author(s):

Lisa Park ◽

Andrew B. Smith

Keyword(s):

Statistical Inference ◽

Phylogenetic Trees ◽

Phylogenetic Reconstruction ◽

Phylogeny Reconstruction ◽

Exact Methods ◽

The Past ◽

Tree Construction ◽

Starting Point ◽

Wide Range ◽

Phylogenetic Hypotheses

The reconstruction of phylogenies using cladistic methods is a powerful and well-established tool for evolutionary biologists and paleobiologists. Indeed, the construction of rigorous phylogenetic hypotheses has become widely accepted as an essential first step in the analysis of historical patterns for both extant and extinct organisms. In the past few years, there has arisen a healthy and constructive debate as to the exact methods that will lead to the most accurate tree (for example, whether statistical inference or stratigraphic information has any part to play in phylogenetic reconstruction). Although important, this debate has tended to focus on the problems of tree construction and divert attention away from the applications of tree-based research. The construction of a phylogeny is, after all, only a first step, and phylogenetic trees provide the starting point from which to address a wide range of interesting biological and geological topics.

Discovery of an Unexpected Similarity in Ligand Binding Between BRD4 and PPARγ

10.26434/chemrxiv.11472618.v1 ◽

2019 ◽

Author(s):

Lina Humbeck ◽

Jette Pretzel ◽

Saskia Spitzer ◽

Oliver Koch

Keyword(s):

Cancer Therapy ◽

Drug Targets ◽

Synergistic Effects ◽

Complex Structure ◽

Peroxisome Proliferator ◽

Resistance Development ◽

Peroxisome Proliferator Activated Receptor ◽

Starting Point ◽

Fundamental Research ◽

Important Drug

Knowledge about interrelationships between different proteins is crucial in fundamental research for the elucidation of protein networks and pathways. Furthermore, it is especially critical in chemical biology to identify further key regulators of a disease and to take advantage of polypharmacology effects. A comprehensive scaffold-based analysis uncovered an unexpected relationship between bromodomain-containing protein 4 (BRD4) and peroxisome proliferator-activated receptor gamma (PPARγ). Both are important drug targets for cancer therapy and many other diseases. The two proteins share binding site similarities near a common hydrophobic subpocket, which should allow the design of a polypharmacology-based ligand targeting both proteins. Such a dual BRD4-PPARγ modulator could show synergistic effects with higher efficacy or delayed resistance development in, for example, cancer therapy. In addition, a complex structure of sulfasalazine was obtained that involves two bromodomains and could be a potential starting point for the design of a bivalent BRD4 inhibitor.

Genome-scale reconstructions to assess metabolic phylogeny and organism clustering

PLoS ONE ◽

10.1371/journal.pone.0240953 ◽

2020 ◽

Vol 15 (12) ◽

pp. e0240953

Author(s):

Christian Schulz ◽

Eivind Almaas

Keyword(s):

Phylogenetic Trees ◽

Metabolic Networks ◽

Sulfur Metabolism ◽

Phylogenetic Analyses ◽

Tree Of Life ◽

Significant Heterogeneity ◽

Metabolic Reaction ◽

High Quality ◽

Conserved Genes ◽

Genome Scale

Approaches for systematizing information on relatedness between organisms are important in biology. Phylogenetic analyses based on sets of highly conserved genes are currently the basis for the Tree of Life. Genome-scale metabolic reconstructions contain high-quality information regarding the metabolic capability of an organism and are typically restricted to metabolically active enzyme-encoding genes. While there are many tools available to generate draft reconstructions, expert-level knowledge is still required to generate and manually curate high-quality genome-scale metabolic models and to fill gaps in their reaction networks. Here, we use the tool AutoKEGGRec to construct 975 genome-scale metabolic draft reconstructions encoded in the KEGG database without further curation. The organisms are selected across all three domains, and their metabolic networks serve as the basis for generating phylogenetic trees. We find that, using all reactions encoded, these metabolism-based comparisons give rise to a phylogenetic tree with close similarity to the Tree of Life. While this tree is quite robust to reasonable levels of noise in the metabolic reaction content of an organism, we find significant heterogeneity in how much noise an organism may tolerate before it is incorrectly placed in the tree. Furthermore, by using the protein sequences for particular metabolic functions and pathway sets, such as central carbon, nitrogen, and sulfur metabolism, as the basis for the organism comparisons, we generate highly specific phylogenetic trees. We believe the generation of phylogenetic trees based on metabolic reaction content, in particular when focused on specific functions and pathways, could aid the identification of functionally important metabolic enzymes and be of value for genome-scale metabolic modellers and enzyme engineers.
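
The abstract does not say how reaction content is turned into distances between organisms. One simple possibility, shown here purely as an assumption, is the Jaccard distance between the sets of reactions present in each draft reconstruction; the resulting pairwise matrix could then be handed to any distance-based tree method such as neighbor-joining or UPGMA. The organism names and reaction identifiers below are hypothetical toy data, not values from the study.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def distance_matrix(reactions):
    """Pairwise distances keyed by alphabetically ordered organism pairs."""
    return {
        (x, y): jaccard_distance(reactions[x], reactions[y])
        for x, y in combinations(sorted(reactions), 2)
    }

# Hypothetical toy data: organism -> set of reaction identifiers.
toy_reactions = {
    "org_A": {"R1", "R2", "R3", "R4"},
    "org_B": {"R1", "R2", "R3", "R5"},
    "org_C": {"R1", "R6", "R7"},
}

toy_dist = distance_matrix(toy_reactions)
# The pair with the smallest distance would be joined first by a
# distance-based clustering method.
closest_pair = min(toy_dist, key=toy_dist.get)
```

In this toy case org_A and org_B share three of five reactions and would be grouped first. The noise experiments described in the abstract could be mimicked by randomly adding or removing reactions from a set and checking whether the closest pair changes.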

Identification key for anuran amphibians in a protected area in the northeastern Atlantic Forest

Papéis Avulsos de Zoologia ◽

10.11606/1807-0205/2021.61.76 ◽

2021 ◽

Vol 61 ◽

pp. e20216176

Author(s):

Marcos Jorge Matias Dubeux ◽

Filipe Augusto Cavalcanti do Nascimento ◽

Ubiratan Gonçalves ◽

Tamí Mott

Keyword(s):

Atlantic Forest ◽

Biological Diversity ◽

Morphological Characteristics ◽

Identification Key ◽

Anuran Amphibians ◽

Anuran Species ◽

Starting Point ◽

Environmental Protection Area ◽

Baseline Information ◽

Northeastern Atlantic

The identification of anuran amphibians is still a challenge in megadiverse assemblages. In the Neotropics, the Atlantic Forest harbors more than 600 anuran species, and many studies in this ecoregion report anuran assemblages surpassing 30 species. Taxonomic keys facilitate the identification of biological diversity; however, only a few are available for anuran assemblages in the Atlantic Forest. Herein we present an identification key for 40 anuran species distributed across 20 genera and nine families, occurring in the Environmental Protection Area of Catolé and Fernão Velho, northeastern Atlantic Forest. Thirty-five morphological characteristics were used in the key, all of which can be easily observed in living and museum specimens. This pioneering study provides the first identification key for an amphibian assemblage in the northeastern Atlantic Forest, and this baseline information serves as the starting point for the development of evolutionary and ecological research in this conservation unit.

The Geographic Structure of Viruses in the Cuatro Ciénegas Basin, a Unique Oasis in Northern Mexico, Reveals a Highly Diverse Population on a Small Geographic Scale

Applied and Environmental Microbiology ◽

10.1128/aem.00465-18 ◽

2018 ◽

Vol 84 (11) ◽

Cited By ~ 17

Author(s):

B. Taboada ◽

P. Isa ◽

A. L. Gutiérrez-Escolano ◽

R. M. del Ángel ◽

J. E. Ludert ◽

...

Keyword(s):

Fish Species ◽

Phylogenetic Trees ◽

Biological Diversity ◽

Chihuahuan Desert ◽

Geographic Scale ◽

Geographic Structure ◽

Virus Diversity ◽

Intestinal Contents ◽

Cuatro Ciénegas ◽

Cuatro Ciénegas Basin

The Cuatro Ciénegas Basin (CCB) is located in the Chihuahuan desert in the Mexican state of Coahuila; it has been characterized as a site with high biological diversity despite its extreme oligotrophic conditions. It has the greatest number of endemic species in North America, containing abundant living microbialites (including stromatolites and microbial mats) and diverse microbial communities. With the hypothesis that this high biodiversity and the geographic structure should be reflected in the virome, the viral communities in 11 different locations of three drainage systems, Churince, La Becerra, and Pozas Rojas, and in the intestinal contents of 3 different fish species, were analyzed for both eukaryotic and prokaryotic RNA and DNA viruses using next-generation sequencing methods. Double-stranded DNA (dsDNA) virus families were the most abundant (72.5% of reads), followed by single-stranded DNA (ssDNA) viruses (2.9%) and ssRNA and dsRNA virus families (0.5%). Thirteen families had dsDNA genomes, five had ssDNA, three had dsRNA, and 16 had ssRNA. A highly diverse viral community was found, with an ample range of hosts and a strong geographical structure, with very even distributions and signals of endemicity in the phylogenetic trees from several different virus families. The majority of viruses found were bacteriophages, but eukaryotic viruses were also frequent, and the large diversity of viruses related to algae was a surprise, since algae are not evident in the previously analyzed aquatic systems of this ecosystem. Animal viruses were also frequently found, showing the large diversity of aquatic animals in this oasis, where plants, protozoa, and archaea are rare. Importance: In this study, we tested whether the high biodiversity and geographic structure of CCB is reflected in its virome. CCB is an extraordinarily biodiverse oasis in the Chihuahuan desert, where a previous virome study suggested that viruses had followed the marine ancestry of the marine bacteria and, as a result of their long isolation, became endemic to the site. In this study, which includes a larger sequencing coverage and water samples from other sites within the valley, we confirmed the high virus biodiversity and uniqueness as well as the strong biogeographical diversification of the CCB. In addition, we also analyzed fish intestinal contents, finding that each fish species eats different prey and, as a result, presents different viral compositions even if they coexist in the same pond. These facts highlight the high and novel virus diversity of CCB and its “lost world” status.

Can the Cambrian explosion be inferred through molecular phylogeny?

Development ◽

10.1242/dev.1994.supplement.15 ◽

1994 ◽

Vol 1994 (Supplement) ◽

pp. 15-25

Author(s):

Hervé Philippe ◽

Anne Chenuil ◽

André Adoutte

Keyword(s):

Phylogenetic Trees ◽

18S rRNA ◽

Phylogenetic Reconstruction ◽

Cambrian Explosion ◽

Bootstrap Support ◽

Time Interval ◽

Short Time Interval ◽

Data Set ◽

Molecular Phylogenetic ◽

Animal Phyla

Most of the major invertebrate phyla appear in the fossil record during a relatively short time interval, not exceeding 20 million years (Myr), 540-520 Myr ago. This rapid diversification is known as the 'Cambrian explosion'. In the present paper, we ask whether molecular phylogenetic reconstruction provides confirmation for such an evolutionary burst. The expectation is that the molecular phylogenetic trees should take the form of a large unresolved multifurcation of the various animal lineages. Complete 18S rRNA sequences of 69 extant representatives of 15 animal phyla were obtained from data banks. After eliminating a major source of artefact leading to lack of resolution in phylogenetic trees (mutational saturation of sequences), we indeed observe that the major lines of triploblast coelomates (arthropods, molluscs, echinoderms, chordates...) are very poorly resolved, i.e., the nodes defining the various clades are not supported by high bootstrap values. Using a previously developed procedure consisting of calculating bootstrap proportions of each node of the tree as a function of an increasing number of nucleotides (Lecointre, G., Philippe, H., Le, H. L. V. and Le Guyader, H. (1994) Mol. Phyl. Evol., in press), we obtain a more informative indication of the robustness of each node. In addition, this procedure allows us to estimate the number of additional nucleotides that would be required to resolve confidently the currently uncertain nodes; this number turns out to be extremely high and experimentally unfeasible. We then take this approach one step further: using parameters derived from the above analysis, assuming a molecular clock and using palaeontological dates for calibration, we establish a relationship between the number of sites contained in a given data set and the time interval that this data set can confidently resolve (with 95% bootstrap support). Under these assumptions, the presently available 18S rRNA database cannot confidently resolve cladogenetic events separated by less than about 40 Myr. Thus, at the present time, the potential resolution by the palaeontological approach is higher than that by the molecular one.

New Insights on the Evolution of the Sweet Taste Receptor of Primates Adapted to Harsh Environments

Animals ◽

10.3390/ani10122359 ◽

2020 ◽

Vol 10 (12) ◽

pp. 2359

Author(s):

Nur Aida Md Tamrin ◽

Ramlah Zainudin ◽

Yuzine Esa ◽

Halimah Alias ◽

Mohd Noor Mat Isa ◽

...

Keyword(s):

Phylogenetic Trees ◽

Environmental Changes ◽

Receptor Gene ◽

Taste Receptor ◽

Sensory Information ◽

Sweet Taste ◽

Taste Perception ◽

Sweet Taste Receptor ◽

Dietary Preferences ◽

Starting Point

Taste perception is an essential function that provides valuable dietary and sensory information, which is crucial for the survival of animals. Studies of the evolution of the sweet taste receptor gene (TAS1R2) are scarce, especially for Bornean endemic primates such as Nasalis larvatus (proboscis monkey), Pongo pygmaeus (Bornean orangutan), and Hylobates muelleri (Muller’s Bornean gibbon). Primates are well suited to such a study because their diets are diverse, comprising specialist folivores, frugivores, gummivores, herbivores, and omnivores. We constructed phylogenetic trees of the TAS1R2 gene for 20 species of anthropoid primates using four different methods (neighbor-joining, maximum parsimony, maximum-likelihood, and Bayesian) and also estimated the divergence times of the phylogeny. The phylogeny successfully separated the primates into their taxonomic groups as well as by their dietary preferences. Of note, the revised divergence time estimates for the primate speciation pattern in this study were more recent than the previously published estimates. It is believed that this difference may be due to environmental changes, such as food scarcity and climate change, during the late Miocene epoch, which forced primates to change their dietary preferences. These findings provide a starting point for further investigation.