supertree methods Latest Research Papers

Cladistic hypotheses as degree of equivalence relational structures: implications for three-item statements

10.1101/2021.01.14.426769 ◽ 2021

Author(s): Valentin Rineau ◽ Stéphane Prin

Keyword(s): Phylogenetic Trees ◽ Phylogenetic Reconstruction ◽ Equivalence Relations ◽ Strong Connection ◽ Relational Structures ◽ Degree Of Equivalence ◽ Axiomatic Definition ◽ Finite Set ◽ Definition Of ◽ Supertree Methods

Abstract: Three-item statements, as minimal informative rooted binary phylogenetic trees on three items, are the minimal units of cladistic information. Their importance for phylogenetic reconstruction, consensus and supertree methods rests on both (i) the fact that any cladistic tree can always be decomposed into a set of three-item statements, and (ii) the possibility, at least under some conditions, of building a new cladistic tree by combining all or part of the three-item statements deduced from several prior cladistic trees. To formalise such procedures, several k-adic rules of inference, i.e., rules that allow us to deduce at least one new three-item statement from exactly k other ones, have been identified. However, no axiomatic background has been proposed, and it remains unknown whether a particular k-adic rule of inference can be reduced to more basic rules. To solve this problem, we propose here to define three-item statements in terms of degree of equivalence relations. Given both the axiomatic definition of the latter and their strong connection to hierarchical classifications, we establish a list of the most basic properties of three-item statements. With such an approach, we show that it is possible to combine five three-item statements using basic rules even though they cannot be combined using dyadic rules alone. This result suggests that all higher k-adic rules are reducible to a finite set of simpler rules.
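
Point (i) of the abstract, that a cladistic tree can be decomposed into three-item statements, can be illustrated with a minimal Python sketch. It is not the authors' formalism: the nested-tuple tree encoding, the function names, and the `(a, b, c)` representation (read as "a and b are more closely related to each other than either is to c") are assumptions made for illustration only.

```python
from itertools import combinations


def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def clades(tree):
    """Yield the leaf set of every internal node, root included."""
    if isinstance(tree, str):
        return
    yield frozenset(leaves(tree))
    for child in tree:
        yield from clades(child)


def three_item_statements(tree):
    """Return all statements (a, b, c): some clade contains a and b but not c."""
    all_leaves = leaves(tree)
    statements = set()
    for clade in clades(tree):
        outside = all_leaves - clade
        # Every pair inside the clade is grouped against every leaf outside it.
        for a, b in combinations(sorted(clade), 2):
            for c in outside:
                statements.add((a, b, c))
    return statements


if __name__ == "__main__":
    tree = ((("A", "B"), "C"), "D")
    for statement in sorted(three_item_statements(tree)):
        print(statement)
```

In this sketch a fully resolved tree on four leaves yields one statement per three-leaf subset, while an unresolved tree such as `("A", "B", "C")` yields none.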

Information Content of Trees: Three-taxon Statements Inference Rules and Dependency

10.1101/2020.06.08.141515 ◽ 2020 ◽ Cited By ~ 1

Author(s): Valentin Rineau ◽ René Zaragüeta ◽ Jérémie Bardin

Keyword(s): Information Content ◽ Evolutionary Biology ◽ Rooted Trees ◽ Consensus Methods ◽ Distance Metrics ◽ Phylogenetic Studies ◽ Efficient Measure ◽ Tree Methods ◽ Cladistic Biogeography ◽ Supertree Methods

Abstract: The three-taxon statement (also called triplet) is the fundamental unit of rooted trees in phylogenetic systematics. Various supertree and phylogenetic methods use three-taxon statements, which are minimal rooted statements of degree-of-kinship relationships. Because of their fundamental role in phylogenetics, three-taxon statements appear in methodological research across evolutionary biology, including consensus methods, supertree methods, species-tree methods, distance metrics, phylogenetics, and cladistic biogeography. Three-taxon statements are thus widely used. However, their theoretical properties have been poorly investigated. As a result, three-taxon statement methods are subject to important flaws related to information redundancy. Correcting these biases is essential to improve the efficiency of methods using three-taxon statements. Our aim is to study the behavior of three-taxon statements and the interactions among them in order to enhance their performance in phylogenetic studies. We have identified new types of very specific interactions between three-taxon statements that are responsible for the emergence of redundancy and dependency in trees. We propose for the first time a classification of three-taxon statement interactions and trace the link between those and the emergence of dependency and redundancy. A new fractional weighting procedure for suppressing redundancy of three-taxon statements is proposed. Our method is subsequently tested empirically in the supertree framework using simulations. We show that three-taxon statements using fractional weights perform drastically better than classical supertree methods such as MRP or methods using unweighted three-taxon statements. Our study shows that appropriate fractional weighting of three-taxon statements is an efficient measure of phylogenetic information content for rooted trees. Fractional weighting is of critical importance for removing redundancy in any method using three-taxon statements, as in consensus, supertrees, distance metrics, and phylogenetic or biogeographic analyses.

A Phylogenomic Supertree of Birds

Diversity ◽ 10.3390/d11070109 ◽ 2019 ◽ Vol 11 (7) ◽ pp. 109 ◽ Cited By ~ 17

Author(s): Rebecca T. Kimball ◽ Carl H. Oliveros ◽ Ning Wang ◽ Noor D. White ◽ F. Keith Barker ◽ ...

Keyword(s): Large Scale ◽ Sequence Data ◽ Bird Species ◽ Divide And Conquer ◽ Clear Understanding ◽ Whole Genome ◽ Efficient Manner ◽ Sequence Capture ◽ Branch Lengths ◽ Supertree Methods

It has long been appreciated that analyses of genomic data (e.g., whole genome sequencing or sequence capture) have the potential to reveal the tree of life, but it remains challenging to move from sequence data to a clear understanding of evolutionary history, in part due to the computational challenges of phylogenetic estimation using genome-scale data. Supertree methods solve that challenge because they facilitate a divide-and-conquer approach for large-scale phylogeny inference by integrating smaller subtrees in a computationally efficient manner. Here, we combined information from sequence capture and whole-genome phylogenies using supertree methods. However, the available phylogenomic trees had limited overlap so we used taxon-rich (but not phylogenomic) megaphylogenies to weave them together. This allowed us to construct a phylogenomic supertree, with support values, that included 707 bird species (~7% of avian species diversity). We estimated branch lengths using mitochondrial sequence data and we used these branch lengths to estimate divergence times. Our time-calibrated supertree supports radiation of all three major avian clades (Palaeognathae, Galloanseres, and Neoaves) near the Cretaceous-Paleogene (K-Pg) boundary. The approach we used will permit the continued addition of taxa to this supertree as new phylogenomic data are published, and it could be applied to other taxa as well.

BCD Beam Search: considering suboptimal partial solutions in Bad Clade Deletion supertrees

PeerJ ◽ 10.7717/peerj.4987 ◽ 2018 ◽ Vol 6 ◽ pp. e4987

Author(s): Markus Fleischauer ◽ Sebastian Böcker

Keyword(s): Matrix Representation ◽ Search Algorithm ◽ Simulated Data ◽ Biological Data ◽ Beam Search ◽ Worst Case ◽ Running Time ◽ Minimum Cuts ◽ Supertree Methods

Supertree methods enable the reconstruction of large phylogenies. The supertree problem can be formalized in different ways in order to cope with contradictory information in the input. Some supertree methods are based on encoding the input trees in a matrix; other methods try to find minimum cuts in some graph. Recently, we introduced Bad Clade Deletion (BCD) supertrees, which combine the graph-based computation of minimum cuts with optimizing a global objective function on the matrix representation of the input trees. The BCD supertree method has guaranteed polynomial running time and is very swift in practice. The quality of reconstructed supertrees was superior to matrix representation with parsimony (MRP) and usually on par with SuperFine for simulated data; but particularly for biological data, the quality of BCD supertrees could not keep up with SuperFine supertrees. Here, we present a beam search extension for the BCD algorithm that keeps alive a constant number of partial solutions in each top-down iteration phase. The guaranteed worst-case running time of the new algorithm is still polynomial in the size of the input. We present an exact and a randomized subroutine to generate suboptimal partial solutions. Both beam search approaches consistently improve supertree quality on all evaluated datasets when keeping 25 suboptimal solutions alive. Supertree quality of the BCD Beam Search algorithm is on par with MRP and SuperFine even for biological data. This is the best performance of a polynomial-time supertree algorithm reported so far.
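
The idea of keeping a constant number of partial solutions alive per iteration can be sketched with a generic beam search in Python. This is not the BCD Beam Search implementation: the function `beam_search`, its parameters, and the toy cost table `step_costs` are hypothetical and only show why a wider beam can recover a solution that a purely greedy search (beam width 1) discards. The sketch assumes every incomplete state has at least one successor.

```python
import heapq


def beam_search(start, expand, score, is_complete, beam_width):
    """Keep the `beam_width` lowest-scoring partial solutions in each round."""
    beam = [start]
    while not all(is_complete(state) for state in beam):
        candidates = []
        for state in beam:
            if is_complete(state):
                candidates.append(state)          # finished states stay in play
            else:
                candidates.extend(expand(state))  # grow unfinished states
        beam = heapq.nsmallest(beam_width, candidates, key=score)
    return min(beam, key=score)


# Toy problem (hypothetical): the cheap first step "a" leads to an expensive finish.
step_costs = {
    (): {"a": 1, "b": 2},
    ("a",): {"x": 10},
    ("b",): {"x": 1},
}


def expand(state):
    path, cost = state
    return [(path + (label,), cost + c) for label, c in step_costs[path].items()]


def score(state):
    return state[1]


def is_complete(state):
    return len(state[0]) == 2


if __name__ == "__main__":
    print(beam_search(((), 0), expand, score, is_complete, beam_width=1))
    print(beam_search(((), 0), expand, score, is_complete, beam_width=2))
```

With a beam width of 1 the search commits to the locally cheapest first step; with a width of 2 the initially worse partial solution survives and wins overall.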

Phylogenomic Reconstruction of the Oomycete Phylogeny Derived from 37 Genomes

mSphere ◽ 10.1128/msphere.00095-17 ◽ 2017 ◽ Vol 2 (2) ◽ Cited By ~ 32

Author(s): Charley G. P. McCarthy ◽ David A. Fitzpatrick

Keyword(s): Large Scale ◽ Plant Pathogens ◽ Single Gene ◽ Genomic Data ◽ Gene Families ◽ Phylogenomic Analysis ◽ Phylogenetic Studies ◽ A Genome ◽ Supertree Methods ◽ Genome Scale

ABSTRACT The oomycetes are a class of microscopic, filamentous eukaryotes within the Stramenopiles-Alveolata-Rhizaria (SAR) supergroup which includes ecologically significant animal and plant pathogens, most infamously the causative agent of potato blight Phytophthora infestans. Single-gene and concatenated phylogenetic studies both of individual oomycete genera and of members of the larger class have resulted in conflicting conclusions concerning species phylogenies within the oomycetes, particularly for the large Phytophthora genus. Genome-scale phylogenetic studies have successfully resolved many eukaryotic relationships by using supertree methods, which combine large numbers of potentially disparate trees to determine evolutionary relationships that cannot be inferred from individual phylogenies alone. With a sufficient amount of genomic data now available, we have undertaken the first whole-genome phylogenetic analysis of the oomycetes using data from 37 oomycete species and 6 SAR species. In our analysis, we used established supertree methods to generate phylogenies from 8,355 homologous oomycete and SAR gene families and have complemented those analyses with both phylogenomic network and concatenated supermatrix analyses. Our results show that a genome-scale approach to oomycete phylogeny resolves oomycete classes and individual clades within the problematic Phytophthora genus. Support for the resolution of the inferred relationships between individual Phytophthora clades varies depending on the methodology used. Our analysis represents an important first step in large-scale phylogenomic analysis of the oomycetes.

IMPORTANCE The oomycetes are a class of eukaryotes and include ecologically significant animal and plant pathogens. Single-gene and multigene phylogenetic studies of individual oomycete genera and of members of the larger classes have resulted in conflicting conclusions concerning interspecies relationships among these species, particularly for the Phytophthora genus. The onset of next-generation sequencing techniques now means that a wealth of oomycete genomic data is available. For the first time, we have used genome-scale phylogenetic methods to resolve oomycete phylogenetic relationships. We used supertree methods to generate single-gene and multigene species phylogenies. Overall, our supertree analyses utilized phylogenetic data from 8,355 oomycete gene families. We have also complemented our analyses with superalignment phylogenies derived from 131 single-copy ubiquitous gene families. Our results show that a genome-scale approach to oomycete phylogeny resolves oomycete classes and clades. Our analysis represents an important first step in large-scale phylogenomic analysis of the oomycetes.

Collecting reliable clades using the Greedy Strict Consensus Merger

PeerJ ◽ 10.7717/peerj.2172 ◽ 2016 ◽ Vol 4 ◽ pp. e2172 ◽ Cited By ~ 5

Author(s): Markus Fleischauer ◽ Sebastian Böcker

Keyword(s): Computational Complexity ◽ Phylogenetic Trees ◽ Optimization Problems ◽ Matrix Representation ◽ Phylogenetic Inference ◽ Scoring Functions ◽ True Positive ◽ Worst Case ◽ Inference Methods ◽ Supertree Methods

Supertree methods combine a set of phylogenetic trees into a single supertree. Similar to supermatrix methods, these methods provide a way to reconstruct larger parts of the Tree of Life, potentially evading the computational complexity of phylogenetic inference methods such as maximum likelihood. The supertree problem can be formalized in different ways to cope with contradictory information in the input. Many supertree methods have been developed. Some of them solve NP-hard optimization problems, like the well-known Matrix Representation with Parsimony, while others have polynomial worst-case running time but work in a greedy fashion (FlipCut). Both can profit from a set of clades that are already known to be part of the supertree. The SuperFine approach shows how the Greedy Strict Consensus Merger (GSCM) can be used as preprocessing to find these clades. We introduce different scoring functions for the GSCM, a randomization, and a combination thereof to improve the GSCM so that it finds more clades. This helps, in turn, to improve the resolution of the GSCM supertree. We find that these modifications increase the number of true positive clades by 18% compared to the currently used Overlap scoring.
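
The matrix encoding behind Matrix Representation with Parsimony can be sketched as follows. This assumes the standard coding in which each non-root clade of each input tree becomes one column, with `1` for taxa inside the clade, `0` for taxa in that tree but outside the clade, and `?` for taxa missing from that tree. The nested-tuple tree encoding and the names `mrp_matrix` and `proper_clades` are illustrative assumptions, not code from the paper, and the NP-hard parsimony search over the resulting matrix is not shown.

```python
def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def proper_clades(tree, is_root=True):
    """Yield the leaf set of every internal node except the root, in preorder."""
    if isinstance(tree, str):
        return
    if not is_root:
        yield leaves(tree)
    for child in tree:
        yield from proper_clades(child, is_root=False)


def mrp_matrix(trees):
    """Encode the input trees as one row of '1', '0', '?' characters per taxon."""
    taxa = sorted(set().union(*(leaves(tree) for tree in trees)))
    rows = {taxon: [] for taxon in taxa}
    for tree in trees:
        tree_leaves = leaves(tree)
        for clade in proper_clades(tree):
            for taxon in taxa:
                if taxon in clade:
                    rows[taxon].append("1")   # inside the clade
                elif taxon in tree_leaves:
                    rows[taxon].append("0")   # in this tree, outside the clade
                else:
                    rows[taxon].append("?")   # taxon absent from this tree
    return {taxon: "".join(chars) for taxon, chars in rows.items()}


if __name__ == "__main__":
    input_trees = [(("A", "B"), "C"), (("B", "D"), "C")]
    for taxon, row in mrp_matrix(input_trees).items():
        print(taxon, row)
```

The `?` entries are what allow input trees with only partially overlapping taxon sets to share one matrix.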

Supertree Methods, Phylogenetic

Encyclopedia of Evolutionary Biology ◽ 10.1016/b978-0-12-800049-6.00222-5 ◽ 2016 ◽ pp. 250-255

Author(s): J.G. Burleigh

Keyword(s): Supertree Methods

Collecting reliable clades using the Greedy Strict Consensus Merger

10.7287/peerj.preprints.1297 ◽ 2015

Author(s): Markus Fleischauer ◽ Sebastian Böcker

Keyword(s): Computational Complexity ◽ Phylogenetic Trees ◽ Optimization Problems ◽ Matrix Representation ◽ Phylogenetic Inference ◽ Scoring Functions ◽ True Positive ◽ Worst Case ◽ Inference Methods ◽ Supertree Methods

Supertree methods combine a set of phylogenetic trees into a single supertree. Similar to supermatrix methods, these methods provide a way to reconstruct larger parts of the Tree of Life, potentially evading the computational complexity of phylogenetic inference methods such as maximum likelihood. The supertree problem can be formalized in different ways to cope with contradictory information in the input. Many supertree methods have been developed. Some of them solve NP-hard optimization problems, like the well-known Matrix Representation with Parsimony, while others have polynomial worst-case running time but work in a greedy fashion (FlipCut). Both can profit from a set of clades that are already known to be part of the supertree. The SuperFine approach shows how the Greedy Strict Consensus Merger (GSCM) can be used as preprocessing to find these clades. We introduce different scoring functions for the GSCM, a randomization, and a combination thereof to improve the GSCM so that it finds more clades. This helps, in turn, to improve the resolution of the final supertree. We find that these modifications increase the number of true positive clades by 16% while decreasing the number of false positive clades by 3% compared to the currently used Overlap scoring.

Collecting reliable clades using the Greedy Strict Consensus Merger

10.7287/peerj.preprints.1297v3 ◽ 2015

Author(s): Markus Fleischauer ◽ Sebastian Böcker

Keyword(s): Computational Complexity ◽ Phylogenetic Trees ◽ Optimization Problems ◽ Matrix Representation ◽ Phylogenetic Inference ◽ Scoring Functions ◽ True Positive ◽ Worst Case ◽ Inference Methods ◽ Supertree Methods

Supertree methods combine a set of phylogenetic trees into a single supertree. Similar to supermatrix methods, these methods provide a way to reconstruct larger parts of the Tree of Life, potentially evading the computational complexity of phylogenetic inference methods such as maximum likelihood. The supertree problem can be formalized in different ways to cope with contradictory information in the input. Many supertree methods have been developed. Some of them solve NP-hard optimization problems, like the well-known Matrix Representation with Parsimony, while others have polynomial worst-case running time but work in a greedy fashion (FlipCut). Both can profit from a set of clades that are already known to be part of the supertree. The SuperFine approach shows how the Greedy Strict Consensus Merger (GSCM) can be used as preprocessing to find these clades. We introduce different scoring functions for the GSCM, a randomization, and a combination thereof to improve the GSCM so that it finds more clades. This helps, in turn, to improve the resolution of the final supertree. We find that these modifications increase the number of true positive clades by 16% while decreasing the number of false positive clades by 3% compared to the currently used Overlap scoring.

Collecting reliable clades using the Greedy Strict Consensus Merger

10.7287/peerj.preprints.1297v2 ◽ 2015

Author(s): Markus Fleischauer ◽ Sebastian Böcker

Keyword(s): Computational Complexity ◽ Phylogenetic Trees ◽ Optimization Problems ◽ Matrix Representation ◽ Phylogenetic Inference ◽ Scoring Functions ◽ True Positive ◽ Worst Case ◽ Inference Methods ◽ Supertree Methods

Supertree methods combine a set of phylogenetic trees into a single supertree. Similar to supermatrix methods, these methods provide a way to reconstruct larger parts of the Tree of Life, potentially evading the computational complexity of phylogenetic inference methods such as maximum likelihood. The supertree problem can be formalized in different ways to cope with contradictory information in the input. Many supertree methods have been developed. Some of them solve NP-hard optimization problems, like the well-known Matrix Representation with Parsimony, while others have polynomial worst-case running time but work in a greedy fashion (FlipCut). Both can profit from a set of clades that are already known to be part of the supertree. The SuperFine approach shows how the Greedy Strict Consensus Merger (GSCM) can be used as preprocessing to find these clades. We introduce different scoring functions for the GSCM, a randomization, and a combination thereof to improve the GSCM so that it finds more clades. This helps, in turn, to improve the resolution of the final supertree. We find that these modifications increase the number of true positive clades by 16% while decreasing the number of false positive clades by 3% compared to the currently used Overlap scoring.
