Anchored Phylogenomics of Angiosperms I: Assessing the Robustness of Phylogenetic Estimates

2016 ◽  
Author(s):  
Chris Buddenhagen ◽  
Alan R. Lemmon ◽  
Emily Moriarty Lemmon ◽  
Jeremy Bruhl ◽  
Jennifer Cappa ◽  
...  

ABSTRACT An important goal of the angiosperm systematics community has been to develop a shared approach to molecular data collection, such that phylogenomic data sets from different focal clades can be combined for meta-studies across the entire group. Although significant progress has been made through efforts such as DNA barcoding, transcriptome sequencing, and whole-plastid sequencing, the community currently lacks a cost-efficient methodology for collecting nuclear phylogenomic data across all angiosperms. Here, we leverage genomic resources from 43 angiosperm species to develop enrichment probes useful for collecting ~500 loci from non-model taxa across the diversity of angiosperms. By taking an anchored phylogenomics approach, in which probes are designed to represent sequence diversity across the group, we are able to efficiently target loci with sufficient phylogenetic signal to resolve deep, intermediate, and shallow angiosperm relationships. After demonstrating the utility of this resource, we present a method that generates a heat map for each node on a phylogeny that reveals the sensitivity of support for the node across analysis conditions, as well as different locus, site, and taxon schemes. Focusing on the effect of locus and site sampling, we use this approach to statistically evaluate relative support for the alternative relationships among eudicots, monocots, and magnoliids. Although the results from supermatrix and coalescent analyses are largely consistent across the tree, we find support for this deep relationship to be more sensitive to the particular choice of sites and loci when a supermatrix approach is employed. Averaged across analysis approaches and data subsampling schemes, our data support a eudicot-monocot sister relationship, which is supported by a number of recent angiosperm studies.
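The per-node heat map described in this abstract can be pictured with a minimal sketch. It assumes that support for one node has already been estimated under each combination of locus scheme and site scheme. The scheme names, the support values, and the functions `support_grid` and `render` are all hypothetical illustrations; they are neither the authors' method nor results from the study.

```python
def support_grid(support, locus_schemes, site_schemes):
    """Arrange per-condition support values into a locus-by-site grid."""
    return [[support[(l, s)] for s in site_schemes] for l in locus_schemes]

def render(grid, shades=" .:*#"):
    """Map support values in [0, 1] to characters: a crude text heat map."""
    top = len(shades) - 1
    return ["".join(shades[round(v * top)] for v in row) for row in grid]

# Hypothetical subsampling schemes and support values for a single node.
locus_schemes = ["all", "slow", "fast"]
site_schemes = ["all", "codon12", "codon3"]
support = {
    ("all", "all"): 1.0, ("all", "codon12"): 0.75, ("all", "codon3"): 0.5,
    ("slow", "all"): 1.0, ("slow", "codon12"): 1.0, ("slow", "codon3"): 0.75,
    ("fast", "all"): 0.5, ("fast", "codon12"): 0.25, ("fast", "codon3"): 0.0,
}

grid = support_grid(support, locus_schemes, site_schemes)
for name, row in zip(locus_schemes, render(grid)):
    print(f"{name:>5} |{row}|")
```

Each row is a locus scheme and each column a site scheme; darker characters mark conditions under which the node is better supported, so a node whose row or column fades out is sensitive to that sampling choice.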

2017 ◽  
Author(s):  
Ross Mounce

In this thesis I attempt to gather together a wide range of cladistic analyses of fossil and extant taxa representing a diverse array of phylogenetic groups. I use this data to quantitatively compare the effect of fossil taxa relative to extant taxa in terms of support for relationships, number of most parsimonious trees (MPTs) and leaf stability. In line with previous studies I find that the effects of fossil taxa are seldom different to extant taxa – although I highlight some interesting exceptions. I also use this data to compare the phylogenetic signal within vertebrate morphological data sets, by choosing to compare cranial data to postcranial data. Comparisons between molecular data and morphological data have been previously well explored, as have signals between different molecular loci. But comparative signal within morphological data sets is much less commonly characterized and certainly not across a wide array of clades. With this analysis I show that there are many studies in which the evidence provided by cranial data appears to be significantly incongruent with the postcranial data – more than one would expect to see just by the effect of chance and noise alone. I devise and implement a modification to a rarely used measure of homoplasy that will hopefully encourage its wider usage. Previously it had some undesirable bias associated with the distribution of missing data in a dataset, but my modification controls for this. I also undertake an in-depth and extensive review of the ILD test, noting it is often misused or reported poorly, even in recent studies. Finally, in attempting to collect data and metadata on a large scale, I uncovered inefficiencies in the research publication system that obstruct re-use of data and scientific progress. I highlight the importance of replication and reproducibility – even simple reanalysis of high profile papers can turn up some very different results. Data is highly valuable and thus it must be retained and made available for further re-use to maximize the overall return on research investment.


Zootaxa ◽  
2011 ◽  
Vol 2946 (1) ◽  
pp. 45 ◽  
Author(s):  
ROBERT H. CRUICKSHANK

Mooi & Gill (2010) have made a number of criticisms of statistical approaches to the phylogenetic analysis of molecular data as it is currently practiced. There are many different uses for molecular phylogenies, and for most of them statistical methods are entirely appropriate, but for taxonomic purposes the way that these methods have been used is questionable. In these cases it is necessary to introduce an extra step into the analysis – exploration of character conflict. Existing methods for exploring character conflict in molecular data such as spectral analysis, phylogenetic networks, likelihood mapping and sliding window analyses are briefly reviewed, but there is also a need for development of new tools to facilitate the analysis of large data sets. Incorporation of previous phylogenies as priors in Bayesian analyses could help to provide taxonomic stability, while still leaving room for new data to alter these conclusions if they contain sufficiently strong phylogenetic signal. Molecular phylogeneticists should make a clearer distinction between the different uses to which their phylogenies are put; methods suitable in one context may not be appropriate in others.


2020 ◽  
Vol 69 (4) ◽  
pp. 613-622 ◽  
Author(s):  
Rong Zhang ◽  
Yin-Huan Wang ◽  
Jian-Jun Jin ◽  
Gregory W Stull ◽  
Anne Bruneau ◽  
...  

Abstract Phylogenomic analyses have helped resolve many recalcitrant relationships in the angiosperm tree of life, yet phylogenetic resolution of the backbone of the Leguminosae, one of the largest and most economically and ecologically important families, remains poor due to generally limited molecular data and incomplete taxon sampling of previous studies. Here, we resolve many of the Leguminosae’s thorniest nodes through comprehensive analysis of plastome-scale data using multiple modified coding and noncoding data sets of 187 species representing almost all major clades of the family. Additionally, we thoroughly characterize conflicting phylogenomic signal across the plastome in light of the family’s complex history of plastome evolution. Most analyses produced largely congruent topologies with strong statistical support and provided strong support for resolution of some long-controversial deep relationships among the early diverging lineages of the subfamilies Caesalpinioideae and Papilionoideae. The robust phylogenetic backbone reconstructed in this study establishes a framework for future studies on legume classification, evolution, and diversification. However, conflicting phylogenetic signal was detected and quantified at several key nodes that prevent the confident resolution of these nodes using plastome data alone. [Leguminosae; maximum likelihood; phylogenetic conflict; plastome; recalcitrant relationships; stochasticity; systematic error.]


Zootaxa ◽  
2012 ◽  
Vol 3390 (1) ◽  
pp. 1 ◽  
Author(s):  
GEOFFREY M. KAY ◽  
J. SCOTT KEOGH

Ctenotus is the largest and most diverse genus of skinks in Australia with at least 97 described species. We generated large mitochondrial and nuclear DNA data sets for 70 individuals representing all available species in the C. labillardieri species-group to produce the first comprehensive phylogeny for this clade. The widespread C. labillardieri was sampled extensively to provide the first detailed phylogeographic data set for a reptile in the southwestern Australian biodiversity hotspot. We supplemented our molecular data with a comprehensive morphological dataset for the entire group, and together these data are used to revise the group and describe a new species. The morphologically highly variable species C. labillardieri comprises seven well-supported genetic clades that each occupy distinct geographic regions. The phylogeographic patterns observed in this taxon are consistent with studies of frogs, plants and invertebrates, adding strength to emerging biogeographic hypotheses in this iconic region. The species C. catenifer, C. youngsoni, and C. gemmula are well supported, and despite limited sampling both C. catenifer and C. gemmula show substantial genetic structure. The threatened C. lancelini from Lancelin Island and the adjacent mainland is the sister taxon to a new species from the Swan Coastal Plain, which we describe as C. ora sp. nov. This species is a habitat specialist, occurring primarily in sandy regions south of Perth that currently are under intense development. Ctenotus ora sp. nov. should be considered for conservation attention immediately.


Genetics ◽  
1996 ◽  
Vol 144 (4) ◽  
pp. 1817-1833 ◽  
Author(s):  
Michel C Milinkovitch ◽  
Richard G LeDuc ◽  
Jun Adachi ◽  
Frederic Farnir ◽  
Michel Georges ◽  
...  

Different phylogenetic analyses of the same genetic data set can yield conflicting results, depending on the choice of parameter settings and included taxa. This is particularly true in studies involving data sets where levels of homoplasy are high and likely to obscure the phylogenetic signal. Filtering of this phylogenetic noise can be attempted, with varying degrees of success, by using different weighting schemes and ingroup/outgroup choices, but it can be difficult to decide objectively which approach is best. Using a cytochrome b data set from cetaceans and artiodactyls, we examined the effects of a suite of parameter settings on the outcome of phylogenetic analyses. We tested 2968 combinations among the seven parameters that most often vary among phylogenetic studies. It is our contention that this sensitivity analysis identifies portions of the multidimensional parameter space where phylogenetic signal is most reliably recovered, and simple rules are given to guide the choice of settings. Portions of this data set have been used in previous studies with conflicting results, namely the monophyly vs. paraphyly of one of the two major recognized cetacean suborders, the toothed whales. This analysis strongly supports the sister relationship between sperm whales and baleen whales.
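A sensitivity analysis of this kind amounts to running the same analysis over a grid of parameter combinations and recording where a relationship of interest is recovered. The sketch below is only an illustration under stated assumptions: the three parameter names, their values, and the stand-in function `toy_recovers_clade` are hypothetical, and a real study would replace the stand-in with an actual phylogenetic analysis. It does not reproduce the seven parameters or the 2968 combinations reported in the abstract.

```python
from itertools import product

def sensitivity(grid, recovers_clade):
    """Run every combination in the grid; return recovery rate per setting."""
    names = list(grid)
    hits = {(n, v): 0 for n in names for v in grid[n]}
    runs = {(n, v): 0 for n in names for v in grid[n]}
    for combo in product(*(grid[n] for n in names)):
        settings = dict(zip(names, combo))
        ok = recovers_clade(settings)  # True if the clade is recovered
        for item in settings.items():
            runs[item] += 1
            hits[item] += ok
    return {key: hits[key] / runs[key] for key in runs}

# Hypothetical parameter grid (3 x 2 x 2 = 12 combinations).
grid = {
    "ti_tv_weight": [1, 2, 5],
    "codon3": ["include", "exclude"],
    "outgroup": ["A", "B"],
}

def toy_recovers_clade(settings):
    """Stand-in for a real analysis; an arbitrary rule for illustration."""
    return settings["ti_tv_weight"] >= 2 and settings["codon3"] == "exclude"

rates = sensitivity(grid, toy_recovers_clade)
for (name, value), rate in sorted(rates.items(), key=str):
    print(f"{name}={value}: recovered in {rate:.0%} of runs")
```

Settings whose recovery rate is far from the others mark the regions of parameter space to which the result is most sensitive.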


Author(s):  
Abou_el_ela Abdou Hussein

Advances in web technologies have led to tremendous growth in the volume of data generated daily. This mountain of large, dispersed data sets gives rise to the phenomenon called big data: a collection of massive, heterogeneous, unstructured, and complex data sets. The big data life cycle can be described as collecting (capturing), storing, distributing, manipulating, interpreting, analyzing, investigating, and visualizing big data. Traditional techniques such as relational database management systems (RDBMS) cannot handle big data because of their inherent limitations, so advances in computing architecture are required to meet both the storage requirements and the heavy processing needed to analyze huge volumes and varieties of data economically. Many technologies handle big data; one of them is Hadoop. Hadoop can be understood as an open-source distributed data processing framework and is one of the most prominent and well-known solutions to the problem of handling big data. Apache Hadoop is based on the Google File System and the MapReduce programming paradigm. In this paper we survey big data characteristics, starting from the first three V's, which researchers have extended over time to more than fifty-six V's, and compare researchers' accounts to reach the best representation and a precise clarification of all the big data V's. We highlight the challenges that face big data processing and how to overcome them using Hadoop, and its use in processing big data sets as a solution to various problems in a distributed, cloud-based environment. This paper mainly focuses on different components of Hadoop, such as Hive, Pig, and HBase. We also provide a full description of Hadoop's pros and cons, and of improvements that address Hadoop's problems by adopting a proposed cost-efficient scheduler algorithm for heterogeneous Hadoop systems.
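The MapReduce paradigm mentioned in this abstract can be illustrated with a single-process word count. This is a minimal sketch in plain Python, not Hadoop's API: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are chosen here for illustration, and a real Hadoop job would distribute each phase across the nodes of a cluster.

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return key, sum(values)

def word_count(records):
    """Chain the three phases over an iterable of text records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())

counts = word_count(["big data", "Big Data sets"])
print(counts)
```

Because the map step treats each record independently and the reduce step treats each key independently, both phases can in principle run in parallel on different machines, which is the property a distributed framework exploits.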


2017 ◽  
pp. 99
Author(s):  
Pamela S. Soltis ◽  
Douglas E. Soltis

Technological advances in molecular biology have greatly increased the speed and efficiency of DNA sequencing, making it possible to construct large molecular data sets for phylogeny reconstruction relatively quickly. Despite their potential for improving our understanding of phylogeny, these large data sets also provide many challenges. In this paper, we discuss several of these challenges, including 1) the failure of a search to find the most parsimonious trees (the local optimum) in a reasonable amount of time, 2) the difference between a local optimum and the global optimum, and 3) the existence of multiple classes (islands) of most parsimonious trees. We also discuss possible strategies to improve the likelihood of finding the most parsimonious tree(s) and present two examples from our work on angiosperm phylogeny. We conclude with a discussion of two alternatives to analyses of entire large data sets, the exemplar approach and compartmentalization, and suggest that additional consideration must be given to issues of data analysis for large data sets, whether morphological or molecular.
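The distinction between local and global optima, and the idea of separate islands of equally short trees, can be shown on a toy landscape. The sketch below is a loose analogy only: it assumes a one-dimensional list of hypothetical "tree lengths" in which each position has at most two neighbours, whereas a real search moves through tree space by rearranging trees. The functions `hill_climb` and `search` are illustrative and do not represent the authors' strategies.

```python
def hill_climb(scores, start):
    """Step to the shorter neighbour until no neighbour is shorter."""
    i = start
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(scores)]
        best = min(neighbours, key=lambda j: scores[j])
        if scores[best] >= scores[i]:
            return i  # no neighbour improves: an optimum, possibly only local
        i = best

def search(scores, starts):
    """Climb from several starting points; keep the shortest optima found."""
    ends = {hill_climb(scores, s) for s in starts}
    shortest = min(scores[e] for e in ends)
    return sorted(e for e in ends if scores[e] == shortest), shortest

# Hypothetical tree lengths along a toy landscape (lower is better).
scores = [9, 7, 8, 6, 5, 6, 8, 5, 7]

print(hill_climb(scores, 0))       # a single start stops at a local optimum
print(search(scores, [0, 3, 8]))   # several starts reach two equal optima
```

A single start from position 0 stalls at length 7, while multiple starts find two separated positions of length 5, which mirrors why replicate searches from different starting points improve the chance of finding all of the shortest trees.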


2009 ◽  
Vol 34 (3) ◽  
pp. 580-594 ◽  
Author(s):  
Anthony R. Magee ◽  
Ben-Erik van Wyk ◽  
Patricia M. Tilney ◽  
Stephen R. Downie

Generic circumscriptions and phylogenetic relationships of the Cape genera Capnophyllum, Dasispermum, and Sonderina are explored through parsimony and Bayesian inference analyses of nrDNA ITS and cpDNA rps16 intron sequences, morphology, and combined molecular and morphological data. The relationship of these genera with the North African genera Krubera and Stoibrax is also assessed. Analyses of both molecular data sets place Capnophyllum, Dasispermum, Sonderina, and the only southern African species of Stoibrax (S. capense) within the newly recognized Lefebvrea clade of tribe Tordylieae. Capnophyllum is strongly supported as monophyletic and is distantly related to Krubera. The monotypic genus Dasispermum and Stoibrax capense are embedded within a paraphyletic Sonderina. This complex is distantly related to the North African species of Stoibrax in tribe Apieae, in which the type species, Stoibrax dichotomum, occurs. Consequently, Dasispermum is expanded to include both Sonderina and Stoibrax capense. New combinations are formalized for Dasispermum capense, D. hispidum, D. humile, and D. tenue. An undescribed species from the Tanqua Karoo in South Africa is also closely related to Capnophyllum and the Dasispermum–Sonderina complex. The genus Scaraboides is described herein to accommodate the new species, S. manningii. This monotypic genus shares the dorsally compressed fruit and involute marginal wings with Capnophyllum, but is easily distinguished by its erect branching habit, green leaves, scabrous umbels, and fruit with indistinct median and lateral ribs, additional solitary vittae in each marginal wing, and parallel, closely spaced commissural vittae. Despite the marked fruit similarities with Capnophyllum, analyses of DNA sequence data place Scaraboides closer to the Dasispermum–Sonderina complex, with which it shares the erect habit, green (nonglaucous) leaves, and scabrous umbels.


Zootaxa ◽  
2004 ◽  
Vol 680 (1) ◽  
pp. 1 ◽  
Author(s):  
ARNE NYGREN

Autolytinae is revised based on available types, and newly collected specimens. Out of 170 nominal species, 18 are considered as incertae sedis, 43 are regarded as junior synonyms, and 25 are referred to as nomina dubia. The relationships of Autolytinae are assessed from 51 morphological characters and 211 states for 76 ingroup-taxa, and 460 molecular characters from mitochondrial 16S rDNA and nuclear 18S rDNA for 31 ingroup-taxa; outgroups include 12 non-autolytine syllid polychaetes. Two analyses are provided, one including morphological data only, and one with combined morphological and molecular data sets. The resulting strict consensus tree from the combined data is chosen for a reclassification. Three main clades are identified: Procerini trib. n., Autolytini Grube, 1850, and Epigamia gen. n. Proceraea Ehlers, 1864 and Myrianida Milne Edwards, 1845 are referred to as nomen protectum, while Scolopendra Slabber, 1781, Podonereis Blainville, 1818, Amytis Savigny, 1822, Polynice Savigny, 1822, and Nereisyllis Blainville, 1828 are considered

