Congruence between morphology-based species and Barcode Index Numbers (BINs) in Neotropical Eumaeini (Lycaenidae)

PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11843
Author(s):  
Carlos Prieto ◽  
Christophe Faynel ◽  
Robert Robbins ◽  
Axel Hausmann

Background With about 1,000 species in the Neotropics, the Eumaeini (Theclinae) are one of the most diverse butterfly tribes. Correct morphology-based identifications are challenging in many genera because interspecific differences in wing patterns are relatively small. Geographic infraspecific variation is sometimes more substantial than variation between species. In this paper we present a large DNA barcode dataset of South American Lycaenidae. We analyze how well DNA barcode BINs match morphologically delimited species. Methods We compare morphology-based species identifications with the clustering of molecular operational taxonomic units (MOTUs) delimited by the RESL algorithm in BOLD, which assigns Barcode Index Numbers (BINs). We examine intra- and interspecific divergences for genera represented by at least four morphospecies. We discuss the existence of local barcode gaps in a genus-by-genus analysis. We also note differences in the percentage of species with barcode gaps in groups of lowland and high mountain genera. Results We identified 2,213 specimens and obtained 1,839 sequences of 512 species in 90 genera. Overall, the mean intraspecific divergence value of COI sequences was 1.20%, while the mean interspecific divergence between nearest congeneric neighbors was 4.89%, demonstrating the presence of a barcode gap. However, the gap seemed to disappear from the entire set when comparing the maximum intraspecific distance (8.40%) with the minimum interspecific distance (0.40%). Clear barcode gaps are present in many genera but absent in others. From the set of specimens that yielded COI fragment lengths of at least 650 bp, 75% of the a priori morphology-based identifications were unambiguously assigned to a single Barcode Index Number (BIN). After an a posteriori taxonomic review, however, the percentage of matched identifications rose to 85%. BIN splitting was observed for 17% of the species and BIN sharing for 9%.
We found that genera that contain primarily lowland species show higher percentages of local barcode gaps and congruence between BINs and morphology than genera that contain exclusively high montane species. The divergence values to the nearest neighbors were significantly lower in high Andean species, while the intraspecific divergence values were significantly lower in the lowland species. These results raise questions regarding the causes of the observed low inter- and high intraspecific genetic variation. We discuss incomplete lineage sorting and hybridization as the most likely causes of this phenomenon, as the montane species concerned are relatively young and hybridization is probable. The release of our data set represents an essential baseline for a reference library for biological assessment studies of butterflies in megadiverse countries using modern high-throughput technologies and highlights the necessity of taxonomic revisions for various genera combining both molecular and morphological data.
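The local barcode gap comparison described in this abstract (maximum intraspecific distance versus distance to the nearest neighbor of another species) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes pre-aligned sequences of equal length, uses uncorrected p-distances rather than a substitution model, and compares each species with all other species rather than with congeners only. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def local_barcode_gaps(records):
    """records: list of (species_label, aligned_sequence) tuples.

    Returns {species: (max_intraspecific, nearest_neighbor, has_gap)}.
    A species has a local barcode gap when the distance to its nearest
    neighbor of another species exceeds its maximum intraspecific distance.
    """
    max_intra = {sp: 0.0 for sp, _ in records}
    nearest = {sp: float("inf") for sp, _ in records}
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra[sp1], d)
        else:
            nearest[sp1] = min(nearest[sp1], d)
            nearest[sp2] = min(nearest[sp2], d)
    return {sp: (max_intra[sp], nearest[sp], nearest[sp] > max_intra[sp])
            for sp in max_intra}


# Hypothetical toy alignment: species A overlaps with B, species C is well separated.
toy = [
    ("A", "AAAAAAAAAA"),
    ("A", "AAAAAAAAAT"),
    ("B", "AAAAAAAACT"),
    ("C", "GGGGGAAAAA"),
    ("C", "GGGGGAAAAA"),
]
for species, (intra, nn, gap) in local_barcode_gaps(toy).items():
    print(species, intra, nn, gap)
```

A data set can show gaps for most species individually while the global comparison (largest intraspecific value against smallest nearest-neighbor value across all species) shows none, which is the pattern the abstract reports.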

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e4577 ◽  
Author(s):  
Nadine Havemann ◽  
Martin M. Gossner ◽  
Lars Hendrich ◽  
Jérôme Morinière ◽  
Rolf Niedringhaus ◽  
...  

With about 5,000 species worldwide, the Heteroptera or true bugs are the most diverse taxon among the hemimetabolous insects in aquatic and semi-aquatic ecosystems. Species may be found in almost every freshwater environment and have very specific habitat requirements, making them excellent bioindicator organisms for water quality. However, a correct determination by morphology is challenging in many species groups due to high morphological variability and polymorphisms within, but low variability between species. Furthermore, it is very difficult or even impossible to identify the immature life stages or females of some species, e.g., of the corixid genus Sigara. In this study we tested the effectiveness of a DNA barcode library to discriminate species of the Gerromorpha and Nepomorpha of Germany. We analyzed about 700 specimens of 67 species, with 63 species sampled in Germany, covering more than 90% of all recorded species. Our library included various morphologically similar taxa, e.g., species within the genera Sigara and Notonecta as well as water striders of the genus Gerris. Fifty-five species (82%) were unambiguously assigned to a single Barcode Index Number (BIN) by their barcode sequences, whereas BIN sharing was observed for 10 species. Furthermore, we found monophyletic lineages for 52 analyzed species. Our data revealed interspecific K2P distances below 2.2% for 18 species. Intraspecific distances above 2.2% were shown for 11 species. We found evidence for hybridization between various corixid species (Sigara, Callicorixa), but our molecular data also revealed exceptionally high intraspecific distances as a consequence of distinct mitochondrial lineages for Cymatia coleoptrata and the pygmy backswimmer Plea minutissima. Our study clearly demonstrates the usefulness of DNA barcodes for the identification of the aquatic Heteroptera of Germany and adjacent regions.
In this context, our data set represents an essential baseline for a reference library for bioassessment studies of freshwater habitats using modern high-throughput technologies in the near future. The existing data also open new questions regarding the causes of the observed low inter- and high intraspecific genetic variation and furthermore highlight the necessity of taxonomic revisions for various taxa, combining both molecular and morphological data.
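The K2P (Kimura two-parameter) distances cited in this abstract correct the observed proportion of differences for multiple substitutions, treating transitions and transversions separately: d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitional and transversional differences. The sketch below is a plain-Python illustration under stated assumptions, not the software used in the study: sequences are already aligned, and sites with gaps or ambiguity codes are skipped. The function name is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes (an assumption of this sketch)
        compared += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in 100 sites: slightly above the raw 1% difference.
print(k2p_distance("A" * 100, "G" + "A" * 99))
```

A pairwise value from such a function could then be compared with the 2.2% figure mentioned in the abstract, for example `k2p_distance(s1, s2) < 0.022`.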


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e6476 ◽  
Author(s):  
Andrinajoro R. Rakotoarivelo ◽  
Paul O’Donoghue ◽  
Michael W. Bruford ◽  
Yoshan Moodley

Background The bushbuck, Tragelaphus scriptus, is a widespread and ecologically diverse ungulate species complex within the spiral-horned antelopes. This species was recently found to consist of two genetically divergent but monophyletic lineages, which are paraphyletic at mitochondrial (mt)DNA owing to an ancient interspecific hybridization event. The Scriptus lineage (T. s. scriptus) inhabits the north-western half of the African continent while Sylvaticus (T. s. sylvaticus) is found in the south-eastern half. Here we test hypotheses of historical demography and adaptation in bushbuck using a higher-resolution framework, with four nuclear (MGF, PRKCI, SPTBN, and THY) and three new mitochondrial markers (cytochrome b, 12S rRNA, and 16S rRNA). Methods Genealogies were reconstructed for the mitochondrial and nuclear data sets, with the latter dated using fossil calibration points. We also inferred the demographic history of Scriptus and Sylvaticus using coalescent-based methods. To obtain an overview of the origins and ancestral colonisation routes of ancestral bushbuck sequences across geographic space, we conducted discrete Bayesian phylogeographic and statistical dispersal-vicariance analyses on our nuclear DNA data set. Results Both nuclear DNA and mtDNA support previous findings of two genetically divergent Sylvaticus and Scriptus lineages. The three mtDNA loci confirmed 15 of the previously defined haplogroups, including those with convergent phenotypes. However, the nuclear tree showed less phylogenetic resolution at the more derived parts of the genealogy, possibly due to incomplete lineage sorting of the slower evolving nuclear genome. The only exception to this was the montane Menelik’s bushbuck (Sylvaticus) of the Ethiopian highlands, which formed a monophyletic group at three of four nuclear DNA loci. We dated the coalescence of the two lineages to a common ancestor ∼2.54 million years ago. 
Both marker sets revealed similar demographic histories of constant population size over time. We show that the bushbuck likely originated in East Africa, with Scriptus dispersing to colonise suitable habitats west of the African Rift and Sylvaticus radiating from east of the Rift into southern Africa via a series of mainly vicariance events. Discussion Despite lower levels of genetic structure at nuclear loci, we confirmed the independent evolution of the Menelik’s bushbuck relative to the phenotypically similar montane bushbuck in East Africa, adding further weight to previous suggestions of convergent evolution within the bushbuck complex. Perhaps the most surprising result of our analysis was that both Scriptus and Sylvaticus populations remained relatively constant throughout the Pleistocene, which is remarkable given that this was a period of major climatic and tectonic change in Africa, and responsible for driving the evolution of much of the continent’s extant large mammalian diversity.


Author(s):  
Daniel Lukic ◽  
Jonas Eberle ◽  
Jana Thormann ◽  
Carolus Holzschuh ◽  
Dirk Ahrens

DNA-barcoding and DNA-based species delimitation are major tools in DNA taxonomy. Sampling has been a central debate in this context, because the geographical composition of samples affects the accuracy and performance of DNA-barcoding. Performance of complex DNA-based species delimitation is to be tested under simpler conditions in the absence of geographic sampling bias. Here, we present an empirical data set sampled from a single locality in a Southeast-Asian biodiversity hotspot (Laos: Phou Pan mountain). We investigate the performance of various species delimitation approaches on a megadiverse assemblage of herbivorous chafer beetles (Coleoptera: Scarabaeidae) to infer whether species delimitation suffers in the same way from exaggerated infraspecific variation despite the lack of geographic genetic variation that led to inconsistencies between entities from DNA-based and morphology-based species inference in previous studies. For this purpose, a 658 bp fragment of the mitochondrial cytochrome c oxidase subunit 1 (cox1) was analysed for a total of 186 individuals of 56 morphospecies. Tree-based and distance-based species delimitation methods were used. All approaches showed a rather limited match ratio (max. 77%) with morphospecies. PTP and TCS predominantly over-split morphospecies, while 3% clustering and ABGD also lumped several species into one entity. ABGD revealed the highest congruence between molecular operational taxonomic units (MOTUs) and morphospecies. Disagreements between morphospecies and MOTUs were discussed in the context of historically acquired geographic genetic differentiation, incomplete lineage sorting, and hybridization. The study once again highlights how important morphology still is in order to correctly interpret the results of molecular species delimitation.
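The comparison of morphospecies with MOTUs described here (matches, over-splitting, lumping) can be sketched as a small bookkeeping exercise. The Python example below is illustrative only. The category names, the function names and the toy assignments are hypothetical, and the match ratio is computed as 2 x N_match / (N_MOTU + N_morphospecies), one definition used in the species-delimitation literature; that the study used exactly this formula is an assumption.

```python
from collections import defaultdict


def classify_congruence(assignments):
    """assignments: iterable of (morphospecies, motu) pairs, one per specimen.

    Returns (categories, n_motus), where categories maps each morphospecies to
    'match', 'split', 'lumped' or 'mixed' (both split and lumped).
    """
    motus_of = defaultdict(set)    # morphospecies -> MOTUs containing its specimens
    species_in = defaultdict(set)  # MOTU -> morphospecies it contains
    for species, motu in assignments:
        motus_of[species].add(motu)
        species_in[motu].add(species)

    categories = {}
    for species, motus in motus_of.items():
        split = len(motus) > 1
        lumped = any(len(species_in[m]) > 1 for m in motus)
        if split and lumped:
            categories[species] = "mixed"
        elif split:
            categories[species] = "split"
        elif lumped:
            categories[species] = "lumped"
        else:
            categories[species] = "match"
    return categories, len(species_in)


def match_ratio(categories, n_motus):
    """Assumed definition: 2 * matches / (number of MOTUs + number of morphospecies)."""
    n_match = sum(1 for c in categories.values() if c == "match")
    return 2 * n_match / (n_motus + len(categories))


# Hypothetical specimens: sp1 matches, sp2 is split, sp3 and sp4 share one MOTU.
toy = [("sp1", "M1"), ("sp1", "M1"), ("sp2", "M2"), ("sp2", "M3"),
       ("sp3", "M4"), ("sp4", "M4")]
cats, n_motus = classify_congruence(toy)
print(cats, match_ratio(cats, n_motus))
```

The same bookkeeping applies to BINs: a morphospecies in the "split" category corresponds to BIN splitting, and one in the "lumped" category to BIN sharing.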


AoB Plants ◽  
2020 ◽  
Vol 12 (3) ◽  
Author(s):  
Nannie L Persson ◽  
Ingrid Toresen ◽  
Heidi Lie Andersen ◽  
Jenny E E Smedmark ◽  
Torsten Eriksson

Abstract The genus Potentilla (Rosaceae) has been subjected to several phylogenetic studies, but resolving its evolutionary history has proven challenging. Previous analyses recovered six informally named groups: the Argentea, Ivesioid, Fragarioides, Reptans, Alba and Anserina clades, but the relationships among some of these clades differ between data sets. The Reptans clade, which includes the type species of Potentilla, has been observed to shift position between plastid and nuclear ribosomal data sets. We studied this incongruence by analysing four low-copy nuclear markers, in addition to chloroplast and nuclear ribosomal data, with a set of Bayesian phylogenetic and Multispecies Coalescent (MSC) analyses. A selective taxon removal strategy demonstrated that the included representatives from the Fragarioides clade, P. dickinsii and P. fragarioides, were the main sources of the instability seen in the trees. The Fragarioides species showed different relationships in each gene tree, and were only supported as a monophyletic group in a single marker when the Reptans clade was excluded from the analysis. The incongruences could not be explained by allopolyploidy, but rather by homoploid hybridization, incomplete lineage sorting or taxon sampling effects. When P. dickinsii and P. fragarioides were removed from the data set, a fully resolved, supported backbone phylogeny of Potentilla was obtained in the MSC analysis. Additionally, indications of autopolyploid origins of the Reptans and Ivesioid clades were discovered in the low-copy gene trees.


Author(s):  
Diego F Morales-Briones ◽  
Gudrun Kadereit ◽  
Delphine T Tefarikis ◽  
Michael J Moore ◽  
Stephen A Smith ◽  
...  

Abstract Gene tree discordance in large genomic data sets can be caused by evolutionary processes such as incomplete lineage sorting and hybridization, as well as model violation, and errors in data processing, orthology inference, and gene tree estimation. Species tree methods that identify and accommodate all sources of conflict are not available, but a combination of multiple approaches can help tease apart alternative sources of conflict. Here, using a phylotranscriptomic analysis in combination with reference genomes, we test a hypothesis of ancient hybridization events within the plant family Amaranthaceae s.l. that was previously supported by morphological, ecological, and Sanger-based molecular data. The data set included seven genomes and 88 transcriptomes, 17 generated for this study. We examined gene-tree discordance using coalescent-based species trees and network inference, gene tree discordance analyses, site pattern tests of introgression, topology tests, synteny analyses, and simulations. We found that a combination of processes might have generated the high levels of gene tree discordance in the backbone of Amaranthaceae s.l. Furthermore, we found evidence that three consecutive short internal branches produce anomalous trees contributing to the discordance. Overall, our results suggest that Amaranthaceae s.l. might be a product of an ancient and rapid lineage diversification, and remains, and probably will remain, unresolved. This work highlights the potential problems of identifiability associated with the sources of gene tree discordance including, in particular, phylogenetic network methods. Our results also demonstrate the importance of thoroughly testing for multiple sources of conflict in phylogenomic analyses, especially in the context of ancient, rapid radiations. We provide several recommendations for exploring conflicting signals in such situations. 
[Amaranthaceae; gene tree discordance; hybridization; incomplete lineage sorting; phylogenomics; species network; species tree; transcriptomics.]


Genome ◽  
2018 ◽  
Vol 61 (1) ◽  
pp. 21-31 ◽  
Author(s):  
Jason Gibbs

There is an ongoing campaign to DNA barcode the world’s >20 000 bee species. Recent revisions of Lasioglossum (Dialictus) (Hymenoptera: Halictidae) for Canada and the eastern United States were completed using integrative taxonomy. DNA barcode data from 110 species of L. (Dialictus) are examined for their value in identification and discovering additional taxonomic diversity. Specimen identification success was estimated using the best close match method. Error rates were 20% relative to current taxonomic understanding. Barcode Index Numbers (BINs) assigned using Refined Single Linkage Analysis (RESL) and barcode gaps using the Automatic Barcode Gap Discovery (ABGD) method were also assessed. RESL was incongruent for 44.5% of species, although some cryptic diversity may exist. Forty-three of 110 species were part of merged BINs with multiple species. The barcode gap is non-existent for the data set as a whole, and ABGD showed levels of discordance similar to RESL. The viridatum species-group is particularly problematic, so that DNA barcodes alone would be misleading for species delimitation and specimen identification. Character-based methods using fixed nucleotide substitutions could improve specimen identification success in some cases. The use of DNA barcoding for species discovery for standard taxonomic practice in the absence of a well-defined barcode gap is discussed.


2020 ◽  
Author(s):  
Michael J. Sanderson ◽  
Michelle M. McMahon ◽  
Mike Steel

Abstract Terraces in phylogenetic tree space are sets of trees with identical optimality scores for a given data set, arising from missing data. These were first described for multilocus phylogenetic data sets in the context of maximum parsimony inference and maximum likelihood inference under certain model assumptions. Here we show how the mathematical properties that lead to terraces extend to gene tree - species tree problems in which the gene trees are incomplete. Inference of species trees from either sets of gene family trees subject to duplication and loss, or allele trees subject to incomplete lineage sorting, can exhibit terraces in their solution space. First, we show conditions that lead to a new kind of terrace, which stems from subtree operations that appear in reconciliation problems for incomplete trees. Then we characterize when terraces of both types can occur when the optimality criterion for tree search is based on duplication, loss or deep coalescence scores. Finally, we examine the impact of assumptions about the causes of losses: whether they are due to imperfect sampling or true evolutionary deletion.


2017 ◽  
Author(s):  
Graham Jones

Abstract This paper focuses on the problem of estimating a species tree from multilocus data in the presence of incomplete lineage sorting and migration. We develop a mathematical model similar to IMa2 (Hey 2010) for the relevant evolutionary processes which allows both the population size parameters and the migration rates between pairs of species tree branches to be integrated out. We then describe a BEAST2 package, DENIM, which is based on this model and which uses an approximation to sample from the posterior. The approximation is based on the assumption that migrations are rare, and it only samples from certain regions of the posterior which seem likely given this assumption. The method breaks down if there is a lot of migration. Using simulations, Leaché et al. (2014) showed that migration causes problems for species tree inference using the multispecies coalescent when it is present but ignored. We re-analyze these simulated data to explore DENIM’s performance, and demonstrate substantial improvements over *BEAST. We also re-analyze an empirical data set. [isolation-with-migration; incomplete lineage sorting; multispecies coalescent; species tree; phylogenetic analysis; Bayesian; Markov chain Monte Carlo]


2018 ◽  
Author(s):  
Andrinajoro R Rakotoarivelo ◽  
Yoshan Moodley

Background. The bushbuck, Tragelaphus scriptus, is the most widespread and ecologically diverse ungulate species complex within the spiral-horned antelopes, occurring in approximately 73% of the total land area of sub-Saharan Africa. This species was found to consist of two genetically divergent lineages based on the mitochondrial (mt)DNA control region. One lineage inhabited the north-western half of the African continent (T. scriptus) while the other lineage (T. sylvaticus) was found in the south-eastern half. The complex was also found to comprise an unprecedented example of 23 phylogenetically distinct groups (‘ecotypes’), with montane and desert phenotypes potentially resulting from convergent evolution. The current study aims to test hypotheses regarding historical demography and adaptation of bushbuck using a higher-resolution framework, with faster evolving nuclear markers (MGF, PRKCI, SPTBN, and THY) as well as three further mitochondrial markers (cytochrome b, 12S rRNA, and 16S rRNA). Methods. Genealogies were reconstructed for the nuclear and mitochondrial data sets and for each gene independently to test the non-monophyly of the bushbuck complex in a multilocus framework. In addition, we reconstruct the phylogeographic history of the bushbuck complex by applying a Bayesian discrete phylogeographic approach to our nucDNA data set to investigate its geographic diffusion and ancestral sequence location. Results. We uncovered two evolutionarily divergent and geographically restricted lineages (Sylvaticus and Scriptus) of bushbuck using phylogenetics. Molecular dating indicates that these lineages last shared a common ancestor ∼2.54 million years ago. Summary statistics and analysis of the frequency distributions of DNA polymorphisms provide no support for population expansion. Both BSPs and EBSPs indicate that the Scriptus and Sylvaticus lineages have remained relatively stable during the last 225-450 kyr. Discussion. 
Both nucDNA and mtDNA support previous findings of two genetically divergent Sylvaticus and Scriptus lineages, despite them coming into secondary contact in several geographic regions. The three mtDNA loci confirmed 15 of the previously defined ecotypes, including those with convergent phenotypes. However, the nuclear tree showed less phylogenetic resolution at the more derived parts of the genealogy, possibly due to incomplete lineage sorting of the slower evolving nuclear genome. The only exception to this was the montane ecotype meneliki of the Ethiopian highlands, which formed a monophyletic group at three of the four nucDNA loci. The independent evolution of this group relative to phenotypically similar montane ecotypes in Africa confirms previous suggestions of convergence within the bushbuck complex.


2019 ◽  
Author(s):  
Zhen Cao ◽  
Xinhao Liu ◽  
Huw A. Ogilvie ◽  
Zhi Yan ◽  
Luay Nakhleh

Abstract Phylogenetic networks extend trees to enable simultaneous modeling of both vertical and horizontal evolutionary processes. PhyloNet is a software package that has been under constant development for over 10 years and includes a wide array of functionalities for inferring and analyzing phylogenetic networks. These functionalities differ in terms of the input data they require, the criteria and models they employ, and the types of information they allow users to infer about the networks beyond their topologies. Furthermore, PhyloNet includes functionalities for simulating synthetic data on phylogenetic networks, quantifying the topological differences between phylogenetic networks, and evaluating evolutionary hypotheses given in the form of phylogenetic networks. In this paper, we use a simulated data set to illustrate the use of several of PhyloNet’s functionalities and make recommendations on how to analyze data sets and interpret the results when using these functionalities. All inference methods that we illustrate are incomplete lineage sorting (ILS) aware; that is, they account for the potential of ILS in the data while inferring the phylogenetic network. While the models do not include gene duplication and loss, we discuss how the methods can be used to analyze data in the presence of polyploidy. The concept of species is irrelevant for the computational analyses enabled by PhyloNet in that species-individuals mappings are user-defined. Consequently, none of the functionalities in PhyloNet deals with the task of species delimitation. In this sense, the data being analyzed could come from different individuals within a single species, in which case population structure along with potential gene flow is inferred (assuming the data has sufficient signal), or from different individuals sampled from different species, in which case the species phylogeny is being inferred.

