Calibrating the taxonomy of a megadiverse insect family: 3000 DNA barcodes from geometrid type specimens (Lepidoptera, Geometridae)

Genome ◽  
2016 ◽  
Vol 59 (9) ◽  
pp. 671-684 ◽  
Author(s):  
Axel Hausmann ◽  
Scott E. Miller ◽  
Jeremy D. Holloway ◽  
Jeremy R. deWaard ◽  
David Pollock ◽  
...  

It is essential that any DNA barcode reference library be based upon correctly identified specimens. The Barcode of Life Data Systems (BOLD) requires information such as images, geo-referencing, and details on the museum holding the voucher specimen for each barcode record to aid recognition of potential misidentifications. Nevertheless, there are misidentifications and incomplete identifications (e.g., to a genus or family) on BOLD, mainly for species from tropical regions. Unfortunately, experts are often unavailable to correct taxonomic assignments due to time constraints and the lack of specialists for many groups and regions. However, considerable progress could be made if barcode records were available for all type specimens. As a result of recent improvements in analytical protocols, it is now possible to recover barcode sequences from museum specimens that date to the start of taxonomic work in the 18th century. The present study discusses success in the recovery of DNA barcode sequences from 2805 type specimens of geometrid moths which represent 1965 species, corresponding to about 9% of the 23 000 described species in this family worldwide and including 1875 taxa represented by name-bearing types. Sequencing success was high (73% of specimens), even for specimens that were more than a century old. Several case studies are discussed to show the efficiency, reliability, and sustainability of this approach.

Author(s):  
Takeru Nakazato

DNA barcoding has become widely used by biodiversity and molecular biology researchers to identify species and analyze their phylogeny. Recently, DNA metabarcoding and environmental DNA (eDNA) technologies have been developed by extending the concept of DNA barcoding. These techniques analyze the diversity and quantity of organisms within an environment by detecting biogenic DNA in water and soil. eDNA analysis is particularly popular for monitoring fish species living in rivers and lakes (Takahara et al. 2012). BOLD Systems (Barcode of Life Data Systems, Ratnasingham and Hebert 2007) is a database for DNA barcoding that archives 8.5 million barcodes (as of August 2020) together with metadata on the voucher specimen from which each barcode sequence is derived, including taxonomy, country of collection, and the museum holding the voucher (e.g. https://www.boldsystems.org/index.php/Public_RecordView?processid=TRIBS054-16). Many barcoding data are also submitted to GenBank (Sayers et al. 2020), a DNA sequence database managed by NCBI (National Center for Biotechnology Information, US). The number of DNA barcode records, i.e. COI (cytochrome c oxidase I) gene sequences for animals, has grown significantly (Porter and Hajibabaei 2018). BOLD imports DNA barcoding data from GenBank, and many DNA barcoding records in GenBank are also assigned BOLD IDs. Nevertheless, researchers still have to consult both BOLD and GenBank when performing DNA barcoding. I have previously investigated the registration of DNA barcoding data in GenBank, especially the association with BOLD, using insects and flowering plants as examples (Nakazato 2019). Here, I surveyed the number of species covered by BOLD and GenBank. I used fish data as an example because eDNA research is particularly focused on fish. I downloaded all GenBank files for vertebrates from NCBI FTP (File Transfer Protocol) sites (as of November 2019). Of the GenBank fish entries, 86,958 (7.3%) were assigned BOLD identifiers (IDs). 
The NCBI taxonomy database has registrations for 39,127 species of fish, and 20,987 scientific names at the species level (i.e., excluding names that included sp., cf. or aff.). GenBank entries with BOLD IDs covered 11,784 species (30.1%) and 8,665 species-level names (41.3%). I also obtained the complete "specimens and sequences combined data" for fish from BOLD Systems (as of November 2019). In BOLD, 273,426 entries are registered as fish. Of these, 211,589 entries were assigned GenBank IDs, i.e. had values in the "genbank_accession" column, and 121,748 entries were imported from GenBank, i.e. had the description "Mined from GenBank, NCBI" in the "institution_storing" column. The BOLD data covered 18,952 fish species and 15,063 species-level names, but 35,500 entries had no species-level name and 22,123 entries lacked even a family-level name. At the species level, 8,067 names co-occurred in GenBank and BOLD, with 6,997 BOLD-specific names and 599 GenBank-specific names. GenBank has 425,732 fish entries with voucher IDs, of which 340,386 were not assigned a BOLD ID. Of these 340,386 entries, 43,872 are COI gene records, which could be candidates for DNA barcodes. These candidates include 4,201 species that are not included in BOLD; adding these data will therefore enable us to identify 19,863 fish to the species level. For researchers, it would be very useful if both BOLD and GenBank DNA barcoding data could be searched in one place. For this purpose, it is necessary to integrate data from the two databases. Much biodiversity data is recorded according to the Darwin Core standard, while DNA sequencing data are sometimes integrated or cross-linked using RDF (Resource Description Framework). 
Integrating these data may not be technically difficult, but the two databases rely on different taxonomic references, the EoL (Encyclopedia of Life) for BOLD and the NCBI taxonomy for GenBank, and the differences between these taxonomic systems make it difficult to match records by scientific name. GenBank has fields for the latitude and longitude at which specimens were sampled, and Porter and Hajibabaei (2018) argue that this information should be enhanced. However, this information may be better described in specimen and occurrence databases. Integrating barcoding data with specimen and occurrence data will solve these problems. Most importantly, it will save researchers from having to register the same information in multiple databases. In the field of biodiversity, DNA barcode sequences may so far have been the only gene sequences to receive attention and use. The museomics community regards museum-preserved specimens as rich resources for DNA studies because their biodiversity information can accompany the extraction and analysis of their DNA (Nakazato 2018). GenBank is useful for biodiversity studies due to its low rate of mislabelling (Leray et al. 2019). In the future, we will be working with a variety of DNA, including genomes from museum specimens as well as DNA barcoding. This will require more integrated use of biodiversity information and DNA sequence data. This integration is also of interest to molecular biologists and bioinformaticians.
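
The comparison of species-level names between BOLD and GenBank described in this abstract can be illustrated with a short Python sketch. It is not the author's pipeline: the function names and the toy name lists are hypothetical, and the filter only drops names carrying the qualifiers sp., cf. or aff. that the abstract mentions.

```python
import re

# Open-nomenclature qualifiers that mark a name as not resolved to species level.
_QUALIFIER = re.compile(r"\b(sp|cf|aff)\.")


def is_species_level(name: str) -> bool:
    """Return True for a name with at least two words and no sp., cf. or aff."""
    return len(name.split()) >= 2 and not _QUALIFIER.search(name)


def compare_names(bold_names, genbank_names):
    """Split species-level names into shared, BOLD-only and GenBank-only sets."""
    bold = {n for n in bold_names if is_species_level(n)}
    genbank = {n for n in genbank_names if is_species_level(n)}
    return bold & genbank, bold - genbank, genbank - bold


# Toy input, invented for illustration only (not data from the survey).
bold_names = ["Danio rerio", "Salmo trutta", "Cyprinus sp.", "Gadus morhua"]
genbank_names = ["Danio rerio", "Gadus morhua", "Esox lucius", "Salmo cf. trutta"]

shared, bold_only, genbank_only = compare_names(bold_names, genbank_names)
print(sorted(shared), sorted(bold_only), sorted(genbank_only))
```

Exact string matching is the simplest possible choice here. As the abstract notes, the two databases follow different taxonomic references, so a real comparison would also need synonym resolution.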


2021 ◽  
Vol 38 ◽  
pp. 00087
Author(s):  
Elena Nikitina ◽  
Abdurashid Rakhmatov

Species-level diversity is the reference unit for biodiversity accounting; it should be systematized and include full information about each species. Reliable identification of any species is critical for large-scale biodiversity monitoring and conservation. A DNA barcode is a DNA sequence that identifies a species by comparing the sequence from an unknown specimen with the barcodes in a database of known species. Accurate identification of important plants is essential for their conservation and inventory. Here, species diversity is assessed by DNA barcoding, using representatives of the subtribe Nepetinae (Lamiaceae) growing in Uzbekistan as an example. The study aimed to identify important indigenous plants using markers from the nuclear (ITS) and plastid (matK, rbcL, trnL-F) genomes. This work demonstrates the phylogenetic relationships of some genera within the subtribe Nepetinae Coss. & Germ. (Lamiaceae), based on the ITS locus. All results indicate that DNA barcoding can be used to reliably identify important plants, to inventory the botanical resources of Uzbekistan and to create a reference library of DNA barcodes. A combination of three or four loci is therefore a good candidate for this approach.


2020 ◽  
Vol 367 (17) ◽  
Author(s):  
Mengyan Liu ◽  
Yi Zhao ◽  
Yuzhe Sun ◽  
Ping Wu ◽  
Shiliang Zhou ◽  
...  

The presence of diatoms in a victim's internal organs has long been regarded as gold-standard biological evidence of drowning. With the advent of DNA metabarcoding, this idea can now be put into practice. Unfortunately, DNA barcodes for diatoms are far from being applicable, because there is neither a consensus on the barcode nor a reliable reference library. In this study we tested 23 pairs of primers, including two new primer pairs, Baci18S (V4 of 18S) and BacirbcL (central region of rbcL), for amplifying fragments of 16S/18S, 23S/28S, COI, ITS and rbcL. A total of five pairs of primers performed satisfactorily for diatoms. We used three of them, 18S605 (V2 + V3 of 18S), Baci18S and BacirbcL, to barcode four water samples on a next-generation sequencing (NGS) platform. The results showed that these primers worked well for NGS metabarcoding of diatoms. We suggest that 18S605, Baci18S and BacirbcL be adopted as barcodes for diatoms and that the corresponding primer pairs be used. Considering that a fairly high proportion of the sequences deposited in GenBank were mislabeled, the most urgent task for DNA barcoding of diatoms is to create standard sequences using correctly identified specimens, ideally type specimens.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Monica R. Young ◽  
Jeremy R. deWaard ◽  
Paul D. N. Hebert

Although mites (Acari) are abundant in many terrestrial and freshwater ecosystems, their diversity is poorly understood. Since most mite species can be distinguished by variation in the DNA barcode region of cytochrome c oxidase I, the Barcode Index Number (BIN) system provides a reliable species proxy that facilitates large-scale surveys. Such analysis reveals many new BINs that can only be identified as Acari until they are examined by a taxonomic specialist. This study demonstrates that the Barcode of Life Datasystem’s identification engine (BOLD ID) generally delivers correct ordinal and family assignments from both full-length DNA barcodes and their truncated versions gathered in metabarcoding studies. This result was demonstrated by examining BOLD ID’s capacity to assign 7021 mite BINs to their correct order (4) and family (189). Identification success improved with sequence length and taxon coverage but varied among orders indicating the need for lineage-specific thresholds. A strict sequence similarity threshold (86.6%) prevented all ordinal misassignments and allowed the identification of 78.6% of the 7021 BINs. However, higher thresholds were required to eliminate family misassignments for Sarcoptiformes (89.9%), and Trombidiformes (91.4%), consequently reducing the proportion of BINs identified to 68.6%. Lineages with low barcode coverage in the reference library should be prioritized for barcode library expansion to improve assignment success.
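
The lineage-specific thresholds quoted in this abstract can be expressed as a small decision rule. The Python sketch below is only an illustration and not BOLD ID's actual logic. The function name is hypothetical, the use of `>=` at the cutoff is an assumption, and so is the fallback to the ordinal threshold for orders whose family threshold the abstract does not state.

```python
ORDER_THRESHOLD = 86.6  # % sequence similarity, from the abstract

# Family-level thresholds quoted in the abstract. Other orders fall back to
# ORDER_THRESHOLD, which is an assumption of this sketch.
FAMILY_THRESHOLDS = {"Sarcoptiformes": 89.9, "Trombidiformes": 91.4}


def assign(similarity, hit_order, hit_family):
    """Return (order, family); an element is None when the match is too weak."""
    if similarity < ORDER_THRESHOLD:
        return None, None
    family_cutoff = FAMILY_THRESHOLDS.get(hit_order, ORDER_THRESHOLD)
    family = hit_family if similarity >= family_cutoff else None
    return hit_order, family


# The same 90.5% match is accepted to family in one order but not the other.
print(assign(90.5, "Sarcoptiformes", "Oribatulidae"))
print(assign(90.5, "Trombidiformes", "Tetranychidae"))
print(assign(85.0, "Trombidiformes", "Tetranychidae"))
```

Keeping the thresholds in a dictionary keyed by order makes it straightforward to add further lineage-specific values as reference coverage improves.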


ZooKeys ◽  
2018 ◽  
Vol 759 ◽  
pp. 57-80 ◽  
Author(s):  
Michael J. Raupach ◽  
Karsten Hannig ◽  
Jérôme Morinière ◽  
Lars Hendrich

The genus Amara Bonelli, 1810 is a very speciose and taxonomically difficult genus of the Carabidae. The identification of many of the species is accomplished with considerable difficulty, in particular for females and immature stages. In this study the effectiveness of DNA barcoding, the most popular method for molecular species identification, was examined to discriminate various species of this genus from Central Europe. DNA barcodes from 690 individuals and 47 species were analysed, including sequences from previous studies and more than 350 newly generated DNA barcodes. Our analysis revealed unique BINs for 38 species (81%). Interspecific K2P distances below 2.2% were found for three species pairs and one species trio, including haplotype sharing between Amara alpina/Amara torrida and Amara communis/Amara convexior/Amara makolskii. This study represents another step in generating an extensive reference library of DNA barcodes for carabids, highly valuable bioindicators for characterizing disturbances in various habitats.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11157
Author(s):  
Jacopo D’Ercole ◽  
Vlad Dincă ◽  
Paul A. Opler ◽  
Norbert Kondla ◽  
Christian Schmidt ◽  
...  

Although the butterflies of North America have received considerable taxonomic attention, overlooked species and instances of hybridization continue to be revealed. The present study assembles a DNA barcode reference library for this fauna to identify groups whose patterns of sequence variation suggest the need for further taxonomic study. Based on 14,626 records from 814 species, DNA barcodes were obtained for 96% of the fauna. The maximum intraspecific distance averaged 1/4 the minimum distance to the nearest neighbor, producing a barcode gap in 76% of the species. Most species (80%) were monophyletic; the others were para- or polyphyletic. Although 15% of currently recognized species shared barcodes, the incidence of such taxa was far higher in regions exposed to Pleistocene glaciations than in those that were ice-free. Nearly 10% of species displayed high intraspecific variation (>2.5%), suggesting the need for further investigation to assess potential cryptic diversity. Aside from aiding the identification of all life stages of North American butterflies, the reference library has provided new perspectives on the incidence of both cryptic and potentially over-split species, setting the stage for future studies that can further explore the evolutionary dynamics of this group.
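
The barcode gap mentioned in this abstract compares, for each species, the maximum intraspecific distance with the minimum distance to any other species. A minimal Python sketch of that comparison follows. The function name, the labels and the distance matrix are hypothetical, and this is not the study's own analysis.

```python
from itertools import combinations


def barcode_gaps(labels, dist):
    """Map each species to (max intraspecific, min interspecific, has_gap)."""
    max_intra = {s: 0.0 for s in set(labels)}
    min_inter = {s: float("inf") for s in set(labels)}
    for i, j in combinations(range(len(labels)), 2):
        d = dist[i][j]
        if labels[i] == labels[j]:
            max_intra[labels[i]] = max(max_intra[labels[i]], d)
        else:
            # The pair counts as a nearest-neighbour candidate for both species.
            for s in (labels[i], labels[j]):
                min_inter[s] = min(min_inter[s], d)
    return {s: (max_intra[s], min_inter[s], max_intra[s] < min_inter[s])
            for s in max_intra}


# Toy pairwise distances in percent, invented for illustration only.
labels = ["A", "A", "B", "B", "C"]
dist = [
    [0.0, 0.5, 2.0, 2.5, 6.0],
    [0.5, 0.0, 2.2, 2.4, 6.1],
    [2.0, 2.2, 0.0, 3.0, 5.0],
    [2.5, 2.4, 3.0, 0.0, 5.5],
    [6.0, 6.1, 5.0, 5.5, 0.0],
]
gaps = barcode_gaps(labels, dist)
for species in sorted(gaps):
    print(species, gaps[species])
```

Note that a species represented by a single record, such as "C" here, gets a maximum intraspecific distance of zero and therefore trivially shows a gap, so singletons are usually reported separately.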


2019 ◽  
Vol 7 ◽  
Author(s):  
Gunnhild Marthinsen ◽  
Siri Rui ◽  
Einar Timdal

DNA barcodes are increasingly being used for species identification amongst the lichenised fungi. This paper presents a dataset aiming to provide an authoritative DNA barcode sequence library for a wide array of Nordic lichens. We present 1324 DNA barcode sequences (nrITS) for 507 species in 175 genera and 25 orders. Thirty-eight species are new to GenBank and, for 25 additional species, ITS sequences are here presented for the first time. The dataset covers 20–21% of the Nordic lichenised species. Barcode gap analyses are given and discussed for the three genera Cladonia, Ramalina and Umbilicaria. The new combination Bryobilimbia fissuriseda (Poelt) Timdal, Marthinsen & Rui is proposed for Mycobilimbia fissuriseda, and Nordic material of the species currently referred to as Pseudocyphellaria crocata and Psoroma tenue ssp. boreale is shown to belong in Pseudocyphellaria citrina and Psoroma cinnamomeum, respectively.


Genome ◽  
2017 ◽  
Vol 60 (3) ◽  
pp. 248-259 ◽  
Author(s):  
Derek S. Sikes ◽  
Matthew Bowser ◽  
John M. Morton ◽  
Casey Bickford ◽  
Sarah Meierotto ◽  
...  

Climate change may result in ecological futures with novel species assemblages, trophic mismatch, and mass extinction. Alaska has a limited taxonomic workforce to address these changes. We are building a DNA barcode library to facilitate a metabarcoding approach to monitoring non-marine arthropods. Working with the Canadian Centre for DNA Barcoding, we obtained DNA barcodes from recently collected and authoritatively identified specimens in the University of Alaska Museum (UAM) Insect Collection and the Kenai National Wildlife Refuge collection. We submitted tissues from 4776 specimens, of which 81% yielded DNA barcodes representing 1662 species and 1788 Barcode Index Numbers (BINs), of primarily terrestrial, large-bodied arthropods. This represents 84% of the species available for DNA barcoding in the UAM Insect Collection. There are now 4020 Alaskan arthropod species represented by DNA barcodes, after including all records in Barcode of Life Data Systems (BOLD) of species that occur in Alaska — i.e., 48.5% of the 8277 Alaskan, non-marine-arthropod, named species have associated DNA barcodes. An assessment of the identification power of the library in its current state yielded fewer species-level identifications than expected, but the results were not discouraging. We believe we are the first to deliberately begin development of a DNA barcode library of the entire arthropod fauna for a North American state or province. Although far from complete, this library will become increasingly valuable as more species are added and costs to obtain DNA sequences fall.


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5013 ◽  
Author(s):  
Lijuan Wang ◽  
Zhihao Wu ◽  
Mengxia Liu ◽  
Wei Liu ◽  
Wenxi Zhao ◽  
...  

Rongcheng Bay is a coastal bay of the Northern Yellow Sea, China. To investigate and monitor the fish resources in Rongcheng Bay, 187 specimens from 41 different species belonging to 28 families in nine orders were DNA-barcoded using the mitochondrial cytochrome c oxidase subunit I gene (COI). Most of the fish species could be discriminated using this COI sequence with the exception of Cynoglossus joyneri and Cynoglossus lighti. The average GC% content of the 41 fish species was 47.3%. The average Kimura 2-parameter genetic distances within the species, genera, families, and orders were 0.21%, 5.28%, 21.30%, and 23.63%, respectively. Our results confirmed that the use of combined morphological and DNA barcoding identification methods facilitated fish species identification in Rongcheng Bay, and also established a reliable DNA barcode reference library for these fish. DNA barcodes will contribute to future efforts to achieve better monitoring, conservation, and management of fisheries in this area.
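
The Kimura 2-parameter (K2P) distance and GC content reported in this abstract can both be computed directly from aligned sequences. The Python sketch below uses the standard K2P formula d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. The function names and the toy sequences are hypothetical, and the study itself presumably used dedicated software.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log or math.sqrt raises ValueError for saturated sequence pairs.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in BASES]
    return sum(b in "GC" for b in bases) / len(bases)


# Toy sequences, invented for illustration: one transition, one transversion.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "GAGTACGTACGTACGTACGT"
print(k2p_distance(seq_a, seq_b), gc_content(seq_a))
```

Averaging such pairwise distances within species, genera, families and orders gives summary values of the kind quoted in the abstract.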


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12325
Author(s):  
Lu Gong ◽  
Danchun Zhang ◽  
Xiaoxia Ding ◽  
Juan Huang ◽  
Wan Guan ◽  
...  

Background Amomum villosum Lour. is the plant that produces the famous traditional Chinese medicine Amomi Fructus. Frequent habitat destruction seriously threatens A. villosum germplasm resources. Genetic diversity is very important to the optimization of germplasm resources and population protection, but the range of inherited traits within A. villosum is unclear. In this study, we analyzed the genetic diversity and genetic structures of A. villosum populations in Guangdong and constructed a local reference DNA barcode library as a resource for conservation efforts. Methods DNA barcoding and Inter-Simple Sequence Repeat (ISSR) markers were used to investigate the population genetics of A. villosum. Five universal DNA barcodes were amplified and used in the construction of a DNA barcode reference library. Parameters including percentage of polymorphic sites (PPB), number of alleles (Na), effective number of alleles (Ne), Nei’s gene diversity index (H), and Shannon’s polymorphism information index (I) were calculated for the assessment of genetic diversity. Genetic structure was revealed by measuring Nei’s gene differentiation coefficient (Gst), total population genetic diversity (Ht), intra-group genetic diversity (Hs), and gene flow (Nm). Analysis of molecular variance (AMOVA), Mantel tests, unweighted pair-group method with arithmetic mean (UPGMA) dendrogram, and principal co-ordinates (PCoA) analysis were used to elucidate the genetic differentiation and relationship among populations. Results A total of 531 sequences were obtained from the five DNA barcodes with no variable sites from any of the barcode sequences. A total of 66 ISSR bands were generated from A. villosum populations using the six selected ISSR primers; 56 bands (84.85%) across all seven A. villosum populations were polymorphic. The A. villosum populations showed high genetic diversity (H = 0.3281, I = 0.4895), whereas the gene flow was weak (Nm = 0.6143). 
Gst (0.4487) and AMOVA analysis indicated that there is obvious genetic differentiation among A. villosum populations and that more genetic variation existed within each population. The populations were relatively closely related, as the genetic distances ranged from 0.0844 to 0.3347.
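
The reported gene flow and percentage of polymorphic bands can be checked with simple arithmetic. The Python sketch below assumes Nei's definition Gst = (Ht - Hs) / Ht and the commonly used estimate Nm = 0.5 (1 - Gst) / Gst. The abstract does not state which formula the authors applied, so the second is an assumption, and the function names are hypothetical.

```python
def gst_from_diversity(ht, hs):
    """Nei's gene differentiation coefficient from total and within-group diversity."""
    return (ht - hs) / ht


def gene_flow(gst):
    """Gene flow estimate Nm = 0.5 * (1 - Gst) / Gst (assumed formula)."""
    return 0.5 * (1 - gst) / gst


nm = gene_flow(0.4487)           # Gst as reported in the abstract
polymorphic_pct = 56 / 66 * 100  # polymorphic bands out of all ISSR bands

print(round(nm, 4), round(polymorphic_pct, 2))
```

Under this assumed formula the result rounds to the Nm of 0.6143 given in the abstract, and 56 of 66 bands corresponds to the reported 84.85%.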

