A reference library for the identification of Canadian invertebrates: 1.5 million DNA barcodes, voucher specimens, and genomic samples

2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Jeremy R. deWaard ◽  
Sujeevan Ratnasingham ◽  
Evgeny V. Zakharov ◽  
Alex V. Borisenko ◽  
Dirk Steinke ◽  
...  

The reliable taxonomic identification of organisms through DNA sequence data requires a well-parameterized library of curated reference sequences. However, it is estimated that just 15% of described animal species are represented in public sequence repositories. To begin to address this deficiency, we provide DNA barcodes for 1,500,003 animal specimens collected from 23 terrestrial and aquatic ecozones at sites across Canada, a nation that comprises 7% of the planet’s land surface. In total, 14 phyla, 43 classes, 163 orders, 1123 families, 6186 genera, and 64,264 Barcode Index Numbers (BINs; a proxy for species) are represented. Species-level taxonomy was available for 38% of the specimens, but higher proportions were assigned to a genus (69.5%) and a family (99.9%). Voucher specimens and DNA extracts are archived at the Centre for Biodiversity Genomics where they are available for further research. The corresponding sequence and taxonomic data can be accessed through the Barcode of Life Data System, GenBank, the Global Biodiversity Information Facility, and the Global Genome Biodiversity Network Data Portal.


2021 ◽  
Author(s):  
Manuela Mejía Estrada ◽  
Luz Fernanda Jiménez-Segura ◽  
Iván Soto Calderón

DNA barcoding was proposed in response to the mismatch between the small number of taxonomists and the large number of species. The method requires the construction of reference collections of DNA sequences that represent existing biodiversity. Freshwater fishes are key indicators for understanding biogeography around the world. Colombia, with 1610 species of freshwater fishes, is the second richest country in the world in this group. However, genetic information on these species remains limited. This contribution to a reference library of DNA barcodes for Colombian freshwater fishes highlights the importance of biological collections and seeks to strengthen the inventories and taxonomy of such collections in future studies. This dataset contributes to the knowledge on the DNA barcodes and occurrence records of 96 species of freshwater fishes from Colombia. The dataset adds 39 species to the BOLD public databases. Forty-nine specimens were collected in the Atrato basin and 708 in the Magdalena-Cauca basin between 2010 and 2020. Two species (Loricariichthys brunneus and Poecilia sphenops) are considered exotic to the Atrato, Cauca and Magdalena basins, and four species (Oncorhynchus mykiss, Oreochromis niloticus, Parachromis friedrichsthalii and Xiphophorus helleri) are exotic to Colombian hydrogeographic regions. All specimens are deposited in the CIUA collection at the University of Antioquia and have their DNA barcodes made publicly available in the Barcode of Life Data System (BOLD) online database; the distribution dataset can be freely accessed through the Global Biodiversity Information Facility (GBIF).


2021 ◽  
Vol 9 ◽  
Author(s):  
Leidys Murillo-Ramos ◽  
Pasi Sihvonen ◽  
Gunnar Brehm ◽  
Indiana Ríos-Malaver ◽  
Niklas Wahlberg

Molecular DNA sequence data allow unprecedented advances in biodiversity assessments, monitoring schemes and taxonomic works, particularly in poorly-explored areas. They allow, for instance, the sorting of material rapidly into operational taxonomic units (such as BINs - Barcode Index Numbers), sequences can be subject to diverse analyses and, with linked metadata and physical vouchers, they can be examined further by experts. However, a prerequisite for their exploitation is the construction of reference libraries of DNA sequences that represent the existing biodiversity. To achieve these goals for Geometridae (Lepidoptera) moths in Colombia, expeditions were carried out to 26 localities in the northern part of the country in 2015–2019. The aim was to collect specimens and sequence their DNA barcodes and to record a fraction of the species richness and occurrences in one of the most biodiversity-rich countries. These data are the beginning of an identification guide to Colombian geometrid moths, whose identities are currently often provisional only, being morpho species or operational taxonomic units (OTUs). Prior to the current dataset, 99 Geometridae sequences forming 44 BINs from Colombia were publicly available on the Barcode of Life Data System (BOLD), covering 20 species only. We enrich the Colombian Geometridae database significantly by including DNA barcodes, two nuclear markers, photos of vouchers and georeferenced occurrences of 281 specimens of geometrid moths from different localities. These specimens are classified into 80 genera. Analytical tools on BOLD clustered 157 of the mentioned sequences to existing BINs identified to species level, identified earlier by experts. Another 115 were assigned to BINs that were identified to genus or tribe level only. Eleven specimens did not match any existing BIN on BOLD and are, therefore, new additions to the database. It is likely that many BINs represent undescribed species. 
Nine short sequences (< 500bp) were not assigned to BINs, but identified to the lowest taxonomic category by expert taxonomists and with comparisons of type material photos. The released new genetic information will help to further progress the systematics of Geometridae. An illustrated catalogue of all new records allows validation of our identifications; it is also the first document of this kind for Colombian Geometridae. All specimens are deposited at the Museo de Zoología of Universidad de Sucre (MZUS), North Colombia. DNA BINs are reported in this study through dx.doi.org/10.5883/DS-GEOCO, the species occurrences are available on SIB Colombia https://sibcolombia.net/ and the Global Biodiversity Information Facility (GBIF) https://www.gbif.org/ through https://doi.org/10.15472/ucfmkh.


2018 ◽  
Vol 2 ◽  
pp. e26060
Author(s):  
Pamela Soltis

Digitized natural history data are enabling a broad range of innovative studies of biodiversity. Large-scale data aggregators such as the Global Biodiversity Information Facility (GBIF) and Integrated Digitized Biocollections (iDigBio) provide easy, global access to millions of specimen records contributed by thousands of collections. A developing community of eager users of specimen data (whether locality, image, trait, etc.) is perhaps unaware of the effort and resources required to curate specimens, digitize information, capture images, mobilize records, serve the data, and maintain the infrastructure (human and cyber) to support all of these activities. Tracking of specimen information throughout the research process is needed to provide appropriate attribution to the institutions and staff that have supplied and served the records. Such tracking may also allow for annotation and comment on particular records or collections by the global community. Detailed data tracking is also required for open, reproducible science. Despite growing recognition of the value of and need for thorough data tracking, both technical and sociological challenges continue to impede progress. In this talk, I will present a brief vision of how application of a DOI to each iteration of a data set in a typical research project could provide attribution to the provider, opportunity for comment and annotation of records, and the foundation for reproducible science based on natural history specimen records. Sociological change, such as journal requirements for data deposition of all iterations of a data set, can be accomplished using community meetings and workshops, along with editorial efforts, as were applied to DNA sequence data two decades ago.


2021 ◽  
Author(s):  
Leidys Murillo-Ramos ◽  
Pasi Sihvonen ◽  
Gunnar Brehm ◽  
Indiana Ríos-Malaver ◽  
Niklas Wahlberg

Molecular DNA sequence data allow unprecedented advances in biodiversity assessments, monitoring schemes, and taxonomic works, particularly in poorly explored areas. They allow, for instance, the sorting of material rapidly into operational taxonomic units (such as BINs - Barcode Index Numbers), sequences can be subject to diverse analyses, and with linked metadata and physical vouchers they can be examined further by experts. However, a prerequisite for their exploitation is the construction of reference libraries of DNA sequences that represent the existing biodiversity. To achieve these goals for Geometridae (Lepidoptera) moths in Colombia, expeditions were carried out to 26 localities in the northern part of the country in 2015–2019. The aim was to collect specimens and sequence their DNA barcodes, and to record a fraction of the species richness and occurrences in one of the most biodiversity-rich countries. These data are the beginnings of an identification guide to Colombian geometrid moths, whose identities are currently often provisional only, being morpho species or operational taxonomic units (OTUs). Prior to the current dataset, 99 Geometridae sequences forming 44 BINs from Colombia were publicly available on the Barcode of Life Data System (BOLD), covering 20 species only. We enrich the Colombian Geometridae database significantly by including DNA barcodes, two nuclear markers, photos of vouchers, and georeferenced occurrences of 281 specimens of geometrid moths from different localities. These specimens are classified into 80 genera. Analytical tools on BOLD clustered 157 of the mentioned sequences to existing species-level BINs, identified earlier by experts. Another 115 were assigned to BINs that were identified to genus or tribe level only. Eleven specimens did not match any existing BIN on BOLD, and are therefore new additions to the database. It is likely that many BINs represent undescribed species. 
Nine short sequences (<500 bp) were not assigned to BINs but were identified to the lowest taxonomic category by expert taxonomists and with comparisons of type material photos. The released new genetic information will help to further progress the systematics of Geometridae. An illustrated catalogue of all new records allows validation of our identifications; it is also the first document of this kind for Colombian Geometridae. All specimens are deposited at the Museo de Zoología of Universidad de Sucre (MZUS), North Colombia. DNA BINs are reported in this study; the species occurrences are available on SIB Colombia https://sibcolombia.net/ and the Global Biodiversity Information Facility (GBIF) https://www.gbif.org/ through https://doi.org/10.15472/ucfmkh.


2006 ◽  
Vol 38 (6) ◽  
pp. 577-585 ◽  
Author(s):  
Georg BRUNAUER ◽  
Armin HAGER ◽  
Wolf Dietrich KRAUTGARTNER ◽  
Roman TÜRK ◽  
Elfie STOCKER-WÖRGÖTTER

Culture experiments that trigger the axenically grown mycobionts of Lecanora rupicola to produce the polyketide chemosyndrome typical of the naturally grown lichen are reported. This chemosyndrome comprises lecanoric, haematommic and orsellinic acids, sordidone, eugenitol and atranorin, all of which were hardly produced under standard culture conditions. The only exception was arthothelin, which was present only in the voucher specimen. It has been shown that almost the complete acetyl-polymalonyl-pathway leading to depsides and chromones can be induced in culture, but apparently not the xanthones. The mycobiont was also successfully re-synthesized with its original photobiont, as confirmed by scanning electron microscope (SEM) studies. Cultures of the resynthesized lichen biosynthesized additional satellite substances, which were not detected either in the voucher specimens or in the aposymbiotically (without the photobiont) grown mycobiont cultures. The identity of cultured mycobionts of L. rupicola was confirmed by comparing ITS-DNA-sequence data from the original lichen with publicly available (GenBank) sequences of that species.


2020 ◽  
Vol 8 ◽  
Author(s):  
Sonia Ferreira ◽  
Rui Andrade ◽  
Ana Gonçalves ◽  
Pedro Sousa ◽  
Joana Paupério ◽  
...  

The InBIO Barcoding Initiative (IBI) Diptera 01 dataset contains records of 203 specimens of Diptera. All specimens have been morphologically identified to species level, and belong to 154 species in total. The species represented in this dataset correspond to about 10% of continental Portugal dipteran species diversity. All specimens were collected north of the Tagus river in Portugal. Sampling took place from 2014 to 2018, and specimens are deposited in the IBI collection at CIBIO, Research Center in Biodiversity and Genetic Resources. This dataset contributes to the knowledge on the DNA barcodes and distribution of 154 species of Diptera from Portugal and is the first of the planned IBI database public releases, which will make available genetic and distribution data for a series of taxa. All specimens have their DNA barcodes made publicly available in the Barcode of Life Data System (BOLD) online database and the distribution dataset can be freely accessed through the Global Biodiversity Information Facility (GBIF).


2018 ◽  
Vol 2 ◽  
pp. e26369
Author(s):  
Michael Trizna

As rapid advances in sequencing technology result in more branches of the tree of life being illuminated, there has actually been a decrease in the percentage of sequence records that are backed by voucher specimens (Trizna 2018b). The good news is that there are tools (Trizna 2017, NCBI 2005, Biocode LLC 2014) to enable well-databased museum vouchers to automatically validate and format specimen and collection metadata for high quality sequence records. Another problem is that there are millions of existing sequence records that are known to contain either incorrect or incomplete specimen data. I will show an end-to-end example of sequencing specimens from a museum, depositing their sequence records in NCBI's (National Center for Biotechnology Information) GenBank database, and then providing updates to GenBank as the museum database revises identifications. I will also talk about linking records from specimen databases. Over one million records in the Global Biodiversity Information Facility (GBIF) (Trizna 2018a) contain a value in the Darwin Core term "associatedSequences", and I will examine what is currently contained in these entries, and how best to format them to ensure that a tight connection is made to sequence records.


2021 ◽  
Vol 9 ◽  
Author(s):  
Sónia Ferreira ◽  
Pjotr Oosterbroek ◽  
Jaroslav Starý ◽  
Pedro Sousa ◽  
Vanessa Mata ◽  
...  

The InBIO Barcoding Initiative (IBI) Diptera 02 dataset contains records of 412 crane fly specimens belonging to the Diptera families Limoniidae, Pediciidae and Tipulidae. This dataset is the second release by IBI on Diptera and it greatly increases the knowledge on the DNA barcodes and distribution of crane flies from Portugal. All specimens were collected in Portugal, including six specimens from the Azores and Madeira archipelagos. Sampling took place from 2003 to 2019. Specimens have been morphologically identified to species level by taxonomists and belong to 83 species in total. The species represented in this dataset correspond to about 55% of all the crane fly species known from Portugal and 22% of crane fly species known from the Iberian Peninsula. All DNA extractions and most specimens are deposited in the IBI collection at CIBIO, Research Center in Biodiversity and Genetic Resources. Fifty-three species were new additions to the Barcode of Life Data System (BOLD), with another 18 species' barcodes added from under-represented species in BOLD. Furthermore, the submitted sequences were found to cluster in 88 BINs, 54 of which were new to BOLD. All specimens have their DNA barcodes publicly accessible through the BOLD online database, and their collection data can be accessed through the Global Biodiversity Information Facility (GBIF). One species, Gonomyia tenella (Limoniidae), is recorded for the first time from Portugal, raising the number of crane flies recorded in the country to 145 species.

