Putting your Finger upon the Simplest Data

2018 ◽  
Vol 2 ◽  
pp. e26300 ◽  
Author(s):  
Arturo Ariño

Over the past decades, digitization endeavors across many institutions holding natural history collections (NHCs) have multiplied with three broad aims: first, to facilitate collection management by moving existing analog catalogues into digital form; second, to efficiently document and inventory specimens in collections, including imaging them as taxonomical surrogates; and third, to enable discovery of, and access to, the resulting collection data. NHCs contain a unique wealth of potential knowledge in the form of primary biodiversity data records (PBR): at its most basic level, the “what, where and when” of occurrences of the specimens in the collections. But as T.S. Eliot famously said, “knowledge is invariably a matter of degree”. For such data to be transformed into digitally accessible knowledge (DAK) that is conducive to an understanding of how the natural world works, release of digitized data (the “this we know”) is necessary. At least two billion specimens are estimated to exist in NHCs already, but only a small fraction can properly be considered DAK: most have either not been digitized yet or not been released through a discovery facility. Digitizing is relatively costly, as it often entails manually processing each specimen unit (e.g. a herbarium sheet, a pinned insect, or a vial full of invertebrates). How long could it take us to transform all NHCs into DAK? Can we keep up with the natural growth in collections? The Global Biodiversity Information Facility (GBIF) has become the de facto main index of PBR, whether originating in NHCs or as field observations. Digitized NHCs that are standards-compliant and can be connected to, or harvested by, GBIF effectively become DAK. I have examined GBIF growth data looking for a pattern of DAK generation. I found that the rate of NHC-based PBR accrual is remarkably constant: the total DAK shows strongly linear growth, as opposed to the exponential growth exhibited by cumulative observation data.
Projecting the trend to the estimated holdings puts completion many decades ahead. In addition, digitized data appear to be taxonomically biased. Digitization efforts must therefore step up qualitatively in order to process the backlog, let alone newly acquired accessions, within one generation. Among several possible solutions, emerging industrial-scale mass-digitization techniques may help tackle this otherwise daunting task, but there is also a risk that DAK becomes even more uneven across taxon groups because of the narrow application specificity of such techniques, thus potentially biasing our knowledge of nature.
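
The projection described in this abstract can be sketched as a simple linear extrapolation. The snippet below is only an illustration under stated assumptions: the function names are hypothetical, and the numbers in the usage example are made-up placeholders, not GBIF figures or results from the study. Only the "at least two billion specimens" estimate comes from the abstract.

```python
def linear_rate(years, totals):
    """Least-squares slope of cumulative totals against time (records per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(totals) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, totals))
    return sxy / sxx


def years_to_complete(total_holdings, digitized_now, records_per_year):
    """Years until a linearly growing digitized total reaches total_holdings."""
    if records_per_year <= 0:
        raise ValueError("records_per_year must be positive")
    remaining = max(total_holdings - digitized_now, 0.0)
    return remaining / records_per_year


if __name__ == "__main__":
    # Placeholder series in millions of records; NOT real GBIF data.
    years = [0, 1, 2, 3]
    totals = [100.0, 110.0, 120.0, 130.0]
    rate = linear_rate(years, totals)
    # 2000.0 million = the "at least two billion" estimate from the abstract.
    print(years_to_complete(2000.0, totals[-1], rate))
```

Under a constant rate the remaining time is just the backlog divided by the yearly accrual, which is why a linear trend set against holdings in the billions pushes completion far into the future.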

2021 ◽  
Vol 9 ◽  
Author(s):  
Sónia Ferreira ◽  
Pjotr Oosterbroek ◽  
Jaroslav Starý ◽  
Pedro Sousa ◽  
Vanessa Mata ◽  
...  

The InBIO Barcoding Initiative (IBI) Diptera 02 dataset contains records of 412 crane fly specimens belonging to the Diptera families Limoniidae, Pediciidae and Tipulidae. This dataset is the second release by IBI on Diptera and it greatly increases the knowledge on the DNA barcodes and distribution of crane flies from Portugal. All specimens were collected in Portugal, including six specimens from the Azores and Madeira archipelagos. Sampling took place from 2003 to 2019. Specimens have been morphologically identified to species level by taxonomists and belong to 83 species in total. The species represented in this dataset correspond to about 55% of all the crane fly species known from Portugal and 22% of crane fly species known from the Iberian Peninsula. All DNA extractions and most specimens are deposited in the IBI collection at CIBIO, Research Center in Biodiversity and Genetic Resources. Fifty-three species were new additions to the Barcode of Life Data System (BOLD), with another 18 species' barcodes added from under-represented species in BOLD. Furthermore, the submitted sequences were found to cluster in 88 BINs, 54 of which were new to BOLD. All specimens have their DNA barcodes publicly accessible through the BOLD online database, and their collection data can be accessed through the Global Biodiversity Information Facility (GBIF). One species, Gonomyia tenella (Limoniidae), is recorded for the first time from Portugal, raising the number of crane flies recorded in the country to 145 species.


Author(s):  
Peter Desmet ◽  
Stijn Van Hoey ◽  
Lien Reyserhove ◽  
Dimitri Brosens ◽  
Damiano Oldoni ◽  
...  

The Research Institute for Nature and Forest (INBO) is co-managing three biologging networks as part of a terrestrial and freshwater observatory for LifeWatch Belgium. The networks are a GPS tracking network for large birds, an acoustic receiver network for fish, and a camera trap network for mammals. As part of our mission at the Open science lab for biodiversity, we are publishing the machine observations these networks generate as standardized, open data. One of the challenges, however, is finding the appropriate standards and platforms to do so. In this talk, we will present the three networks, the type of biologging data they collect and how we (plan to) standardize these to specific community standards and to Darwin Core (Wieczorek et al. 2012). Data from the bird tracking network were published in 2014 as one of the first biologging datasets on the Global Biodiversity Information Facility (GBIF) (Stienen et al. 2014). We are now planning to upload the data to Movebank instead and contribute to a generic mapping between the Movebank format and Darwin Core. Data from the acoustic receiver network are being mapped using the Darwin Core guidelines proposed by the Machine Observations Interest Group of Biodiversity Information Standards (TDWG). Images generated by the camera trap network are managed in the annotation system Agouti, for which we plan to export the data in the Camera Trap Metadata Language (Forrester et al. 2016). We also aim to write a software package to deposit camera trap images and data on Zenodo and map the observation data to Darwin Core. We hope that our work will contribute to discussions and guidelines on how best to map biologging data to Darwin Core, which is one of the aims of the Machine Observations Interest Group of Biodiversity Information Standards (TDWG).
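
To make the idea of mapping biologging data to Darwin Core more concrete, here is a minimal sketch of how a single GPS fix might be turned into an occurrence record. It is not INBO's actual mapping: the input field names (`tag_id`, `animal_id`, `species`, `timestamp`, `lat`, `lon`) and the `occurrenceID` scheme are assumptions made for illustration. The output keys are standard Darwin Core terms.

```python
from datetime import datetime, timezone


def gps_fix_to_darwin_core(fix: dict) -> dict:
    """Map one GPS fix (hypothetical input schema) to Darwin Core terms."""
    # Normalise to UTC; the timestamp is assumed to be timezone-aware.
    timestamp = fix["timestamp"].astimezone(timezone.utc)
    return {
        # Illustrative identifier scheme: tag id plus compact UTC timestamp.
        "occurrenceID": f'{fix["tag_id"]}:{timestamp.strftime("%Y%m%dT%H%M%SZ")}',
        # Sensor-derived records are machine observations in Darwin Core.
        "basisOfRecord": "MachineObservation",
        "organismID": fix["animal_id"],
        "scientificName": fix["species"],
        # ISO 8601 date-time in UTC.
        "eventDate": timestamp.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "decimalLatitude": round(fix["lat"], 6),
        "decimalLongitude": round(fix["lon"], 6),
        # GPS coordinates are assumed to be WGS84.
        "geodeticDatum": "EPSG:4326",
    }


if __name__ == "__main__":
    # Made-up sample values, not taken from any published dataset.
    sample = {
        "tag_id": "tag-001",
        "animal_id": "bird-A",
        "species": "Larus fuscus",
        "timestamp": datetime(2014, 6, 1, 12, 0, 0, tzinfo=timezone.utc),
        "lat": 51.123456789,
        "lon": 3.0,
    }
    print(gps_fix_to_darwin_core(sample))
```

A real mapping would also need to settle questions this sketch leaves out, such as how to group fixes into events and how much of the raw sensor data belongs in an occurrence record at all.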


Author(s):  
Elie Tobi ◽  
Geovanne Aymar Nziengui Djiembi ◽  
Anna Feistner ◽  
Donald Midoko Iponga ◽  
Jean Felicien Liwouwou ◽  
...  

Language is a major barrier for researchers wanting to digitize and publish collection data in Africa. Although French is the fifth most spoken language on Earth and the second most common in Africa, resources in French about digitization, data management, and publishing are lacking. Furthermore, French-speaking regions of Africa (primarily Central/West Africa and Madagascar) host some of the highest biodiversity on the continent and are therefore of great importance to scientists and decision-makers. Without representation in online portals like the Global Biodiversity Information Facility (GBIF) and Integrated Digitized Biocollections (iDigBio), these important collections are effectively invisible. Producing relevant and applicable resources about digitization in French will help shine a light on these valuable natural history records and allow the data-holders in Africa to retain the autonomy of their collections. Awarded a GBIF-BID (Biodiversity Information for Development) grant in 2021, an international, multilingual network of partners has undertaken the important task of digitizing and mobilizing Gabon’s vertebrate collections. There are an estimated 13,500 vertebrate specimens housed in five institutions in different parts of Gabon. To date, the group has mobilized >4,600 vertebrate records to our recently launched Gabon Biodiversity Portal (https://gabonbiota.org/). The portal also hosts French guides for using Symbiota-based portals to manage, georeference, and publish natural history databases. These resources can provide much-needed guidance for other Francophone countries, in Africa and beyond, working to maximize the accessibility and value of their biodiversity collections.


Author(s):  
Joachim Holstein ◽  
Christoph L. Häuser

The Global Biodiversity Information Facility (GBIF) was founded in spring 2001, after more than three years of preparatory work by the OECD Megascience Forum, with the aim of making scientific data and information on biodiversity freely available via the Internet and linking them for better use. As a worldwide research cooperation, GBIF is currently supported by 47 countries and 29 international organizations as members, all of which have committed to providing digital biodiversity data freely and according to common standards through their own data nodes, which they set up themselves. The international project is led by a governing board with representatives of all member countries and organizations, whose work is supported by several committees. The GBIF Secretariat, based in Copenhagen, Denmark, since 2002, is building the international GBIF portal (www.gbif.net) and coordinates and supports the activities of the individual members, which span four programme areas: standardization and linking of databases (DADI), digitization of data on collection objects (DIGIT), a catalogue of the known names of organisms (ECAT), and training and outreach (OCB). For the German participation in GBIF, seven data nodes were established at various research institutions with the support of the Federal Government (BMBF), each responsible for a different group of organisms: 1. insects (Invertebrates 1) at the Staatliches Museum für Naturkunde Stuttgart; 2. terrestrial invertebrates (Invertebrates 2) at the Zoologische Staatssammlung München; 3. marine invertebrates (Invertebrates 3) at the Forschungsinstitut und Naturmuseum Senckenberg in Frankfurt/Main; 4. vertebrates at the Zoologisches Forschungsinstitut und Museum Alexander Koenig in Bonn; 5. plants at the Botanischer Garten und Botanisches Museum Berlin; 6. fungi at the Botanische Staatssammlung München; 7. microorganisms at the Deutsche Sammlung für Mikroorganismen und Zellkulturen in Braunschweig. The different database programs used within the individual nodes to record collection data, which vary because of the nodes' differing subject focus, are briefly described.
Keywords: Biodiversity information, international cooperation, internet, database, collection data, GBIF node.


2021 ◽  
Vol 14 (3) ◽  
pp. 1-14
Author(s):  
Krishna Kumar Thirukokaranam Chandrasekar ◽  
Emile Deman ◽  
Steven Verstockt

As people spend on average only 20 seconds observing an artwork, they miss many of the informative details it contains. For example, few people are aware that 75 different plants can be found in the Ghent Altarpiece. In this article, we present a methodology, based on cross-collection linking, to create awareness about the botanical imagery in Van Eyck’s masterpiece and to inform people about their region’s plant richness and diversity over time. As such, this article is an example of how the interdisciplinary fields of cultural heritage and botany can go hand in hand to facilitate dissemination to the general public. The plants in the painting can be queried by their name or by a picture taken with a mobile device; a plant recognition app is used to evaluate the pictures taken of the plants. A study has also been performed to evaluate these apps and to select the most appropriate one for the collection of plants in the Ghent Altarpiece. Currently, we link the detected plants to herbaria, observation data, Global Biodiversity Information Facility plant information, and recent Wikimedia Commons pictures, but other links can also be easily integrated with the platform. Finally, we also studied present-day plant observations (volunteered geographic information) in more detail and reveal which region currently has most of Van Eyck’s plants/flowers.


Author(s):  
Katharine Barker ◽  
Jonas Astrin ◽  
Gabriele Droege ◽  
Jonathan Coddington ◽  
Ole Seberg

Most successful research programs depend on easily accessible and standardized research infrastructures. Until recently, access to tissue or DNA samples with standardized metadata and of sufficiently high quality has been a major bottleneck for genomic research. The Global Genome Biodiversity Network (GGBN) fills this critical gap by offering standardized, legal access to samples. Presently, GGBN’s core activity is enabling access to searchable DNA and tissue collections across natural history museums and botanic gardens. Activities are gradually being expanded to encompass all kinds of biodiversity biobanks such as culture collections, zoological gardens, aquaria, arboreta, and environmental biobanks. Broadly speaking, these collections all provide long-term storage and standardized public access to samples useful for molecular research. GGBN facilitates sample search and discovery for its distributed member collections through a single entry point. It stores standardized information on mostly geo-referenced, vouchered samples, their physical location, availability, quality, and the necessary legal information on over 50,000 species of Earth’s biodiversity, from unicellular to multicellular organisms. The GGBN Data Portal and the GGBN Data Standard are complementary to existing infrastructures such as the Global Biodiversity Information Facility (GBIF) and the International Nucleotide Sequence Database Collaboration (INSDC). Today, many well-known open-source collection management databases such as Arctos, Specify, and Symbiota are implementing the GGBN data standard. GGBN continues to increase its collections strategically, based on the needs of the research community, adding over 1.3 million online records in 2018 alone, and today two million sample records are available through GGBN.
Together with the Consortium of European Taxonomic Facilities (CETAF), the Society for the Preservation of Natural History Collections (SPNHC), Biodiversity Information Standards (TDWG), and Synthesis of Systematic Resources (SYNTHESYS+), GGBN provides best practices for biorepositories on meeting the requirements of the Nagoya Protocol on Access and Benefit Sharing (ABS). In collaboration with the Biodiversity Heritage Library (BHL), GGBN is exploring options for tagging publications that reference GGBN collections and associated specimens, made searchable through GGBN’s document library. Through its collaborative efforts, standards, and best practices, GGBN aims to facilitate trust and transparency in the use of genetic resources.

