Mapping the Impact of Digitisation for Poorly Documented Countries: Mozambique as a case study

Biodiversity Information Science and Standards ◽

10.3897/biss.3.37025 ◽

2019 ◽

Vol 3 ◽

Author(s):

Isabel Neves ◽

Maria da Luz Mathias ◽

Cristiane Bastos-Silveira

Keyword(s):

Data Cleaning ◽

Data Sources ◽

Data Driven ◽

Biodiversity Data ◽

Species Occurrence ◽

Terrestrial Mammals ◽

Management Actions ◽

Occurrence Data ◽

The Impact ◽

Conservation And Management

Despite the rise in the global availability of biodiversity data through digitisation, essential regions of the world remain poorly documented (Peterson et al. 2015). Research-neglected regions that lack quality information are mainly the species-rich, developing nations (Gaikwad and Chavan 2006). Mozambique is an African country without wide-ranging knowledge of its fauna’s diversity and distribution (Neves et al. 2018). Undeniably, this country's knowledge gaps constitute a significant impediment to the improvement of conservation measures. Primary species occurrence data across dispersed data sources can be a cost-effective resource for boosting knowledge about a country’s biodiversity. Aiming to aggregate a comprehensive dataset of Mozambique’s terrestrial mammals, we compiled primary species occurrence data from dispersed data sources. The produced dataset not only gathered digitalised accessible knowledge (DAK) from the Global Biodiversity Information Facility (GBIF) and natural history collections, but also retrieved and digitalised species occurrence data enclosed in grey and scientific literature. Particularly for poorly documented countries, filling data gaps is crucial for new and broad insights for biodiversity research and preservation. Thus, quantification of the effects of data digitisation and mobilisation goes beyond the specific goals of organisations, institutions or data-sharing resources. The impact of data digitisation should be disseminated, not only by the number of publications and times data are accessed (Nelson and Ellis 2018), but also by the actual achievements in regions covered by DAK. To highlight the impact of further data digitisation in a poorly documented country, we examine the effective gain of further digitisation and data cleaning on the terrestrial mammals from Mozambique.
We demonstrate the increase in the overall knowledge, not merely in terms of number of species, number of records, and country coverage, but also through the production of outputs with potential value for data-driven conservation research and planning. More than 17000 records were compiled. The digitisation of data in literature, as well as data cleaning and quality improvements, resulted in a substantial increase in the amount of DAK, which reflects Mozambique’s high species diversity (Fig. 1). The digitisation and data mobilisation hereby described allowed for the update of the country’s terrestrial mammals checklist (Neves et al. 2018). The final dataset also expands the knowledge of the most poorly documented provinces, allowing generation of a data-driven proposal of priority areas to survey (in review). Also, an assessment of Mozambique’s conservation network effectiveness for mammal protection was performed, and additional relevant areas were suggested (in prep.). The dataset compiled is an important "stepping stone" towards an enhanced knowledge of Mozambique’s fauna. Biodiversity conservation and management are challenging in developing countries rich in natural resources, which often must deal with a lack of internal capacity for applied research and conservation actions. Considering that digitisation and mobilisation of biodiversity data are resourceful processes for improving knowledge, collaborative work between institutions of those countries and international data-provider communities could, in the short term, successfully improve the information baseline to support decision-making in future conservation and management actions.

Mapping Knowledge Gaps of Mozambique’s Terrestrial Mammals

Scientific Reports ◽

10.1038/s41598-019-54590-4 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 1

Author(s):

Isabel Queirós Neves ◽

Maria da Luz Mathias ◽

Cristiane Bastos-Silveira

Keyword(s):

Species Conservation ◽

Cost Effective ◽

Management Plan ◽

Knowledge Gap ◽

Species Occurrence ◽

Knowledge Gaps ◽

Mammal Species ◽

Terrestrial Mammals ◽

Occurrence Data ◽

The Impact

Abstract A valuable strategy to support conservation planning is to assess knowledge gaps regarding primary species occurrence data to identify and select areas for future biodiversity surveys. Currently, increasing accessibility to these data allows a cost-effective method for boosting knowledge about a country’s biodiversity. For understudied countries, where the lack of resources for conservation is more pronounced, resorting to primary biodiversity data can be especially beneficial. Here, using a primary species occurrence dataset, we assessed and mapped Mozambique’s knowledge gaps regarding terrestrial mammal species by identifying areas that are geographically distant and environmentally different from well-known sites. By comparing gaps from old and recent primary species occurrence data, we identified: (i) gaps of knowledge over time, (ii) the lesser-known taxa, and (iii) areas with potential for spatiotemporal studies. Our results show that the inventory of Mozambique’s mammal fauna is near-complete in less than 5% of the territory, with broad areas of the country poorly sampled or not sampled at all. The knowledge gap areas are mostly associated with two ecoregions. The provinces lacking documentation coincide with areas over-explored for natural resources, and many such sites may never be documented. It is our understanding that prioritising the survey of the knowledge-gap areas will likely produce new records for the country, and that continuing the study of the well-known regions will guarantee their potential use for spatiotemporal studies. The implemented approach to assess the knowledge gaps from primary species occurrence data proved to be a powerful strategy to generate information that is essential to species conservation and management plans.
However, we are aware that the impact of digital and openly available data depends mostly on its completeness and accuracy, and thus we encourage action from the scientific community and government authorities to support and promote data mobilisation.

Multi-species occurrence models to evaluate the effects of conservation and management actions

Biological Conservation ◽

10.1016/j.biocon.2009.11.016 ◽

2010 ◽

Vol 143 (2) ◽

pp. 479-484 ◽

Cited By ~ 147

Author(s):

Elise F. Zipkin ◽

J. Andrew Royle ◽

Deanna K. Dawson ◽

Scott Bates

Keyword(s):

Species Occurrence ◽

Management Actions ◽

Conservation And Management

Phylogeny Based Biodiversity Data Queries

Biodiversity Information Science and Standards ◽

10.3897/biss.2.25589 ◽

2018 ◽

Vol 2 ◽

pp. e25589

Author(s):

Scott Chamberlain

Keyword(s):

Research Work ◽

Sister Group ◽

Data Sources ◽

Work Flow ◽

Use Case ◽

Biodiversity Data ◽

R Software ◽

Occurrence Data ◽

Simple Query ◽

The Ideal

There is a large amount of publicly available biodiversity data from many different data sources. When doing research, one ideally interacts with biodiversity data programmatically so that the work is reproducible. The entry point to biodiversity data records is largely through taxonomic names, or common names in some cases (e.g., birds). However, many researchers have a phylogeny-focused project, meaning taxonomic names are not the ideal interface to biodiversity data. Ideally, it would be simple to programmatically go from a phylogeny to biodiversity records through a phylogeny-based query. I'll discuss a new project `phylodiv` (https://github.com/ropensci/phylodiv/) that attempts to facilitate phylogeny-based biodiversity data collection (see Fig. 1). The project takes the form of an R software package. The idea is to make the user interface take essentially two inputs: a phylogeny and a phylogeny-based question. Behind the scenes we'll do many things, including gathering taxonomic names and hierarchies for the taxa in the phylogeny, sending queries to GBIF (or other data sources), and mapping the results. The user will of course have control over the behind-the-scenes parts, but I imagine the majority use case will be to input a phylogeny and a question and expect an answer back. We already have R tools to do nearly all parts of the work-flow shown above: there are a large number of phylogeny tools, `taxize`/`taxizedb` can handle taxonomic name collection, `rgbif` can handle interaction with GBIF, and there are many mapping options in R. There are a few areas that still need work, however. First, there's not yet a clear way to do a phylogeny-based query. Ideally a user will be able to express a simple query like "taxon A vs. its sister group". That's simple to imagine, but implementing it in software is another thing.
Second, users ideally would like answers back - in this case a map of occurrences - relatively quickly to be able to iterate on their research work-flow. The most likely solution to this will be to use GBIF's map tile service to visualize binned occurrence data, but we'll need to explore this in detail to make sure it works.
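
The "taxon A vs. its sister group" query mentioned above can be illustrated with a minimal Python sketch. This is not `phylodiv` code (which is an R package and whose query interface the abstract describes as still undecided); the nested-tuple tree representation and the function names `tips` and `sister_group` are assumptions made only for illustration. The sketch finds the tip names in the sister group of a given tip, which are the names one would then hand to a taxonomic-name and occurrence lookup.

```python
# Hypothetical sketch: a phylogeny is modelled as nested tuples whose
# leaves are tip names (strings). This is an illustrative assumption,
# not the data structure used by phylodiv.

def tips(node):
    """Return all tip names below a node."""
    if isinstance(node, str):
        return [node]
    names = []
    for child in node:
        names.extend(tips(child))
    return names


def sister_group(tree, taxon):
    """Return the sorted tip names of the sister group of tip `taxon`.

    Returns None when `taxon` is not a tip of `tree`.
    """
    if isinstance(tree, str):
        return None
    # Is the taxon a direct child of this node? Then its siblings
    # (and everything below them) form the sister group.
    for i, child in enumerate(tree):
        if child == taxon:
            siblings = [c for j, c in enumerate(tree) if j != i]
            return sorted(t for s in siblings for t in tips(s))
    # Otherwise search deeper in the tree.
    for child in tree:
        found = sister_group(child, taxon)
        if found is not None:
            return found
    return None


# Toy phylogeny with placeholder tip names.
example_tree = (("A", "B"), ("C", ("D", "E")))
```

With this toy tree, `sister_group(example_tree, "C")` yields the tips of the clade containing D and E. Handling internal nodes, branch lengths, or polytomies would need a real phylogeny library.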

speciesgeocodeR: An R package for linking species occurrences, user-defined regions and phylogenetic trees for biogeography, ecology and evolution

10.1101/032755 ◽

2015 ◽

Cited By ~ 6

Author(s):

Alexander Zizka ◽

Alexandre Antonelli

Keyword(s):

Data Quality ◽

Phylogenetic Trees ◽

Large Scale ◽

Data Cleaning ◽

R Package ◽

Species Occurrence ◽

Occurrence Data ◽

User Friendly ◽

Species Occurrences

1. Large-scale species occurrence data from geo-referenced observations and collected specimens are crucial for analyses in ecology, evolution and biogeography. Despite the rapidly growing availability of such data, their use in evolutionary analyses is often hampered by tedious manual classification of point occurrences into operational areas, leading to a lack of reproducibility and concerns regarding data quality. 2. Here we present speciesgeocodeR, a user-friendly R package for data cleaning, data exploration and data visualization of species point occurrences using discrete operational areas, and linking them to analyses invoking phylogenetic trees. 3. The three core functions of the package are 1) automated and reproducible data cleaning, 2) rapid and reproducible classification of point occurrences into discrete operational areas in an adequate format for subsequent biogeographic analyses, and 3) a comprehensive summary and visualization of species distributions to explore large datasets and ensure data quality. In addition, speciesgeocodeR facilitates the access and analysis of publicly available species occurrence data, widely used operational areas and elevation ranges. Other functionalities include the implementation of minimum occurrence thresholds and the visualization of coexistence patterns and range sizes. speciesgeocodeR is accompanied by a richly illustrated and easy-to-follow tutorial and help functions.
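
The classification of point occurrences into discrete operational areas described above can be sketched in a few lines of Python. This is not the speciesgeocodeR API (an R package); it is a generic ray-casting point-in-polygon test, and the function names, the `(species, lon, lat)` tuple layout and the polygon format are all assumptions made for illustration. Points that fall exactly on a polygon boundary are not treated specially.

```python
# Hypothetical sketch of classifying occurrences into user-defined areas.
# Polygons are lists of (lon, lat) vertices; all names are illustrative.

def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: count edge crossings of a ray going east."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def classify_occurrences(occurrences, areas):
    """Count occurrences per species and area.

    occurrences: iterable of (species, lon, lat)
    areas: dict mapping area name -> polygon
    Returns {species: {area_name: count}}; points outside every area
    are ignored.
    """
    table = {}
    for species, lon, lat in occurrences:
        for area_name, polygon in areas.items():
            if point_in_polygon(lon, lat, polygon):
                counts = table.setdefault(species, {})
                counts[area_name] = counts.get(area_name, 0) + 1
    return table


# Two adjacent square areas and three toy records.
areas = {
    "area_1": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "area_2": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
records = [("sp1", 5, 5), ("sp1", 15, 5), ("sp2", 25, 5)]
result = classify_occurrences(records, areas)
```

A species-by-area count table of this kind is the sort of input that downstream biogeographic analyses typically expect; a minimum occurrence threshold could be applied by filtering the counts.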

Turks and Caicos rock iguana (Cyclura carinata): Conservation and management plan 2020–2024

10.2305/iucn.ch.2021.10.en ◽

2021 ◽

Keyword(s):

Management Plan ◽

Iucn Red List ◽

Tourism Industry ◽

The Bahamas ◽

Term Survival ◽

Management Actions ◽

In The Wild ◽

Cyclura Carinata ◽

The Impact ◽

Conservation And Management

The Endangered Turks and Caicos rock iguana, Cyclura carinata, is found only on the islands and cays of Turks and Caicos Islands (TCI), and on Booby Cay in The Bahamas, northwest of Providenciales. These iguanas now occupy less than 10 percent of their historic range largely due to the impact of invasive mammalian predators. Although conservation efforts have led to stabilisation of the population resulting in the 2020 down-listing of this species from Critically Endangered to Endangered on the IUCN Red List of Threatened Species, threats persist and management efforts are needed. This document presents a comprehensive four-year plan for the conservation and management actions considered essential to ensuring the long-term survival of Cyclura carinata in the wild. This document combines knowledge and expertise from local government, local and international NGOs, the tourism industry, educators, homeowners, private island managers, civil society, and members of the IUCN SSC Iguana Specialist Group working in the TCI.

Using mobile big data to support emergency preparedness and address economically vulnerable communities during the COVID-19 pandemic in Nigeria

Data & Policy ◽

10.1017/dap.2021.12 ◽

2021 ◽

Vol 3 ◽

Author(s):

Joanne Gilbert ◽

Olubayo Adekanmbi ◽

Charlie Harrison

Keyword(s):

Big Data ◽

Resource Planning ◽

Data Sources ◽

Data Driven ◽

Second Phase ◽

Case Scenario ◽

Worst Case ◽

Mobile Big Data ◽

The Government ◽

The Impact

Abstract With the declaration of the coronavirus disease 2019 (COVID-19) pandemic in Nigeria in 2020, the Nigeria Governors’ Forum (NGF) instigated a collaboration with MTN Nigeria to develop data-driven insights, using mobile big data (MBD) and other data sources, to shape the planning and response to the pandemic. First, a model was developed to predict the worst-case scenario for infections in each state. This was used to support state-level health committees to make local resource planning decisions. Next, as containment interventions resulted in subsistence/daily paid workers losing their income and ability to buy essential food supplies, NGF and MTN agreed a second phase of activity, to develop insights to understand the population clusters at greatest socioeconomic risk from the impact of the pandemic. This insight was used to promote available financial relief to the economically vulnerable population clusters in Lagos state via the HelpNow crowdfunding initiative. This article discusses how anonymized and aggregated mobile network data (MBD), combined with other data sources, were used to create valuable insights and inform the government, and private business, response to the pandemic in Nigeria. Finally, we discuss lessons learnt. Firstly, how a collaboration with, and support from, the regulator enabled MTN to deliver critical insights at a national scale. Secondly, how the Nigeria Data Protection Regulation and the GSMA COVID-19 Privacy Guidelines provided an initial framework to open the discussion and define the approach. Thirdly, why stakeholder management is critical to the understanding, and application, of insights. Fourthly, how existing relationships ease new project collaborations. Finally, how MTN is developing future preparedness by creating a team that is focused on developing data-driven insights for social good.

SpOccSum: An easy-to-use Python tool to summarize species occurrence data from material examined lists in taxonomic revisions

Biodiversity Information Science and Standards ◽

10.3897/biss.3.36513 ◽

2019 ◽

Vol 3 ◽

Author(s):

Michael Trizna ◽

Torsten Dikow

Keyword(s):

Data Science ◽

Biodiversity Data ◽

Species Occurrence ◽

Global Biodiversity Information Facility ◽

Seasonal Incidence ◽

Distribution Maps ◽

Occurrence Data ◽

Darwin Core ◽

Biodiversity Information ◽

Northern And Southern Hemispheres

Taxonomic revisions contain crucial biodiversity data in the material examined sections for each species. In entomology, material examined lists minimally include the collecting locality, date of collection, and the number of specimens of each collection event. Insect species might be represented in taxonomic revisions by only a single specimen or hundreds to thousands of specimens. Furthermore, revisions of insect genera might treat small genera with few species or include tens to hundreds of species. Summarizing data from such large and complex material examined lists and revisions is cumbersome, time-consuming, and prone to errors. However, providing data on the seasonal incidence, abundance, and collecting period of species is an important way to mobilize primary biodiversity data to understand a species' occurrence or rarity. Here, we present SpOccSum (Species Occurrence Summary), a tool to easily obtain metrics of seasonal incidence from specimen occurrence data in taxonomic revisions. SpOccSum is written in Python (Python Software Foundation 2019) and accessible through the Anaconda Python/R Data Science Platform as a Jupyter Notebook (Kluyver et al. 2016). The tool takes a simple list of specimen data containing species name, locality, date of collection (preferably separated by day, month, and year), and number of specimens in CSV format and generates a series of tables and graphs summarizing: number of specimens per species, number of specimens collected per month, number of unique collection events, as well as the earliest and most recent collecting year of each species. The results can be exported as graphics or as CSV-formatted tables and can easily be included in manuscripts for publication.
An example of an early version of the summary produced by SpOccSum can be viewed in Tables 1, 2 from Markee and Dikow (2018). To accommodate seasonality in the Northern and Southern Hemispheres, users can choose to start the data display with either January or July. When geographic coordinates are available and species have widespread distributions spanning, for example, the equator, the user can itemize particular regions such as North of Tropic of Cancer (23.5˚N), Tropic of Cancer to the Equator, Equator to Tropic of Capricorn, and South of Tropic of Capricorn (23.5˚S). Other features currently in development include the ability to produce distribution maps from the provided data (when geographic coordinates are included) and the option to export specimen occurrence data as a Darwin-Core Archive ready for upload to the Global Biodiversity Information Facility (GBIF).
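
The summary metrics listed above can be sketched with the Python standard library alone. This is not SpOccSum's actual code; the column names (`species`, `locality`, `day`, `month`, `year`, `count`), the function name `summarize` and the definition of a collection event as a unique locality-and-date combination are assumptions made for illustration. The `start_month` parameter mirrors the described choice of starting the monthly display in January or July.

```python
import csv
import io
from collections import Counter, defaultdict


def summarize(csv_text, start_month=1):
    """Summarize specimen records (hypothetical column names)."""
    per_species = Counter()      # specimens per species
    per_month = Counter()        # specimens per calendar month
    events = defaultdict(set)    # unique (locality, date) per species
    years = defaultdict(list)    # collecting years per species

    for row in csv.DictReader(io.StringIO(csv_text)):
        n = int(row["count"])
        species = row["species"]
        per_species[species] += n
        per_month[int(row["month"])] += n
        events[species].add(
            (row["locality"], row["year"], row["month"], row["day"])
        )
        years[species].append(int(row["year"]))

    # Month order for display, e.g. 7, 8, ..., 12, 1, ..., 6 for July.
    order = [(start_month - 1 + i) % 12 + 1 for i in range(12)]
    return {
        "specimens_per_species": dict(per_species),
        "specimens_per_month": [(m, per_month[m]) for m in order],
        "collection_events": {s: len(e) for s, e in events.items()},
        "year_range": {s: (min(y), max(y)) for s, y in years.items()},
    }


# Toy input with placeholder species and localities.
sample = """species,locality,day,month,year,count
Genus alpha,Site 1,3,1,1950,2
Genus alpha,Site 1,3,1,1950,1
Genus alpha,Site 2,10,7,2001,4
Genus beta,Site 3,5,7,1988,1
"""
summary = summarize(sample, start_month=7)
```

Here the two January 1950 rows from Site 1 count as one collection event, so `Genus alpha` has seven specimens from two events.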

Big(ger) data as better data in open distance learning

The International Review of Research in Open and Distributed Learning ◽

10.19173/irrodl.v16i1.1948 ◽

2015 ◽

Vol 16 (1) ◽

Cited By ~ 21

Author(s):

Paul Prinsloo ◽

Elizabeth Archer ◽

Glen Barnes ◽

Yuraisha Chetty ◽

Dion Van Zyl

Keyword(s):

Big Data ◽

Data Sources ◽

Data Driven ◽

Descriptive Case Study ◽

Student Data ◽

Current State ◽

The University ◽

The Impact ◽

Dominant Paradigm

In the context of the hype, promise and perils of Big Data and the currently dominant paradigm of data-driven decision-making, it is important to critically engage with the potential of Big Data for higher education. We do not question the potential of Big Data, but we do raise a number of issues, and present a number of theses to be seriously considered in realising this potential. The University of South Africa (Unisa) is one of the mega ODL institutions in the world with more than 360,000 students and a range of courses and programmes. Unisa already has access to a staggering amount of student data, hosted in disparate sources, and governed by different processes. As the university moves to mainstreaming online learning, the amount of and need for analyses of data are increasing, raising important questions regarding our assumptions, understanding, data sources, systems and processes. This article presents a descriptive case study of the current state of student data at Unisa, as well as explores the impact of existing data sources and analytic approaches. From the analysis it is clear that in order for big(ger) data to be better data, a number of issues need to be addressed. The article concludes by presenting a number of theses that should form the basis for the imperative to optimise the harvesting, analysis and use of student data.

Sharing the Decision Process Framework to Identify Well-supported Records of Mammal Species-occurrence in Mozambique

Biodiversity Information Science and Standards ◽

10.3897/biss.3.35265 ◽

2019 ◽

Vol 3 ◽

Author(s):

Isabel Neves ◽

Maria da Luz Mathias ◽

Cristiane Bastos-Silveira

Keyword(s):

Selection Process ◽

Grey Literature ◽

Species Selection ◽

Digital Data ◽

Biodiversity Data ◽

Species Occurrence ◽

Mammal Species ◽

Terrestrial Mammals ◽

Single Author ◽

Species Checklist

Conservation research and policies tend to be significantly restricted wherever relevant data on biodiversity is sparse, scattered or non-curated. Thus, the usefulness of occurrence data for the study of biodiversity depends not only on availability but also on data quality. Notwithstanding the increase in the global availability of primary biodiversity data, they have numerous shortfalls, from incomplete or partially erroneous documentation to spatial and temporal biases (Hortal et al. 2015, Aubry et al. 2017). Also, many non-digitized specimen collections, scientific publications and grey literature are locked as printed or digital publications. We integrated existing knowledge from dispersed sources of biodiversity data, namely the Global Biodiversity Information Facility (GBIF), natural history collections, wildlife survey reports, species checklists and other scientific literature. This procedure allowed an update of Mozambique’s checklist of terrestrial mammals (Neves et al. 2018). Despite the potential of digital data to overcome gaps of knowledge, a relevant constraint on creating or updating species checklists is the difficulty of accessing spatially dispersed collections and examining every specimen upon which occurrences are based. To partly overcome this impediment, we developed a species selection process for specimen data from GBIF and museums (Fig. 1). The aim was to categorise the species detected in more than one data source as species with well-supported occurrence. In addition to the number of collectors, we also accounted for the number of records collected and presented in Smithers and Tello (1976), the last checklist produced for Mozambique’s mammals. A species-occurrence record was considered well-supported and included in the species checklist when it was: independently recorded by different collectors or recorded by a single collector but listed in Smithers and Tello (1976).
An additional list was produced which contained species with questionable occurrence in the country. Species entered this "questionable occurrence" list when they were: not listed in Smithers and Tello (1976), and a single record supported their presence in the country; not listed in Smithers and Tello (1976) and multiple records exist, but were all cited by a single author; or registered with a single record in Smithers and Tello (1976). We compiled more than 17000 records, resulting in a total of 217 species (14 orders, 39 families and 133 genera) with supported occurrence in Mozambique and 23 species with questionable reported occurrence (Table 1). The proposed approach for species selection can be adapted and function as a powerful tool to update species checklists of countries facing a similar lack of knowledge regarding their biodiversity. The capacity to pinpoint species and specimens in need of occurrence and taxonomic re-evaluation is of great value to optimise the study of collections and to boost collaboration between curators and researchers. Lastly, considering that most records integrated are from European and North American institutions, this work would significantly improve with the integration of data from African institutions. Therefore, an effort should be made to make these essential collections accessible online.
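
The selection rules described above can be written down as a small decision function. The following Python sketch is one possible reading of those rules, not the authors' implementation: the field names, the `classify` function and the species labels are hypothetical, and the order in which the rules are checked is an assumption, since the abstract does not say how a species listed in Smithers and Tello (1976) with a single record but recorded by several independent collectors should be treated.

```python
from dataclasses import dataclass


@dataclass
class SpeciesEvidence:
    """Evidence for one species (hypothetical field names)."""
    name: str
    distinct_collectors: int  # independent collectors/authors, all sources
    checklist_records: int    # records in Smithers and Tello (1976); 0 = not listed


def classify(ev: SpeciesEvidence) -> str:
    """Return 'well-supported' or 'questionable' for a species."""
    # Assumption: independent records by different collectors take
    # precedence over the other rules.
    if ev.distinct_collectors >= 2:
        return "well-supported"
    # Single collector, but listed in the 1976 checklist with more
    # than one record there.
    if ev.checklist_records >= 2:
        return "well-supported"
    # Remaining cases: not listed and supported by one record or one
    # author only, or listed in the checklist with a single record.
    return "questionable"


# Placeholder examples, one per rule.
examples = [
    SpeciesEvidence("Species A", distinct_collectors=3, checklist_records=0),
    SpeciesEvidence("Species B", distinct_collectors=1, checklist_records=5),
    SpeciesEvidence("Species C", distinct_collectors=1, checklist_records=0),
    SpeciesEvidence("Species D", distinct_collectors=1, checklist_records=1),
]
labels = {ev.name: classify(ev) for ev in examples}
```

Encoding the rules this way makes the decision process repeatable when new records are added, and the "questionable" output doubles as a list of species whose specimens need re-examination.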

Species occurrence data from the Range-Wide Bull Trout eDNA Project

Forest Service Research Data Archive ◽

10.2737/rds-2017-0038 ◽

2017 ◽

Cited By ~ 1

Author(s):

Michael K. Young ◽

Daniel J. Isaak ◽

Kevin S. McKelvey ◽

Michael K. Schwartz ◽

Kellie J. Carim ◽

...

Keyword(s):

Bull Trout ◽

Species Occurrence ◽

Occurrence Data
