‘As Open as Possible, as Closed as Necessary’ – Managing legal and owner-defined restrictions to openness of biodiversity data

Author(s):  
Kari Lahti ◽  
Leif Schulman ◽  
Esko Piirainen ◽  
Ville-Matti Riihikoski ◽  
Aino Juslén

The Finnish Biodiversity Information Facility (FinBIF) receives, stores and manages biodiversity data mobilised in Finland, and shares the data through its own portal (species.fi) and through the Global Biodiversity Information Facility (GBIF). FinBIF’s data policy embraces the European FAIR data principles (Findable, Accessible, Interoperable, Reusable; Wilkinson et al. 2016) but also incorporates specific restrictions stemming from national legislation, researchers’ needs, and data owners’ requirements. Here, we describe how the necessary restrictions to openness, which arise for reasons ranging from the sensitivity of the data to research embargoes, have been defined and implemented at the policy level and in technical data infrastructure solutions. We hope to contribute to an improvement of data management in the international biodiversity data infrastructures. In Finland, the law prohibits public authorities from distributing occurrence data if this increases the threat to endangered species. However, neither a definition of ‘endangered species’ nor guidelines for evaluating the potential risk posed by data openness have been formulated. To enable mobilisation of datasets containing information on endangered species, FinBIF convened a task force commissioned to set rules on data distribution that respect the spirit of the law. The task force consisted of representatives of the relevant data-holding authorities, and it consulted a wide group of taxon experts and the species information community. First, a list of species judged to be among those targeted by the spirit of the law was created (sensitive species data). Then the rules of restriction were decided for each species. Measures of restriction ranged from complete non-disclosure of data to temporal and spatial restrictions. 
The identified safeguards concerning sensitive data management in all use cases led us to create a series of innovative solutions. Researchers often wish to restrict the openness of data they have gathered for research purposes. These restrictions include embargo periods, limitations on the precision of data and controls on how the data are used. Often, however, researchers are willing to allow unrestricted official use of their data, for example for conservation management or land use planning. In these cases they will often allow storage and restricted use of exact data without an embargo. The same may be true for other data owners, such as nongovernmental organisations (NGOs) or private citizens. To support restrictions to openness, while simultaneously securing mobilisation of valuable datasets, FinBIF applies data sharing contracts including, as a rule, a precondition to share the original data with the public authorities for official use under the Creative Commons Attribution 4.0 licence (CC BY 4.0). The technical solution that enables this rather complex data policy is that FinBIF stores the collated data in two separate data warehouses: a public one for the distribution of fully open data and of temporally and spatially coarsened sensitive data, and another that contains all data but is accessible only to authorised users. In addition, to allow case-by-case release of restricted data, FinBIF has developed a data request function (Fig. 1). When users of the open data retrieve a dataset using, e.g., taxonomic and spatial filtering, they receive a search result stating whether restricted data are available for the filters used. In these cases a user can issue a data request, which is automatically distributed to all owners of data contained in the collated data batch. Agreeing on the principles for applying restrictions to data openness, and on how to define authoritative use, has not been easy given the lack of precedents. 
It has required thorough and inclusive consultation with state administration, conservation practitioners, scientific specialists and lawyers. The two main cultural constraints to overcome have been (1) embracing the FAIR principles of truly ‘as open as possible’ and only ‘as closed as (absolutely) necessary’ (European Commission 2016); and, perhaps surprisingly, (2) figuring out novel ways to work across different state administrative sectors to share data.
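
The two-warehouse arrangement described in this abstract can be illustrated with a minimal Python sketch. It is not FinBIF's implementation: the record fields, the 0.1 degree grid cell, the year-only temporal coarsening, and all function names are assumptions chosen purely for illustration.

```python
import math
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Occurrence:
    taxon: str
    lat: float
    lon: float
    observed: date
    sensitive: bool = False               # assumed flag: taxon is on the sensitive list
    embargo_until: Optional[date] = None  # assumed owner-defined research embargo


def coarsen(value: float, cell_deg: float = 0.1) -> float:
    """Snap a coordinate to the lower edge of a grid cell (hypothetical 0.1 degree cell)."""
    return round(math.floor(value / cell_deg) * cell_deg, 6)


def public_view(rec: Occurrence, today: date) -> Optional[Occurrence]:
    """Return the version of a record suitable for the public warehouse, or None."""
    if rec.embargo_until is not None and today < rec.embargo_until:
        return None  # withheld from the public warehouse until the embargo ends
    if rec.sensitive:
        # Spatial coarsening plus temporal coarsening (keep the year only).
        return replace(
            rec,
            lat=coarsen(rec.lat),
            lon=coarsen(rec.lon),
            observed=date(rec.observed.year, 1, 1),
        )
    return rec


def load(records, today):
    """Split records into a public warehouse and a restricted one holding everything."""
    restricted = list(records)
    public = [v for v in (public_view(r, today) for r in records) if v is not None]
    return public, restricted
```

In this sketch the restricted warehouse always keeps the exact records, so an authorised user, or a data request approved by the owners, could be served from it without altering the public copy.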

Author(s):  
Nora Escribano ◽  
David Galicia ◽  
Arturo H. Ariño

Building on the development of Biodiversity Informatics, the Global Biodiversity Information Facility (GBIF) undertook the task of enabling access to the world’s wealth of biodiversity data via the Internet. To date, GBIF has become, in many respects, the most extensive biodiversity information exchange infrastructure in the world, opening up a full range of possibilities for science. Science has benefited from such access to biodiversity data in research areas ranging from the effects of environmental change on biodiversity to the spread of invasive species, among many others. As of this writing, more than 7,000 published items (scientific papers, reviews, conference proceedings) have been indexed in the GBIF Secretariat’s literature tracking programme. On the basis of this database, we will represent trends in GBIF users’ behaviour over time regarding openness, social structure, and other features associated with such scientific production: what is the measurable impact of research using GBIF data? How is the GBIF community of users growing? Is the science made with, and enabled by, open data actually open? Mapping GBIF users’ choices will show how biodiversity research is evolving through time, synthesising past and current priorities of this community in an attempt to forecast whether summer, or winter, is coming.


Author(s):  
Nils Walravens ◽  
Pieter Ballon ◽  
Mathias Van Compernolle ◽  
Koen Borghys

As part of the rhetoric surrounding the Smart City concept, cities are increasingly facing challenges related to data (management, governance, processing, storage, publishing etc.). The growing power acquired by the data market and the great relevance assigned to data ownership rather than to data-exploitation know-how are affecting the development of a data culture and slowing down the embedding of data-related expertise inside public administrations. Concurrently, policies call for more open data to foster service innovation and government transparency. What are the consequences of these phenomena when imagining the potential for policy making consequent to the growing data quantity and availability? Which strategic challenges and decisions do public authorities face in this regard? What are valuable approaches to arm public administrations in this “war on data”? The Smart Flanders program was initiated by the Flemish Government (Belgium) in 2017 to research and support cities with defining and implementing a common open data policy. As part of the program, a “maturity check” was performed, evaluating the cities on several quantitative and qualitative parameters. This exercise laid bare some challenges in the field of open data and led to a checklist that cities can employ to begin tackling them, as well as a set of model clauses to be used in the procurement of new technologies.


2005 ◽  
Vol 4 (2) ◽  
pp. 393-400
Author(s):  
Pallavali Radha ◽  
G. Sireesha

A data distributor gives sensitive data to a set of presumably trusted third-party agents. Some of the data sent to these third parties later turn up in unauthorized places, such as the web or someone's system, because of data leakage. The distributor must assess whether the data were leaked by one or more agents, as opposed to having been independently gathered by other means. Our proposed data allocation strategies improve the probability of identifying leakages. Security attacks typically result from unintended behaviors or invalid inputs. Because real-world programs accept too many invalid inputs, security testing is labor intensive. It is therefore desirable to automate, or partially automate, the security-testing process. In this paper we present a Predicate/Transition nets approach for the automated generation of security tests from formal threat models, and we detect the agents using allocation strategies that do not modify the original data. The guilty agent is the one who leaks the distributed data. To detect guilty agents more effectively, the idea is to distribute the data intelligently to agents based on sample data requests and explicit data requests. The fake object implementation algorithms improve the distributor's chance of detecting guilty agents.
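
The fake-object idea in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' allocation algorithm: each agent receives a random sample of real records plus one unique fake record, and agents are ranked by whether their fake record appears in the leaked set and then by overlap. The names `allocate` and `suspects` and the `fake::` marker are hypothetical.

```python
import random


def allocate(records, agents, sample_size, seed=0):
    """Give each agent a random sample of real records plus one unique fake record."""
    rng = random.Random(seed)
    allocation = {}
    for agent in agents:
        sample = set(rng.sample(sorted(records), sample_size))
        # Hypothetical marker for illustration; a real fake object must look genuine.
        fake = f"fake::{agent}"
        allocation[agent] = {"real": sample, "fake": fake}
    return allocation


def suspects(allocation, leaked):
    """Rank agents: fake-object hits first, then by overlap with the leaked set."""
    leaked = set(leaked)
    scores = []
    for agent, given in allocation.items():
        fake_hit = given["fake"] in leaked
        overlap = len(given["real"] & leaked)
        scores.append((agent, fake_hit, overlap))
    return sorted(scores, key=lambda s: (s[1], s[2]), reverse=True)
```

Because the fake record is added alongside the real ones, the original data are left unmodified, which matches the constraint stated in the abstract.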


2021 ◽  
Vol 2 (1) ◽  
Author(s):  
Zilmara Alves da Silva ◽  
Maria Helena Santana Cruz

This research aims to analyze the resocialization process of the second generation of adolescents and young people from the Meninos de Deus project and the contributions of socio-affective relationships to the resignification of individual trajectories in the context of violence in the Santa Filomena community. The study is necessary to understand the importance of strengthening the resocialization processes in an open space, which has the triad of public authorities, civil society and the community as the executing nucleus of socio-educational measures. The Meninos de Deus group was formed in 2007 from a pact among youths in conflict with the law, based on the premise of mutual care, commitment to life and a re-socializing journey with the community. In this group, the feeling of belonging is opposed to the feeling that young people and adolescents in conflict with the law had towards the youth gang or criminal faction to which they belonged. The methodology is ethnography, using field research characterized as an integration of data obtained in the field and through bibliographic reading.


Author(s):  
Lyubomir Penev ◽  
Teodor Georgiev ◽  
Viktor Senderov ◽  
Mariya Dimitrova ◽  
Pavel Stoev

As one of the first advocates of open access and open data in the field of biodiversity publishing, Pensoft has adopted a multiple data publishing model, resulting in the ARPHA-BioDiv toolbox (Penev et al. 2017). ARPHA-BioDiv consists of several data publishing workflows and tools described in the Strategies and Guidelines for Publishing of Biodiversity Data and elsewhere. Workflow 1: Data underlying research results are deposited in an external repository and/or published as supplementary file(s) to the article and then linked/cited in the article text; supplementary files are published under their own DOIs and bear their own citation details. Workflow 2: Data deposited in trusted repositories and/or supplementary files and described in data papers; data papers may be submitted in text format or converted into manuscripts from Ecological Metadata Language (EML) metadata. Workflow 3: Integrated narrative and data publishing realised by the Biodiversity Data Journal, where structured data are imported into the article text from tables or via web services and downloaded/distributed from the published article. Workflow 4: Data published in structured, semantically enriched, full-text XMLs, so that several data elements can thereafter easily be harvested by machines. Workflow 5: Linked Open Data (LOD) extracted from literature, converted into interoperable RDF triples in accordance with the OpenBiodiv-O ontology (Senderov et al. 2018) and stored in the OpenBiodiv Biodiversity Knowledge Graph. 
The above-mentioned approaches are supported by a whole ecosystem of additional workflows and tools, for example: (1) pre-publication data auditing, involving both human and machine data quality checks (workflow 2); (2) web-service integration with data repositories and data centres, such as Global Biodiversity Information Facility (GBIF), Barcode of Life Data Systems (BOLD), Integrated Digitized Biocollections (iDigBio), Data Observation Network for Earth (DataONE), Long Term Ecological Research (LTER), PlutoF, Dryad, and others (workflows 1,2); (3) semantic markup of the article texts in the TaxPub format facilitating further extraction, distribution and re-use of sub-article elements and data (workflows 3,4); (4) server-to-server import of specimen data from GBIF, BOLD, iDigBio and PlutoF into manuscript text (workflow 3); (5) automated conversion of EML metadata into data paper manuscripts (workflow 2); (6) export of Darwin Core Archive and automated deposition in GBIF (workflow 3); (7) submission of individual images and supplementary data under own DOIs to the Biodiversity Literature Repository, BLR (workflows 1-3); (8) conversion of key data elements from TaxPub articles and taxonomic treatments extracted by Plazi into RDF handled by OpenBiodiv (workflow 5). 
These approaches represent different aspects of the prospective scholarly publishing of biodiversity data, which, in combination with text and data mining (TDM) technologies for legacy literature (PDF) developed by Plazi, lay the groundwork for an entire data publishing ecosystem for biodiversity, supplying FAIR (Findable, Accessible, Interoperable and Reusable) data to several interoperable overarching infrastructures, such as GBIF, BLR, Plazi TreatmentBank, OpenBiodiv and various end users.
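
The conversion of key data elements into RDF triples mentioned in this abstract can be illustrated with a small standard-library Python sketch that serialises triples as N-Triples. It is not the OpenBiodiv pipeline: the `http://example.org/` namespace, the `TaxonName` class, the `mentionedIn` property, and the function names are hypothetical placeholders rather than OpenBiodiv-O terms. Only the `rdf:type` and `rdfs:label` IRIs are standard.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/"  # hypothetical namespace, not OpenBiodiv-O


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def taxon_name_triples(article_doi: str, name_id: str, scientific_name: str):
    """Describe one taxon name mention extracted from an article (illustrative model)."""
    name = iri(EX + "name/" + name_id)
    return [
        (name, iri(RDF_TYPE), iri(EX + "TaxonName")),
        (name, iri(RDFS_LABEL), literal(scientific_name)),
        (name, iri(EX + "mentionedIn"), iri("https://doi.org/" + article_doi)),
    ]


def to_ntriples(triples) -> str:
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)
```

Calling `to_ntriples(taxon_name_triples("10.1234/example", "n1", "Parus major"))` with a placeholder DOI yields three statements that a triple store could load.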


Author(s):  
Natalya Ivanova ◽  
Maxim Shashkov

Currently Russia doesn't have a national biodiversity information system, and is still not a GBIF (Global Biodiversity Information Facility) member. Nevertheless, GBIF is the largest source of biodiversity data for Russia. As of August 2020, >5M species occurrences were available through the GBIF portal, of which 54% were published by Russian organisations. There are 107 institutions from Russia that have become GBIF publishers and 357 datasets have been published. The important trend of data mobilization in Russia is driven by the considerable contribution of citizen science. The most popular platform is iNaturalist. This year, the related GBIF dataset (Ueda 2020) became the largest one for Russia (793,049 species occurrences as of 2020-08-11). The first observation for Russia was posted in 2011, but iNaturalist started becoming popular in 2017. That year, 88 observers added >4500 observations that represented 1390 new species for Russia, 7- and 2-fold more, respectively, than for the previous 6 years. Now we have nearly 12,000 observers, about 15,000 observed species and >1M research-grade observations. The ratio of observations for Tracheophyta, Chordata, and Arthropoda in Russia differs from that at the global scale. There is an almost equal number of observations for these groups in the global iNaturalist GBIF dataset. In Russia, by contrast, vascular plants make up two thirds of the observations. That is due to the "Flora of Russia" project, which attracted many professional botanists both as observers and experts. Thanks to their activity, Russia has a high proportion of research-grade observations in iNaturalist, 78% versus 60% globally. Another consequence of wide participation by professional researchers is the high rate of species accumulation. For some taxonomic groups, the conspicuous species have already been recorded. There are about 850 bird species in Russia, of which 398 species were observed in 2018, and only 83 new species in 2019. 
Currently, the number of new species recorded over time is decreasing despite the increase in observers and overall user activity. Russian iNaturalist observers have shared a lot of archive photos (taken during past years). In 2018, these accounted for nearly 1/4 of the total number of observations and about 3/4 of new species for the year, with similar trends observed during 2019. Usually archive photos are posted from December until April, but the 2020 pandemic lockdown spurred a new wave of archive photo mobilisation in April and May. There are many iNaturalist projects for protected areas in Russia: 27 for strict nature reserves and national parks, and about 300 for others. About 100,000 observations (7.5% of all Russian observations) from the umbrella project "Protected areas of Russia" represent >34% of the species diversity observed in Russia. For some regions, e.g., Novosibirsk, Nizhniy Novgorod and Vladimir Oblasts, almost all protected areas are covered by iNaturalist projects, and these projects are often their only source of available biodiversity data. There are also other popular citizen science platforms developed by Russian researchers. The first one is the Russian birdwatching network RU-BIRDS.RU. The related GBIF dataset (Ukolov et al. 2019) is the third largest dataset for Russia (>370,000 species occurrences). Another Russian citizen science system is wildlifemonitoring.ru, which includes thematic resources for different taxonomic groups of vertebrates. It is a crowd-sourced web-GIS maintained by the Siberian Environmental Center NGO in Novosibirsk. It is noteworthy that iNaturalist activities in Russia are developed more as a social network than as a way to attract volunteers to participate in scientific research. Of 746 citations of the iNaturalist dataset, only 18 articles include co-authors from Russia. 
iNaturalist data are used for the management of regional red lists (in the Republic of Bashkortostan, Novosibirsk Oblast and others), and as an additional information source for regional inventories. RU-BIRDS data were used in the European Russia Breeding Bird Atlas and the new edition of the European Breeding Bird Atlas. In Russia, citizen science activities significantly contribute to filling gaps in the global biodiversity map. However, Russian iNaturalist observations available through GBIF originate from the USA. This is not ideal, because the iNaturalist GBIF dataset is growing rapidly, and in the future it will contain more records than all other datasets for Russia combined. In our opinion, iNaturalist data should be repatriated during the process of publishing through GBIF, as is already done for the eBird dataset (Levatich and Ligocki 2020).


Author(s):  
Yvan Le Bras ◽  
Aurélie Delavaud ◽  
Dominique Pelletier ◽  
Jean-Baptiste Mihoub

Most biodiversity research aims at understanding the states and dynamics of biodiversity and ecosystems. To do so, biodiversity research increasingly relies on the use of digital products and services such as raw data archiving systems (e.g. structured databases or data repositories), ready-to-use datasets (e.g. cleaned and harmonized files with normalized measurements or computed trends) as well as associated analytical tools (e.g. model scripts on GitHub). Several world-wide initiatives facilitate open access to biodiversity data, such as the Global Biodiversity Information Facility (GBIF), GenBank and PREDICTS. Although these pave the way towards major advances in biodiversity research, they also typically deliver data products that are sometimes poorly informative, as they fail to capture the genuine ecological information they intend to grasp. In other words, access to ready-to-use aggregated data products may sacrifice ecological relevance for data harmonization, resulting in over-simplified, ill-advised standard formats. This is particularly true when the main challenge is to match complementary data (large diversity of measured variables, integration of different levels of life organization etc.) collected with different requirements and scattered across multiple databases. Improving access to raw data, together with meaningful detailed metadata and analytical tools associated with standardized workflows, is critical to maintain and maximize the generic relevance of ecological data. Consequently, advancing the design of digital products and services is essential for interoperability while also enhancing reproducibility and transparency in biodiversity research. To go further, a minimal common framework organizing biodiversity observation and data organization is needed. In this regard, the Essential Biodiversity Variable (EBV) concept might be a powerful way to boost progress toward this goal as well as to connect research communities worldwide. 
As a national Biodiversity Observation Network (BON) node, the French BON is currently embodied by a national research e-infrastructure called "Pôle national de données de biodiversité" (PNDB, formerly ECOSCOPE), aimed at simultaneously empowering the quality of scientific activities and promoting networking within the scientific community at a national level. Through the PNDB, the French BON is working on developing biodiversity data workflows oriented toward end services and products, both from and for a research perspective. More precisely, the two pillars of the PNDB are a metadata portal and a workflow-oriented web platform dedicated to the access of biodiversity data and associated analytical tools (Galaxy-E). After four years of experience, we are now going deeper into metadata specification, dataset descriptions and data structuring through the extensive use of Ecological Metadata Language (EML) as a pivot format. Moreover, we evaluate the relevance of existing tools such as Metacat/Morpho and DEIMS-SDR (Dynamic Ecological Information Management System - Site and dataset registry) in order to ensure a link with other initiatives like the Environmental Data Initiative, DataONE and Long-Term Ecological Research related observation networks. Regarding data analysis, an open-source Galaxy-E platform was launched in 2017 as part of a project targeting the design of a citizen science observation system in France (“65 Millions d'observateurs”). Here, we propose to showcase ongoing French activities towards global challenges related to biodiversity information and knowledge dissemination. We particularly emphasize our focus on embracing the FAIR (findable, accessible, interoperable and reusable) data principles (Wilkinson et al. 2016) across the development of the French BON e-infrastructure and the promising links we anticipate for operationalizing EBVs. 
Using accessible and transparent analytical tools, we present the first online platform that allows advanced yet user-friendly analyses of biodiversity data to be performed in a reproducible and shareable way, using data from various sources, such as GBIF, Atlas of Living Australia (ALA), eBird and iNaturalist, together with environmental data such as climate data.


2018 ◽  
Author(s):  
Alyssa H. Rosemartin ◽  
Madison L. Langseth ◽  
Theresa M. Crimmins ◽  
Jake F. Weltzin

Author(s):  
Gil Nelson ◽  
Deborah L Paul

Integrated Digitized Biocollections (iDigBio) is the United States’ (US) national resource and coordinating center for biodiversity specimen digitization and mobilization. It was established in 2011 through the US National Science Foundation’s (NSF) Advancing Digitization of Biodiversity Collections (ADBC) program, an initiative that grew from a working group of museum-based and other biocollections professionals working in concert with NSF to make collections' specimen data accessible for science, education, and public consumption. The working group, Network Integrated Biocollections Alliance (NIBA), released two reports (Beach et al. 2010, American Institute of Biological Sciences 2013) that provided the foundation for iDigBio and ADBC. iDigBio is restricted in focus to the ingestion of data generated by public, non-federal museum and academic collections. Its focus is on specimen-based (as opposed to observational) occurrence records. iDigBio currently serves about 118 million transcribed specimen-based records and 29 million specimen-based media records from approximately 1600 datasets. These digital objects have been contributed by about 700 collections representing nearly 400 institutions, making iDigBio the most comprehensive biodiversity data aggregator in the US. Currently, iDigBio, DiSSCo (Distributed System of Scientific Collections), GBIF (Global Biodiversity Information Facility), and the Atlas of Living Australia (ALA) are collaborating on a global framework to harmonize technologies towards standardizing and synchronizing ingestion strategies, data models and standards, cyberinfrastructure, APIs (application programming interfaces), specimen record identifiers, etc. in service to a developing consolidated global data product that can provide a common source for the world’s digital biodiversity data. 
The collaboration strives to harness and combine the unique strengths of its partners in ways that ensure the individual needs of each partner’s constituencies are met, design pathways for accommodating existing and emerging aggregators, simultaneously strengthen and enhance access to the world’s biodiversity data, and underscore the scope and importance of worldwide biodiversity informatics activities. Collaborators will share technology strategies and outputs, align conceptual understandings, and establish and draw from an international knowledge base. These collaborators, along with Biodiversity Information Standards (TDWG), will join iDigBio and the Smithsonian National Museum of Natural History as they host Biodiversity 2020 in Washington, DC. Biodiversity 2020 will combine an international celebration of the worldwide progress made in biodiversity data accessibility in the 21st century with a biodiversity data conference that extends the life of Biodiversity Next. It will provide a venue for the GBIF governing board meeting, TDWG annual meeting, and the annual iDigBio Summit as well as three days of plenary and concurrent sessions focused on the present and future of biodiversity data generation, mobilization, and use.


Oryx ◽  
1992 ◽  
Vol 26 (4) ◽  
pp. 202-204 ◽  
Author(s):  
J. L. Cloudsley-Thompson

While Saudi Arabia has recognized the dangers of uncontrolled hunting and has introduced conservation measures in its own territory, prominent members of that kingdom are killing large numbers of game, including endangered species, in neighbouring countries. In this report the author presents evidence of the devastation caused by Saudi hunters in the Sudan. While the latter country has outlawed hunting, enforcing the law against Saudi nationals is fraught with difficulties.

