ABCD 3.0 Ready to Use

Author(s):  
David Fichtmueller ◽  
Walter G. Berendsohn ◽  
Gabriele Droege ◽  
Falko Glöckler ◽  
Anton Güntsch ◽  
...  

The TDWG standard ABCD (Access to Biological Collections Data task group 2007) was aimed at harmonizing terminologies used for modelling biological collection information and is used as a comprehensive data format for transferring collection and observation data between software components. The project ABCD 3.0 (A community platform for the development and documentation of the ABCD standard for natural history collections) was financed by the German Research Council (DFG). It addressed the transformation of ABCD into a semantic web-compliant ontology by deconstructing the XML-schema into individually addressable RDF (Resource Description Framework) resources published via the TDWG Terms Wiki (https://terms.tdwg.org/wiki/ABCD_2). In a second step, informal properties and concept-relations described by the original ABCD-schema were transformed into a machine-readable ontology and revised (Güntsch et al. 2016). The project was successfully finished in January 2019. The ABCD 3 setup allows for the creation of standard-conforming application schemas. The XML variant of ABCD 3.0 was restructured, simplified and made more consistent in terms of element names and types as compared to version 2.x. The XML elements are connected to their semantic concepts using the W3C SAWSDL (Semantic Annotation for Web Services Description Language and XML Schema) standard. The creation of specialized application schemas is encouraged; the first use case was the application schema for zoology. It will also be possible to generate application schemas that break the traditional unit-centric structure of ABCD. Further achievements of the project include creating a Wikibase instance as the editing platform, with related tools for maintenance queries, such as checking for inconsistencies in the ontology and automated export into RDF. This allows for fast iterations of new or updated versions, e.g. when additional mappings to other standards are done. 
The setup is agnostic to the data standard created; it can therefore also be used to create or model other standards. Mappings to other standards like Darwin Core (https://dwc.tdwg.org/) and Audubon Core (https://tdwg.github.io/ac/) are now machine-readable as well. All XPaths (XML Paths) of ABCD 3.0 XML have been mapped to all variants of ABCD 2.06 and 2.1, which will ease transition to the new standard. The ABCD 3 Ontology will also be uploaded to the GFBio Terminology Server (Karam et al. 2016), where individual concepts can be easily searched or queried, allowing for better interactive modelling of ABCD concepts. ABCD documentation now adheres to TDWG’s Standards Documentation Standard (SDS, https://www.tdwg.org/standards/sds/) and is located at https://abcd.tdwg.org/. The new site is hosted on Github: https://github.com/tdwg/abcd/tree/gh-pages.
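A machine-readable XPath mapping of the kind described above can be sketched as a simple lookup table. The element paths below are invented for illustration and are not taken from the actual published ABCD mapping tables.

```python
# Hypothetical sketch of a machine-readable XPath mapping between ABCD
# versions; both paths below are illustrative assumptions, not entries
# from the real ABCD 3.0 -> 2.06 mapping.
ABCD3_TO_ABCD206 = {
    "/DataSet/Units/Unit/UnitID":
        "/DataSets/DataSet/Units/Unit/UnitID",
    "/DataSet/Units/Unit/FullScientificNameString":
        "/DataSets/DataSet/Units/Unit/Identifications/Identification"
        "/Result/TaxonIdentified/ScientificName/FullScientificNameString",
}

def translate_xpath(abcd3_path: str) -> str:
    """Return the ABCD 2.06 XPath recorded for an ABCD 3.0 XPath."""
    try:
        return ABCD3_TO_ABCD206[abcd3_path]
    except KeyError:
        raise KeyError(f"No ABCD 2.06 equivalent recorded for {abcd3_path}")
```

With such a table, data held in the old schema can be migrated mechanically by rewriting each element path, which is what eases the transition between standard versions.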

Author(s):  
Karen Coyle

Application profiles fulfill similar functions to other forms of metadata documentation, such as data dictionaries. The preference is for application profiles to be machine-readable and machine-actionable, so that they can provide validation and processing instructions, not unlike XML schema does for XML documents. These goals underlie the Dublin Core Metadata Initiative's work over the last decade to develop application profiles for data that uses the Resource Description Framework model of the World Wide Web Consortium.


2018 ◽  
Vol 2 ◽  
pp. e25839
Author(s):  
Lise Stork ◽  
Andreas Weber ◽  
Eulàlia Miracle ◽  
Katherine Wolstencroft

Geographical and taxonomical referencing of specimens and documented species observations from within and across natural history collections is vital for ongoing species research. However, much of the historical data such as field books, diaries and specimens, are challenging to work with. They are computationally inaccessible, refer to historical place names and taxonomies, and are written in a variety of languages. In order to address these challenges and elucidate historical species observation data, we developed a workflow to (i) crowd-source semantic annotations from handwritten species observations, (ii) transform them into RDF (Resource Description Framework) and (iii) store and link them in a knowledge base. Instead of full transcription, we directly annotate digital field book scans with key concepts that are based on Darwin Core standards. Our workflow stresses the importance of verbatim annotation. The interpretation of the historical content, such as resolving a historical taxon to a current one, can be done by individual researchers after the content is published as linked open data. Through the storage of annotation provenance (who created the annotation and when), we allow multiple interpretations of the content to exist in parallel, stimulating scientific discourse. The semantic annotation process is supported by a web application, the Semantic Field Book (SFB)-Annotator, driven by an application ontology. The ontology formally describes the content and meta-data required to semantically annotate species observations. It is based on the Darwin Core standard (DwC), Uberon and the Geonames ontology. The provenance of annotations is stored using the Web Annotation Data Model. Adhering to the principles of FAIR (Findable, Accessible, Interoperable & Reusable) and Linked Open Data, the content of the specimen collections can be interpreted homogeneously and aggregated across datasets. This work is part of the Making Sense project: makingsenseproject.org. 
The project aims to disclose the content of a natural history collection: a 17,000 page account of the exploration of the Indonesian Archipelago between 1820 and 1850 (Natuurkundige Commissie voor Nederlands-Indie). With a knowledge base, researchers are given easy access to the primary sources of natural history collections. For their research, they can aggregate species observations, construct rich queries to browse through the data and add their own interpretations regarding the meaning of the historical content.
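Step (ii) of the workflow above, turning a verbatim annotation plus its provenance into RDF, can be sketched as follows. The URIs, property choices, and the example record are illustrative assumptions, not the project's actual vocabulary, though the properties come from the Web Annotation (oa:) and Dublin Core Terms (dct:) namespaces the abstract names.

```python
from datetime import datetime, timezone

# Minimal sketch: one verbatim annotation serialised as N-Triples, with
# creator and creation time stored as provenance so that later, possibly
# conflicting, interpretations can coexist. All identifiers are invented.
def annotation_to_ntriples(anno_id, page_uri, verbatim, creator, created):
    a = f"<http://example.org/annotation/{anno_id}>"
    oa = "http://www.w3.org/ns/oa#"
    dct = "http://purl.org/dc/terms/"
    return [
        f"{a} <{oa}hasTarget> <{page_uri}> .",
        # The body stays verbatim; resolving it to a current taxon is a
        # separate, later interpretation by individual researchers.
        f'{a} <{oa}hasBody> "{verbatim}" .',
        f'{a} <{dct}creator> "{creator}" .',
        f'{a} <{dct}created> "{created}" .',
    ]

triples = annotation_to_ntriples(
    "42", "http://example.org/fieldbook/page17",
    "Sciurus sp., Buitenzorg", "annotator-01",
    datetime(2019, 5, 1, tzinfo=timezone.utc).isoformat())
```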


Author(s):  
Andrew Iliadis ◽  
Wesley Stevens ◽  
Jean-Christophe Plantin ◽  
Amelia Acker ◽  
Huw Davies ◽  
...  

This panel focuses on the way that platforms have become key players in the representation of knowledge. Recently, there have been calls to combine infrastructure and platform-based frameworks to understand the nature of information exchange on the web through digital tools for knowledge sharing. The present panel builds and extends work on platform and infrastructure studies in what has been referred to as “knowledge as programmable object” (Plantin, et al., 2018), specifically focusing on how metadata and semantic information are shaped and exchanged in specific web contexts. As Bucher (2012; 2013) and Helmond (2015) show, data portability in the context of web platforms requires a certain level of semantic annotation. Semantic interoperability is the defining feature of so-called "Web 3.0"—traditionally referred to as the semantic web (Antoniou et al, 2012; Szeredi et al, 2014). Since its inception, the semantic web has privileged the status of metadata for providing the fine-grained levels of contextual expressivity needed for machine-readable web data, and can be found in products as diverse as Google's Knowledge Graph, online research repositories like Figshare, and other sources that engage in platformizing knowledge. The first paper in this panel examines the international Schema.org collaboration. The second paper investigates the epistemological implications when platforms organize data sharing. The third paper argues for the use of patents to inform research methodologies for understanding knowledge graphs. The fourth paper discusses private platforms’ extraction and collection of user metadata and the enclosure of data access.


Author(s):  
Alejandro Llaves ◽  
Oscar Corcho ◽  
Peter Taylor ◽  
Kerry Taylor

This paper presents a generic approach to integrate environmental sensor data efficiently, allowing the detection of relevant situations and events in near real-time through continuous querying. Data variety is addressed with the use of the Semantic Sensor Network ontology for observation data modelling, and semantic annotations for environmental phenomena. Data velocity is handled by distributing sensor data messaging and serving observations as RDF graphs on query demand. The stream processing engine presented in the paper, morph-streams++, provides adapters for different data formats and distributed processing of streams in a cluster. An evaluation of different configurations for parallelization and semantic annotation parameters proves that the described approach reduces the average latency of message processing in some cases.
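The continuous-query idea can be illustrated with a toy sliding-window detector: observations stream in, and an event fires when a window statistic crosses a threshold. This is a simplified stand-in for the continuous queries morph-streams++ evaluates over RDF observation graphs; the window size and threshold are arbitrary illustrative values.

```python
from collections import deque

# Toy sketch of near-real-time event detection over a sensor stream:
# a fixed-size sliding window with a mean-threshold query. Parameters
# are illustrative, not from the paper's evaluation.
class SlidingWindowQuery:
    def __init__(self, size=3, threshold=30.0):
        self.window = deque(maxlen=size)
        self.threshold = threshold

    def push(self, observation):
        """Add an observation; report whether a full window's mean exceeds the threshold."""
        self.window.append(observation)
        if len(self.window) < self.window.maxlen:
            return False  # not enough data for a full window yet
        return sum(self.window) / len(self.window) > self.threshold

q = SlidingWindowQuery(size=3, threshold=30.0)
events = [q.push(v) for v in [28.0, 31.0, 35.0, 36.0]]
```

Parallelising many such standing queries across a cluster, as the paper does, is what keeps average message-processing latency low as data velocity grows.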


1989 ◽  
Vol 104 (2) ◽  
pp. 147-150
Author(s):  
M.K. Tsvetkov ◽  
K.Y. Stavrev ◽  
K.P. Tsvetkova

The results of the intensive observations of the Pleiades flare stars over more than 20 years (≈ 3000 h effective observational time) were compiled and published by G. Haro and collaborators in a catalogue (Haro et al., 1982) including the flare stars discovered in the Pleiades region up to 1981. The catalogue contains data for 519 flare stars. In the frame of the international programme for the study of flare stars in stellar aggregates, the Department of Astronomy and the National Astronomical Observatory have carried out patrol observations in the Pleiades since 1979. In parallel with the observations, work on the creation of machine-readable versions of the published flare star catalogues has begun. As a first step, a machine-readable version of the Tonantzintla catalogue (TC) of the Pleiades flare stars has been prepared (Tsvetkov et al., 1987).


2016 ◽  
Author(s):  
Michel Dumontier ◽  
Alasdair J G Gray ◽  
M. Scott Marshall ◽  
Vladimir Alexiev ◽  
Peter Ansell ◽  
...  

Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. However, while there are many relevant vocabularies for the annotation of a dataset, none sufficiently captures all the necessary metadata. This prevents uniform indexing and querying of dataset repositories. Towards providing a practical guide for producing a high quality description of biomedical datasets, the W3C Semantic Web for Health Care and the Life Sciences Interest Group (HCLSIG) identified Resource Description Framework (RDF) vocabularies that could be used to specify common metadata elements and their value sets. The resulting guideline covers elements of description, identification, attribution, versioning, provenance, and content summarization. This guideline reuses existing vocabularies, and is intended to meet key functional requirements including indexing, discovery, exchange, query, and retrieval of datasets, thereby enabling the publication of FAIR data. The resulting metadata profile is generic and could be used by other domains with an interest in providing machine readable descriptions of versioned datasets.
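A dataset description following the spirit of the guideline above can be emitted as a small Turtle document. The property choices (dct:, dcat:) are among the vocabularies the guideline reuses, but the exact profile below, and all URIs and values, are illustrative assumptions rather than the HCLSIG profile itself.

```python
# Hedged sketch of a machine-readable dataset description covering a few
# of the guideline's elements (identification, versioning, licensing).
# The dataset URI and literal values are invented for illustration.
def dataset_description(uri, title, version, license_uri):
    return "\n".join([
        "@prefix dct: <http://purl.org/dc/terms/> .",
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .",
        "",
        f"<{uri}> a dcat:Dataset ;",
        f'    dct:title "{title}" ;',
        f'    dct:hasVersion "{version}" ;',
        f"    dct:license <{license_uri}> .",
    ])

ttl = dataset_description(
    "http://example.org/dataset/1", "Example biomedical dataset",
    "1.0.0", "http://creativecommons.org/licenses/by/4.0/")
```

Because every repository emits the same elements with the same properties, descriptions like this one can be indexed and queried uniformly, which is the functional requirement the guideline targets.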


Heritage ◽  
2019 ◽  
Vol 2 (2) ◽  
pp. 1471-1498 ◽  
Author(s):  
Ikrom Nishanbaev ◽  
Erik Champion ◽  
David A. McMeekin

The amount of digital cultural heritage data produced by cultural heritage institutions is growing rapidly. Digital cultural heritage repositories have therefore become an efficient and effective way to disseminate and exploit digital cultural heritage data. However, many digital cultural heritage repositories worldwide share technical challenges such as data integration and interoperability among national and regional digital cultural heritage repositories. The result is dispersed and poorly linked cultural heritage data, backed by non-standardized search interfaces, which thwart users’ attempts to contextualize information from distributed repositories. A recently introduced geospatial semantic web is being adopted by a great many new and existing digital cultural heritage repositories to overcome these challenges. However, no one has yet conducted a conceptual survey of the geospatial semantic web concepts for a cultural heritage audience. A conceptual survey of these concepts pertinent to the cultural heritage field is, therefore, needed. Such a survey equips cultural heritage professionals and practitioners with an overview of all the necessary tools, and free and open source semantic web and geospatial semantic web platforms that can be used to implement geospatial semantic web-based cultural heritage repositories. Hence, this article surveys the state-of-the-art geospatial semantic web concepts, which are pertinent to the cultural heritage field. It then proposes a framework to turn geospatial cultural heritage data into machine-readable and processable resource description framework (RDF) data to use in the geospatial semantic web, with a case study to demonstrate its applicability. Furthermore, it outlines key free and open source semantic web and geospatial semantic web platforms for cultural heritage institutions. In addition, it examines leading cultural heritage projects employing the geospatial semantic web. 
Finally, the article discusses attributes of the geospatial semantic web that require more attention, that can result in generating new ideas and research questions for both the geospatial semantic web and cultural heritage fields.
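The "geospatial data into RDF" step the article proposes can be sketched for a single point location using a GeoSPARQL WKT literal, which is the standard way the geospatial semantic web encodes geometries. The site URI and coordinates below are made up for illustration; the real framework handles much richer geometry.

```python
# Sketch: serialising a heritage site's point location as GeoSPARQL RDF.
# The geo: namespace is the OGC GeoSPARQL vocabulary; the site URI and
# coordinates are illustrative assumptions.
GEO = "http://www.opengis.net/ont/geosparql#"

def site_location_triples(site_uri, lon, lat):
    geom = f"{site_uri}/geometry"
    return [
        f"<{site_uri}> <{GEO}hasGeometry> <{geom}> .",
        # WKT uses longitude-latitude order inside POINT(...)
        f'<{geom}> <{GEO}asWKT> "POINT({lon} {lat})"^^<{GEO}wktLiteral> .',
    ]

triples = site_location_triples(
    "http://example.org/site/borobudur", 110.2038, -7.6079)
```

Once locations are expressed this way, a GeoSPARQL-capable triple store can answer spatial queries (containment, distance) across otherwise disconnected repositories.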


2020 ◽  
Vol 10 (20) ◽  
pp. 7325
Author(s):  
Nikolaos Partarakis ◽  
Xenophon Zabulis ◽  
Antonis Chatziantoniou ◽  
Nikolaos Patsiouras ◽  
Ilia Adami

A wide spectrum of digital data are becoming available to researchers and industries interested in the recording, documentation, recognition, and reproduction of human activities. In this work, we propose an approach for understanding and articulating human motion recordings into multimodal datasets and VR demonstrations of actions and activities relevant to traditional crafts. To implement the proposed approach, we introduce Animation Studio (AnimIO) that enables visualisation, editing, and semantic annotation of pertinent data. AnimIO is compatible with recordings acquired by Motion Capture (MoCap) and Computer Vision. Using AnimIO, the operator can isolate segments from multiple synchronous recordings and export them in multimodal animation files. AnimIO can be used to isolate motion segments that refer to individual craft actions, as described by practitioners. The proposed approach has been iteratively designed for use by non-experts in the domain of 3D motion digitisation.


2012 ◽  
Vol 9 (3) ◽  
pp. 80-92 ◽  
Author(s):  
Luis F. Castillo ◽  
Narmer Galeano ◽  
Gustavo A. Isaza ◽  
Alvaro Gaitan

Gene annotation is a process that encompasses multiple approaches on the analysis of nucleic acids or protein sequences in order to assign structural and functional characteristics to gene models. When thousands of gene models are being described in an organism genome, construction and visualization of gene networks impose novel challenges in the understanding of complex expression patterns and the generation of new knowledge in genomics research. In order to take advantage of accumulated text data after conventional gene sequence analysis, this work applied semantics in combination with visualization tools to build transcriptome networks from a set of coffee gene annotations. A set of selected coffee transcriptome sequences, chosen by the quality of the sequence comparison reported by Basic Local Alignment Search Tool (BLAST) and InterProScan, were filtered out by coverage, identity, length of the query, and e-values. Meanwhile, term descriptors for molecular biology and biochemistry were obtained from the WordNet dictionary in order to construct a Resource Description Framework (RDF) using Ruby scripts and Methontology to find associations between concepts. Relationships between sequence annotations and semantic concepts were graphically represented through a total of 6845 oriented vectors, which were reduced to 745 non-redundant associations. A large gene network connecting transcripts by way of relational concepts was created where detailed connections remain to be validated for biological significance based on current biochemical and genetics frameworks. Besides reusing text information in the generation of gene connections and for data mining purposes, this tool development opens the possibility to visualize complex and abundant transcriptome data, and triggers the formulation of new hypotheses in metabolic pathways analysis.
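The reduction from oriented vectors to non-redundant associations is, at its core, order-preserving deduplication of directed (subject, relation, object) triples. The sketch below illustrates the operation on invented example data; the paper's 6845-to-745 reduction was performed on real coffee annotation output.

```python
# Sketch of the reduction step: collapse a list of oriented concept
# associations to the set of distinct ones, keeping first-seen order.
# The example associations are hypothetical, not from the coffee dataset.
def non_redundant(associations):
    """Keep each (subject, relation, object) vector once, preserving order."""
    seen = set()
    unique = []
    for assoc in associations:
        if assoc not in seen:
            seen.add(assoc)
            unique.append(assoc)
    return unique

vectors = [
    ("caffeine synthase", "catalyzes", "methylation"),
    ("caffeine synthase", "expressed_in", "endosperm"),
    ("caffeine synthase", "catalyzes", "methylation"),  # duplicate
]
reduced = non_redundant(vectors)
```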


Author(s):  
Lynne C. Howarth

With the proliferation of digitized resources accessible via Internet and Intranet knowledge bases, and a pressing need to develop more sophisticated tools for the identification and retrieval of electronic resources, both general purpose and domain-specific metadata schemes have assumed a particular prominence. While recent work emanating from the World Wide Web Consortium (W3C) has focused on the Resource Description Framework (RDF), and metadata maps or “crosswalks” have been created to support the interoperability of metadata standards -- thus converting metatags from diverse domains from simply “machine-readable” to “machine-understandable” -- the next iteration, to “human-understandable,” remains a challenge. This apparent gap provides a framework for three-phase research (Howarth, 2000, 1999) to develop a tool which will provide a “human-understandable” front-end search assist to any XML-compliant metadata scheme. Findings from phase one, the analyses and mapping of seven metadata schemes, identify the particular challenges of designing a common “namespace”, populated with element tags which are appropriately descriptive, yet readily understood by a lay searcher, when there is little congruence within, and a high degree of variability across, the metadata schemes under study. Implications for the subsequent design and testing of both the proposed “metalevel ontology” (phase two), and the prototype search assist tool (phase three) are examined.
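A metadata crosswalk of the kind described above can be sketched as a per-scheme mapping of element tags into one common namespace, so that records from different schemes become comparable. The scheme names, element tags, and common-namespace terms below are hypothetical and are not the seven schemes analysed in the study.

```python
# Illustrative crosswalk: re-key records from two hypothetical metadata
# schemes into a shared "namespace". All mappings here are invented.
CROSSWALK = {
    "dc_like":   {"creator": "agent", "title": "name", "date": "date_issued"},
    "marc_like": {"100a": "agent", "245a": "name", "260c": "date_issued"},
}

def to_common(scheme, record):
    """Re-key a record's element tags into the common namespace; drop unmapped tags."""
    mapping = CROSSWALK[scheme]
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}

rec = to_common("dc_like", {"creator": "Howarth, L.", "title": "Metadata study"})
```

The hard part the study identifies is not the mechanical re-keying but choosing common terms that are both faithful to each scheme and understandable to a lay searcher.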

