Linked Data Tools for Managing Taxonomic Databases

Author(s):  
Johan Liljeblad ◽  
Tapani Lahti ◽  
Matts Djos

Taxonomic information is dynamic, i.e. changes are made continuously, so scientific names are insufficient to track changes in taxon circumscription. The principles of Linked Open Data (LOD), as defined by the World Wide Web Consortium, can be applied for documenting the relationships of taxon circumscriptions over time and between checklists of taxa. In our scheme, each checklist and each taxon in the checklist is assigned a globally unique, persistent identifier. According to the LOD principles, HTTP Uniform Resource Identifiers (URIs) are used as identifiers, providing both human-readable (HTML) and machine-readable (XML) responses for client requests. Common vocabularies are needed in machine-readable responses to HTTP URIs. We use SKOS (Simple Knowledge Organization System) as a basic vocabulary for describing checklists as instances of class skos:ConceptScheme, and taxa as instances of class skos:Concept. Set relationships between taxon circumscriptions are described using the properties skos:broader and skos:narrower. Darwin Core vocabulary is used for describing taxon properties, such as scientific names, taxonomic ranks and authorship strings, in the checklists. Instead of directly linking taxon circumscriptions between checklists, we define an HTTP URI for each unique circumscription. This common identifier is then mapped to internal checklist identifiers matching the circumscription using the property skos:exactMatch. For the management of the URIs, the domain name TAXONID.ORG has been registered. In a pilot study, our approach has been applied to linking taxon circumscriptions of selected taxa between the national checklists of Sweden and Finland. In the future, national checklists from other Nordic/Baltic countries (Norway, Denmark, Iceland, Estonia) can be easily linked together as well. The work is part of the NeIC DeepDive project (neic.no).
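
The modelling described above can be sketched as a handful of RDF triples. The following is a minimal illustration using only the Python standard library, not the authors' implementation: the SKOS, RDF and Darwin Core namespace URIs are the published ones, but every identifier under example.org, the taxon keys and the example species name are hypothetical placeholders chosen for the sketch.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DWC = "http://rs.tdwg.org/dwc/terms/"

# Hypothetical identifiers, invented for illustration only.
SE_LIST = "http://example.org/checklist/se"
FI_LIST = "http://example.org/checklist/fi"
SE_TAXON = SE_LIST + "/taxon/1001"
FI_TAXON = FI_LIST + "/taxon/MX.5"
SHARED = "http://example.org/circumscription/42"


def describe_taxon(taxon, checklist, name):
    """Return triples describing one taxon as a skos:Concept in a checklist."""
    return [
        (taxon, RDF_TYPE, SKOS + "Concept"),
        (taxon, SKOS + "inScheme", checklist),
        (taxon, DWC + "scientificName", name),
    ]


def to_ntriples(triples):
    """Serialize triples as N-Triples; objects starting with 'http' are IRIs."""
    lines = []
    for s, p, o in triples:
        if o.startswith("http"):
            obj = f"<{o}>"
        else:
            obj = '"' + o.replace("\\", "\\\\").replace('"', '\\"') + '"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# Each checklist is a skos:ConceptScheme.
triples = [
    (SE_LIST, RDF_TYPE, SKOS + "ConceptScheme"),
    (FI_LIST, RDF_TYPE, SKOS + "ConceptScheme"),
]
triples += describe_taxon(SE_TAXON, SE_LIST, "Parus major")
triples += describe_taxon(FI_TAXON, FI_LIST, "Parus major")
# The shared circumscription identifier is mapped to each checklist's own taxon.
triples += [
    (SHARED, SKOS + "exactMatch", SE_TAXON),
    (SHARED, SKOS + "exactMatch", FI_TAXON),
]

print(to_ntriples(triples))
```

Routing both checklists through one shared circumscription identifier means that adding a third checklist requires one new skos:exactMatch statement rather than a link to every existing checklist.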


2011 ◽  
Vol 7 (3) ◽  
pp. 74-90 ◽  
Author(s):  
Ioannis Papadakis ◽  
Konstantinos Kyprianos

One of the most important tasks of a librarian is the assignment of appropriate subject(s) to a resource within a library’s collection. The subjects usually belong to a controlled vocabulary that is specifically designed for such a task. The most widely adopted controlled vocabulary across libraries around the world is the Library of Congress Subject Headings (LCSH). However, there seems to be a shift from traditional LCSH to modern thesauri. In this paper, a methodology is proposed that is capable of incorporating thesauri into existing LCSH-based Information Retrieval (IR) systems. To achieve this, a mapping methodology is proposed that provides a common structure consisting of terms belonging to LCSH and/or a thesaurus. The structure is modeled as a Simple Knowledge Organization System (SKOS) ontology, which can be employed by appropriate subject-based IR systems. As a proof of concept, the proposed methodology is applied to the DSpace-based University of Piraeus digital library.


2020 ◽  
Vol 32 ◽  
Author(s):  
Rodrigo Oliveira ZACARIAS ◽  
Mark Douglas de Azevedo JACYNTHO ◽  
Aline Pires Vieira de VASCONCELOS

Abstract: Problems related to requirements specification, such as ambiguity and incompleteness, are still recurrent in information systems development processes. Requirements reuse is one of the mechanisms that can help reduce these setbacks. Accordingly, the objective of this work is to propose a method for creating and publishing semantic thesauri of requirements for reuse, using Semantic Web technologies and standards and in accordance with Linked Data principles. For the formal description of these thesauri, the core ontology used is the Simple Knowledge Organization System. This ontological model provides a set of axioms and properties aimed at thesaurus creation, making it possible to document precisely and faithfully, in a knowledge graph, the definition, hierarchy and other interrelationships among the requirements of a system. A Web service prototype is also presented, which functions as a repository for reuse and demonstrates the method in practice. A study on the feasibility of implementing the proposal, carried out with practitioners, is described, in which a group discussion was held, followed by the individual completion of an evaluation questionnaire. The study obtained mostly favorable results, and some suggestions for improvement were raised. The participants considered the proposal relevant to Requirements Engineering and with potential for expansion, since the guidelines presented allow the creation of new types of inference and navigability over the stored requirements.


2021 ◽  
pp. 016555152110181
Author(s):  
Alberto Nogales ◽  
Miguel-Angel Sicilia ◽  
Álvaro J García-Tejedor

The publication of large amounts of open data is an increasing trend. This is a consequence of initiatives like Linked Open Data (LOD), which aim to publish and link data sets on the World Wide Web. Linked Data publishers should follow a set of principles for their task. This information is described in a 2011 document that treats the reuse of vocabularies as key. The Linked Open Vocabularies (LOV) project attempts to collect the vocabularies and ontologies commonly used in LOD. These ontologies have been classified by domain following the criteria of LOV members, which has the disadvantage of introducing personal biases. This article presents an automatic classifier of ontologies based on the main categories appearing in Wikipedia. For that purpose, word-embedding models are used in combination with deep learning techniques. Results show that with a hybrid model of regular Deep Neural Networks (DNNs), Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), classification could be made with an accuracy of 93.57%. A further evaluation of the domain matchings between LOV and the classifier yields possible matchings in 79.8% of the cases.


Author(s):  
Johan Liljeblad ◽  
Tapani Lahti

Starting with Finland and Sweden and a subset of taxonomic groups, the Nordic/Baltic countries are connecting national checklists using Linked Open Data standards (Auer et al. 2007) and agreed vocabularies. We use HTTP Uniform Resource Identifiers as globally unique, persistent identifiers for taxon concepts (Chawuthai et al. 2013). Currently, we provide both human-readable (HTML) and machine-readable (XML) responses for client requests via a central checklist, TAXONID.ORG, which in itself needs to be managed. However, we hope this can be replaced by Catalogue of Life Plus in the not too distant future. While initially exchanging taxonomic information, our goal is ultimately to share information also on genetics, images and traits as well as conservation status and observations in a standardized way. The work is part of the NeIC DeepDive project, which is funded by the Nordic e-Infrastructure Collaboration (neic.no/deepdive). The vision is to establish a regional infrastructure network consisting of Nordic and Baltic data centers and information systems and to provide seamlessly operating regional data services, tools, and virtual laboratories.
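
Serving both a human-readable and a machine-readable response from the same URI is commonly done with HTTP content negotiation. The sketch below shows one way a server might choose between the two from the request's Accept header. It is an illustration under stated assumptions, not a description of how TAXONID.ORG works: the function name, the set of media types treated as machine-readable, and the default of HTML are all hypothetical choices.

```python
def choose_representation(accept_header):
    """Pick 'html' or 'xml' for a taxon URI based on the HTTP Accept header."""
    # Media types treated as requests for machine-readable output (an assumption).
    machine_types = {"application/rdf+xml", "application/xml", "text/xml"}
    # Default to HTML when nothing recognisable is requested.
    best_type, best_q = "text/html", 0.0
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0  # quality value defaults to 1 when not given
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        known = media_type in machine_types or media_type == "text/html"
        # Strict comparison: on equal quality, the first listed type wins.
        if known and q > best_q:
            best_type, best_q = media_type, q
    return "xml" if best_type in machine_types else "html"


print(choose_representation("text/html;q=0.5, application/rdf+xml;q=0.9"))
```

A browser that sends `text/html` first receives the HTML page, while a harvesting client that asks for `application/rdf+xml` receives the XML serialization of the same taxon concept.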


2020 ◽  
Vol 9 (2) ◽  
pp. e092
Author(s):  
Kazumi Tomoyose ◽  
Ana Carolina Simionato Arakaki

The availability of information on the World Wide Web facilitates its access and retrieval by users, and the knowledge and techniques of the Library and Information Science (LIS) field can be applied to this environment to help with the process. The present study is descriptive, qualitative and exploratory, based on bibliographical sources, and explores how the Classification discipline interacts with Linked Data, focusing on the analysis of Dewey Linked Data. Of the four catalogs analyzed, referred to in the literature as adhering to Dewey Linked Data, only two actually have links in their records redirecting to the system. Despite this, its presence in The Linked Open Data Cloud appears as a positive factor in its dissemination, since it boosts its visibility. It is concluded that the Classification discipline allows the thematic standardization of information resources, so that there is uniformity in the Web environment and quality retrieval of information, while promoting interoperability between data in the Linked Data context. The standardization of metadata values using classifications optimizes the representation of information and its retrieval on the Web, while also enabling the reuse of data. In addition, studies that align the area of Library and Information Science with the Semantic Web and its technologies can provide new perspectives for the area, as well as address users’ ever-changing needs, thus fulfilling the objective of the field.


2013 ◽  
Vol 25 (2) ◽  
pp. 145-150 ◽  
Author(s):  
Marilda Lopes Ginez de Lara

The aim of this study was to discuss the need for formal documentary languages as a condition for them to function in the Semantic Web. Based on a bibliographic review, Linked Open Data is presented as an initial condition for the operationalization of the Semantic Web, similar to the movement of Linked Open Vocabularies that aimed to promote interoperability among vocabularies. We highlight the Simple Knowledge Organization System format by analyzing its main characteristics and presenting the new standard ISO 25964-1/2:2011/2012, Thesauri and interoperability with other vocabularies, which revises previous recommendations, adding requirements for the interoperability and mapping of vocabularies. We discuss conceptual problems in the formalization of vocabularies and the need to invest critically in its operationalization, suggesting alternatives to harness the mapping of vocabularies.


2017 ◽  
Vol 41 (2) ◽  
pp. 252-271 ◽  
Author(s):  
Alberto Nogales ◽  
Miguel Angel Sicilia-Urban ◽  
Elena García-Barriocanal

Purpose: This paper reports on a quantitative study of data gathered from the Linked Open Vocabularies (LOV) catalogue, including the use of network analysis and metrics. The purpose of this paper is to gain insights into the structure of LOV and the use of vocabularies in the Web of Data. It is important to note that not all the vocabularies used in the Web of Data are registered in LOV. Given the de-centralised and collaborative nature of the use and adoption of these vocabularies, the results of the study can be used to identify emergent important vocabularies that are shaping the Web of Data. Design/methodology/approach: The methodology is based on an analytical approach to a data set that captures a complete snapshot of the LOV catalogue dated April 2014. An initial analysis of the data is presented in order to obtain insights into the characteristics of the vocabularies found in LOV. This is followed by an analysis of the use of Vocabulary of a Friend properties that describe relations among vocabularies. Finally, the study is complemented with an analysis of the usage of the different vocabularies, and concludes by proposing a number of metrics. Findings: The most relevant insight is that, unsurprisingly, the vocabularies with the greatest presence are those used to model Semantic Web data, such as Resource Description Framework, RDF Schema and OWL, as well as broadly used standards such as Simple Knowledge Organization System, DCTERMS and DCE. It was also discovered that the most used language is English and that the vocabularies are not considered to be highly specialised in a field. Also, there is no dominant scope among the vocabularies. Regarding the structural analysis, it is concluded that LOV is a heterogeneous network. Originality/value: The paper provides an empirical analysis of the structure of LOV and the relations between its vocabularies, together with some metrics that may be of help to determine the important vocabularies from a practical perspective. The results are of interest for a better understanding of the evolution and dynamics of the Web of Data, and for applications that attempt to retrieve data in the Linked Data Cloud. These applications can benefit from the insights into the important vocabularies to be supported and the value added when mapping between and using the vocabularies.
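
One simple network measure of how much a vocabulary is relied upon by others is its in-degree in the graph of vocabulary-to-vocabulary relations. The sketch below computes it over a toy edge list. The edges are invented for illustration and are not LOV data, and in-degree is offered only as an example of this kind of metric; the abstract does not say which metrics the paper proposes.

```python
from collections import Counter

# Toy edges (source vocabulary relies on target vocabulary); invented data.
edges = [
    ("foaf", "rdfs"), ("foaf", "owl"),
    ("skos", "rdfs"), ("skos", "owl"),
    ("dcterms", "rdfs"),
    ("voaf", "dcterms"), ("voaf", "foaf"),
]


def in_degree(edges):
    """Count how many vocabularies point at each target vocabulary."""
    return Counter(target for _, target in edges)


# Vocabularies ordered from most to least relied upon.
ranking = in_degree(edges).most_common()
for vocabulary, count in ranking:
    print(vocabulary, count)
```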


Author(s):  
Amina Meherehera ◽  
Imane Mekideche ◽  
Leila Zemmouchi-Ghomari ◽  
Abdessamed Réda Ghomari

Much of the data available on the Web, and open data in particular, comes in heterogeneous formats and is not machine-readable. One promising solution to the problems of heterogeneity and automatic interpretation is the Linked Data initiative, which aims to provide unified practices for publishing and contextually linking data on the Web by using World Wide Web Consortium standards and Semantic Web technologies. Linked data promote the Web’s transformation from a web of documents to a web of data, ensuring that machines and software agents can interpret the semantics of data correctly and therefore infer new facts and return relevant web data search results. This paper presents an automatic, generic transformation approach that converts several input formats of open web data to linked open data. This work aims to participate actively in the movement of publishing data compliant with linked data principles.
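
To make the idea of such a transformation concrete, the sketch below converts one input format, CSV, into N-Triples using only the Python standard library. It is not the approach presented in the paper, which handles several formats. The function name, the URI patterns under `base_uri`, and the choice to emit every cell as a plain string literal are assumptions made for the example.

```python
import csv
import io
from urllib.parse import quote


def csv_to_ntriples(csv_text, base_uri, id_column):
    """Turn each CSV row into N-Triples, one triple per non-empty cell."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # The row's identifier becomes the subject URI (hypothetical pattern).
        subject = f"<{base_uri}resource/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            # Skip the id column, surplus cells (key None) and empty values.
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{base_uri}property/{quote(column, safe='')}>"
            literal = (
                value.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
            )
            lines.append(f'{subject} {predicate} "{literal}" .')
    return lines


sample = "id,name,population\n1,Helsinki,650000\n2,Oslo,\n"
for line in csv_to_ntriples(sample, "http://example.org/", "id"):
    print(line)
```

A fuller pipeline would map columns to terms from existing vocabularies and add typed literals and links to other data sets; the sketch covers only the structural step from rows to triples.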


Author(s):  
Gonzalo Mochón Bezares ◽  
Eva María Méndez Rodríguez ◽  
Gema Bueno de la Fuente

This study exhaustively examines the scientific literature devoted to the processes of SKOSification of vocabularies and knowledge organization systems. It analyzes in depth 49 works that describe and detail the transformation of a total of 59 conventional controlled vocabularies or KOS (Knowledge Organization Systems) to Simple Knowledge Organization System (SKOS). The key points for analyzing methodologies for transforming vocabularies into SKOS for the web are identified, and the studies are compared to determine the most advisable approaches and parameters for carrying out these vocabulary conversion processes, which are increasingly frequent and necessary in the semantic web and in linked data (LD) environments. The results indicate that most of the transformed KOS are thesauri, that the predominant source formats are text or bibliographic records, that the most common objective in moving to SKOS is improving the interoperability of vocabularies, and that the conversion processes can be grouped into three forms: scripts written in various languages, XSL transformations and mapping languages. It is concluded that the authors consider SKOS a good option for improving the interoperability of controlled vocabularies.
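
As an illustration of the script-based form of conversion, the sketch below reads a small plain-text thesaurus and emits SKOS triples. It is a hypothetical example, not a method taken from any of the reviewed works: the input layout (a term on an unindented line, followed by indented BT, NT or RT relation lines), the base URI and the function names are all assumptions.

```python
import re

SKOS = "http://www.w3.org/2004/02/skos/core#"
# Conventional thesaurus relation codes mapped to SKOS properties.
RELATIONS = {"BT": "broader", "NT": "narrower", "RT": "related"}


def slug(term):
    """Build a URI-safe local name from a term (hypothetical convention)."""
    return re.sub(r"[^a-z0-9]+", "-", term.lower()).strip("-")


def skosify(text, base_uri):
    """Convert the assumed plain-text thesaurus layout into SKOS triples."""
    triples = []
    current = None
    for raw in text.splitlines():
        if not raw.strip():
            continue
        if not raw[0].isspace():
            # An unindented line starts a new concept.
            label = raw.strip()
            current = base_uri + slug(label)
            triples.append((current, SKOS + "prefLabel", label))
        else:
            # An indented line relates the current concept to another term.
            code, _, target = raw.strip().partition(" ")
            if current is not None and code in RELATIONS:
                triples.append(
                    (current, SKOS + RELATIONS[code], base_uri + slug(target.strip()))
                )
    return triples


thesaurus = "Mammals\n  NT Cats\nCats\n  BT Mammals\n"
for triple in skosify(thesaurus, "http://example.org/voc/"):
    print(triple)
```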

