Using SPARQL to access Linked Open Data

A Hybrid Approach Combining R*-Tree and k-d Trees to Improve Linked Open Data Query Performance

Applied Sciences ◽

10.3390/app11052405 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2405

Author(s):

Yuxiang Sun ◽

Tianyi Zhao ◽

Seulgi Yoon ◽

Yongju Lee

Keyword(s):

Flash Memory ◽

Query Language ◽

Hybrid Approach ◽

Open Data ◽

Main Memory ◽

Linked Open Data ◽

Index Structure ◽

Identification Algorithm ◽

Distributed Computing Systems ◽

Query Performance

The Semantic Web has recently gained traction with the use of Linked Open Data (LOD) on the Web. Although numerous state-of-the-art methodologies, standards, and technologies are applicable to the LOD cloud, many issues persist. Because the LOD cloud is based on graph-based Resource Description Framework (RDF) triples and the SPARQL query language, we cannot directly adopt traditional techniques employed for database management systems or distributed computing systems. This paper addresses how the LOD cloud can be efficiently organized, retrieved, and evaluated. We propose a novel hybrid approach that combines the index and live exploration approaches for improved LOD join query performance. Using a two-step index structure combining a disk-based 3D R*-tree with the extended multidimensional histogram and flash memory-based k-d trees, we can efficiently discover interlinked data distributed across multiple resources. Because this method rapidly prunes numerous false hits, the performance of join query processing is remarkably improved. We also propose a hot-cold segment identification algorithm to identify regions of high interest. The proposed method is compared with existing popular methods on real RDF datasets. Results indicate that our method outperforms the existing methods because it obtains target results quickly by reducing unnecessary data scanning and reduces the amount of main memory required to load filtering results.
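
The pruning idea mentioned in this abstract can be pictured with a small in-memory k-d tree over triples. This is only a minimal sketch and not the authors' two-step index: the R*-tree, the histogram, the flash-memory layout, and the hot-cold segment algorithm are all omitted. It assumes, hypothetically, that each triple has already been encoded as a point of three integer IDs (subject, predicate, object); the names `build`, `range_search`, and `TRIPLES` are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical encoding: (subject_id, predicate_id, object_id).
Point = Tuple[int, int, int]


@dataclass
class Node:
    point: Point
    axis: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree by median split, cycling through the three axes."""
    if not points:
        return None
    axis = depth % 3
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        point=ordered[mid],
        axis=axis,
        left=build(ordered[:mid], depth + 1),
        right=build(ordered[mid + 1:], depth + 1),
    )


def range_search(node: Optional[Node], lo: Point, hi: Point) -> List[Point]:
    """Return every point p with lo[i] <= p[i] <= hi[i] on all three axes."""
    if node is None:
        return []
    found = []
    if all(lo[i] <= node.point[i] <= hi[i] for i in range(3)):
        found.append(node.point)
    # Prune: descend only into a side that can overlap the query box.
    if lo[node.axis] <= node.point[node.axis]:
        found.extend(range_search(node.left, lo, hi))
    if hi[node.axis] >= node.point[node.axis]:
        found.extend(range_search(node.right, lo, hi))
    return found


# Made-up sample data: four encoded triples.
TRIPLES = [(1, 10, 2), (1, 11, 3), (2, 10, 3), (3, 12, 1)]
tree = build(TRIPLES)

# The triple pattern (?s, 10, ?o) becomes a box that fixes the predicate axis.
hits = sorted(range_search(tree, (0, 10, 0), (99, 10, 99)))
```

A triple pattern with unbound positions maps to a box that spans the full range on those axes, so subtrees lying entirely outside the box are never visited.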

Modern Users of Libraries and the Linked Open Data Environment

Bibliotekovedenie [Library and Information Science (Russia)] ◽

10.25281/0869-608x-2020-69-3-243-260 ◽

2020 ◽

Vol 69 (3) ◽

pp. 243-260

Author(s):

Olga A. Lavrenova ◽

Andrey A. Vinberg

Keyword(s):

Classification System ◽

Query Language ◽

Open Data ◽

Basic Research ◽

Linked Open Data ◽

Knowledge Organization ◽

Global Network ◽

Knowledge Organization System ◽

Description Framework ◽

State Library

The goal of any library is to ensure high quality and general availability of information retrieval tools. The paper describes the project implemented by the Russian State Library (RSL) to present Library Bibliographic Classification as a Networked Knowledge Organization System. The project goal is to support content and provide tools for ensuring the system's interoperability with other resources of the same nature (i.e. with Linked Data Vocabularies) in the global network environment. The project was partially supported by the Russian Foundation for Basic Research (RFBR). The RSL General Classified Catalogue (GCC) was selected as the main data source for the Classification system of knowledge organization. The meaning of each classification number is expressed by the complete string of wordings (captions), rather than the last level caption alone. Data converted to Resource Description Framework (RDF) files based on the standard set of properties defined in the Simple Knowledge Organization System (SKOS) model were loaded into the semantic storage for subsequent processing using the SPARQL query language. In order to enrich user queries for resources, the RSL has published its Classification System in the form of Linked Open Data (https://lod.rsl.ru) for searching in the RSL electronic catalogue. Work is currently underway to enable its smooth integration with other LOD vocabularies. The SKOS mapping tags are used to differentiate the types of connections between SKOS elements (concepts) existing in different concept schemes, for example, UDC, MeSH, authority data. The conceptual schemes of the leading classifications are fundamentally different from each other. Establishing correspondence between concepts is possible only on the basis of lexical and structural analysis to compute the concept similarity as a combination of attributes. The authors are looking forward to working with libraries in Russia and other countries to create a common space of Linked Open Data vocabularies.
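
The point about expressing a classification number by its complete string of captions can be illustrated with a short sketch. It assumes that each concept stores its own caption and a single broader concept, in the spirit of `skos:broader`. The notations and captions below are invented for illustration and are not actual Library Bibliographic Classification entries, and `full_caption` is a hypothetical helper name.

```python
from typing import Dict, Optional, Tuple

# Hypothetical records: notation -> (caption, broader notation or None).
CONCEPTS: Dict[str, Tuple[str, Optional[str]]] = {
    "A": ("Natural sciences", None),
    "A1": ("Physics", "A"),
    "A11": ("Optics", "A1"),
}


def full_caption(notation: str,
                 concepts: Dict[str, Tuple[str, Optional[str]]]) -> str:
    """Join the captions from the top-level concept down to `notation`."""
    chain = []
    seen = set()
    current: Optional[str] = notation
    while current is not None:
        if current in seen:
            # Guard against a malformed hierarchy that loops back on itself.
            raise ValueError(f"cycle in broader links at {current!r}")
        seen.add(current)
        caption, broader = concepts[current]
        chain.append(caption)
        current = broader
    return " / ".join(reversed(chain))


caption_chain = full_caption("A11", CONCEPTS)
```

With this sample data, the last-level caption alone would be "Optics", while the chain also carries the context of the broader classes.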

Introduction to the Principles of Linked Open Data

The Programming Historian ◽

10.46430/phen0068 ◽

2017 ◽

Author(s):

Jonathan Blaney

Keyword(s):

Query Language ◽

Open Data ◽

Linked Open Data ◽

Graph Query ◽

Graph Query Language ◽

Core Concepts

Introduces core concepts of Linked Open Data, including URIs, ontologies, RDF formats, and a gentle intro to the graph query language SPARQL.
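
To make the idea of a graph query concrete, the sketch below evaluates a list of triple patterns against a tiny in-memory set of triples, which is roughly what a SPARQL basic graph pattern does. It is a simplified illustration and not a SPARQL engine. The `ex:` identifiers and the sample data are invented, and the functions `match` and `query` are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def match(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns: List[Triple], triples: List[Triple]) -> List[Binding]:
    """Join the patterns one by one, keeping only consistent bindings."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return bindings


# Invented sample graph.
GRAPH: List[Triple] = [
    ("ex:book1", "ex:author", "ex:alice"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:book2", "ex:author", "ex:bob"),
]

# Roughly: SELECT * WHERE { ?work ex:author ?person . ?person ex:name ?name }
rows = query(
    [("?work", "ex:author", "?person"), ("?person", "ex:name", "?name")],
    GRAPH,
)
```

Because `?person` appears in both patterns, only authors that also have a name in the graph survive the join.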

Medieval manuscripts and their migrations: Using SPARQL to investigate the research potential of an aggregated Knowledge Graph

Digital Medievalist ◽

10.16995/dm.8064 ◽

2021 ◽

Author(s):

Hanno Wijsman ◽

Toby Burrows ◽

Laura Cleaver ◽

Doug Emery ◽

Eero Hyvönen ◽

...

Keyword(s):

Query Language ◽

Open Data ◽

Literacy Skills ◽

Linked Open Data ◽

Knowledge Graph ◽

Manuscript Culture ◽

The People ◽

Computer Scientists ◽

Data Environment ◽

Manuscript Description

Although the RDF query language SPARQL has a reputation for being opaque and difficult for traditional humanists to learn, it holds great potential for opening up vast amounts of Linked Open Data to researchers willing to take on its challenges. This is especially true in the field of premodern manuscript studies as more and more datasets relating to the study of manuscript culture are made available online. This paper explores the results of a two-year-long process of collaborative learning and knowledge transfer between the computer scientists and humanities researchers from the Mapping Manuscript Migrations (MMM) project to learn and apply SPARQL to the MMM dataset. The process developed into a wider investigation of the use of SPARQL to analyse the data, refine research questions, and assess the research potential of the MMM aggregated dataset and its Knowledge Graph. Through an examination of a series of six SPARQL query case studies, this paper will demonstrate how the process of learning and applying SPARQL to query the MMM dataset returned three important and unexpected results: 1) a better understanding of a complex and imperfect dataset in a Linked Open Data environment, 2) a better understanding of how manuscript descriptions and associated data about the people and institutions involved in the production, reception, and trade of premodern manuscripts need to be presented to better facilitate computational research, and 3) an awareness of the need to further develop data literacy skills among researchers in order to take full advantage of the wealth of unexplored data now available to them in the Semantic Web.

TOWARDS AN EFFICIENT RDF DATASET SLICING

International Journal of Semantic Computing ◽

10.1142/s1793351x13400151 ◽

2013 ◽

Vol 07 (04) ◽

pp. 455-477 ◽

Cited By ~ 2

Author(s):

EDGARD MARX ◽

TOMMASO SORU ◽

SAEEDEH SHEKARPOUR ◽

SÖREN AUER ◽

AXEL-CYRILLE NGONGA NGOMO ◽

...

Keyword(s):

Information Needs ◽

Query Language ◽

Open Data ◽

Linked Open Data ◽

Connected Subgraph ◽

Triple Store ◽

Subgraph Pattern ◽

Order Of Magnitude ◽

Efficient Processing ◽

Description Framework

In recent years, a considerable amount of structured data has been published on the Web as Linked Open Data (LOD). Despite recent advances, consuming and using Linked Open Data within an organization is still a substantial challenge. Many of the LOD datasets are quite large, and despite progress in Resource Description Framework (RDF) data management, loading and querying them within a triple store is extremely time-consuming and resource-demanding. To overcome this consumption obstacle, we propose a process inspired by the classical Extract-Transform-Load (ETL) paradigm. In this article, we focus particularly on the selection and extraction steps of this process. We devise a fragment of SPARQL Protocol and RDF Query Language (SPARQL) dubbed SliceSPARQL, which enables the selection of well-defined slices of datasets fulfilling typical information needs. SliceSPARQL supports graph patterns for which each connected subgraph pattern involves a maximum of one variable or Internationalized Resource Identifier (IRI) in its join conditions. This restriction guarantees the efficient processing of the query against a sequential dataset dump stream. Furthermore, we evaluate our slicing approach with three different optimization strategies. Results show that dataset slices can be generated an order of magnitude faster than by using the conventional approach of loading the whole dataset into a triple store.
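
The general idea of selecting a slice from a sequential dump without loading a triple store can be sketched as a single-pass filter. This is not SliceSPARQL itself and does not reproduce its join handling or optimization strategies. It assumes a simplified N-Triples-like input with one `subject predicate object .` statement per line and no handling of escapes. The function names and the `example.org` data are hypothetical.

```python
import io
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def parse_line(line: str) -> Optional[Triple]:
    """Parse one simplified N-Triples line; return None if it is not a triple."""
    line = line.strip()
    if not line or line.startswith("#") or not line.endswith("."):
        return None
    # Split into subject, predicate, and the rest (objects may contain spaces).
    parts = line[:-1].strip().split(None, 2)
    if len(parts) != 3:
        return None
    return parts[0], parts[1], parts[2]


def slice_stream(lines: Iterable[str],
                 predicate: Optional[str] = None,
                 obj: Optional[str] = None) -> Iterator[Triple]:
    """Yield matching triples in one pass, holding only one line in memory."""
    for line in lines:
        triple = parse_line(line)
        if triple is None:
            continue
        _, p, o = triple
        if predicate is not None and p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        yield triple


# Invented sample dump, read here from memory instead of a file.
DUMP = io.StringIO(
    '<http://example.org/a> ' + RDF_TYPE + ' <http://example.org/City> .\n'
    '<http://example.org/a> <http://www.w3.org/2000/01/rdf-schema#label> "Town A"@en .\n'
    '<http://example.org/b> ' + RDF_TYPE + ' <http://example.org/Person> .\n'
)

cities = list(slice_stream(DUMP, predicate=RDF_TYPE, obj="<http://example.org/City>"))
```

Because the filter is a generator over lines, the same function could be applied to an open file handle for a large dump, which is the property that makes stream-based slicing attractive.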

TOWARD A LINKED OPEN DATA REPOSITORY ABOUT VIETNAMESE TOURISM

KỶ YẾU HỘI NGHỊ KHOA HỌC CÔNG NGHỆ QUỐC GIA LẦN THỨ XI NGHIÊN CỨU CƠ BẢN VÀ ỨNG DỤNG CÔNG NGHỆ THÔNG TIN ◽

10.15625/vap.2018.00067 ◽

2018 ◽

Author(s):

Le Anh Tien ◽

Cao Tuan Dung

Keyword(s):

Open Data ◽

Linked Open Data ◽

Data Repository

Opportunités et défis. Linked (Open) Data

Dialogues avec la machine - Arabesques ◽

10.35562/arabesques.1397 ◽

2019 ◽

pp. 4-6

Author(s):

Makx Dekkers

Keyword(s):

Open Data ◽

Linked Open Data

Europeana no Linked Open Data: conceitos de Web Semântica na dimensão aplicada das humanidades digitais

Pesquisa Brasileira em Ciência da Informação e Biblioteconomia ◽

10.22478/ufpb.1981-0695.2017v12n2.36529 ◽

2017 ◽

Vol 12 (2) ◽

Author(s):

Caio Saraiva Coneglian ◽

José Eduardo Santarem Segundo

Keyword(s):

Linked Data ◽

Open Data ◽

Linked Open Data

The emergence of new technologies has introduced means of disseminating and providing information more efficiently. An initiative called Europeana has been promoting this adaptation of information objects within the Web, and more specifically within Linked Data. Accordingly, this study aims to present a discussion of the relationship between the Digital Humanities and Linked Open Data, as embodied by Europeana. To this end, we use an exploratory methodology that investigates questions related to Europeana's data model, EDM, by means of SPARQL. As results, we come to understand the characteristics of EDM through the use of SPARQL. We also identify the importance of the concept of Digital Humanities within the context of Europeana. Keywords: Semantic web. Linked open data. Digital humanities. Europeana. EDM. Link: https://periodicos.ufsc.br/index.php/eb/article/view/1518-2924.2017v22n48p88/33031
