Algebraic Operations on Spatiotemporal Data Based on RDF

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi9020080 ◽

2020 ◽

Vol 9 (2) ◽

pp. 80

Author(s):

Lin Zhu ◽

Nan Li ◽

Luyi Bai

Keyword(s):

Algebraic Approach ◽

Spatiotemporal Data ◽

Graph Algebras ◽

Graph Pattern ◽

Data Querying ◽

Filter Operation ◽

Algebraic Operations ◽

Description Framework ◽

RDF Graphs ◽

Filter Rules

In the context of the Semantic Web, the Resource Description Framework (RDF), a language proposed by W3C, has been used for conceptual description, data modeling, and data querying. The algebraic approach has been proven to be an effective way to process queries, and algebraic operations in RDF have been investigated extensively. However, the study of spatiotemporal RDF algebra has just started and still needs further attention. This paper aims to explore an algebraic operational framework to represent the content of spatiotemporal data and support RDF graphs. To accomplish our study, we defined a spatiotemporal data model based on RDF. On this basis, the spatiotemporal semantics and the spatiotemporal algebraic operations were investigated. We defined five types of graph algebras, and, in particular, the filter operation can filter the spatiotemporal graphs using a graph pattern. Besides this, we put forward a spatiotemporal RDF syntax specification to help users browse, query, and reason with spatiotemporal RDF graphs. The syntax specification illustrates the filter rules, which contribute to capturing the spatiotemporal RDF semantics and provide a number of advanced functions for building data queries.
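
The filter operation described above can be illustrated with a minimal sketch. The abstract does not give the paper's data model or operator definitions, so everything below is an assumption made for illustration: the `STTriple` layout (a point location plus a closed validity interval), the wildcard-style graph pattern, and the helper names `matches`, `in_box`, `overlaps`, and `st_filter` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class STTriple:
    """A triple with a location and a validity interval (hypothetical layout)."""
    subject: str
    predicate: str
    obj: str
    location: Tuple[float, float]  # (x, y) point; an assumption
    valid: Tuple[int, int]         # closed interval [start, end]; an assumption


def matches(triple, pattern):
    """A pattern is (s, p, o); None acts as a variable that matches anything."""
    parts = (triple.subject, triple.predicate, triple.obj)
    return all(want is None or want == got for want, got in zip(pattern, parts))


def in_box(point, box):
    """True if the point lies inside the box (xmin, ymin, xmax, ymax)."""
    x, y = point
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax


def overlaps(a, b):
    """True if two closed intervals share at least one instant."""
    return a[0] <= b[1] and b[0] <= a[1]


def st_filter(graph, pattern, box=None, interval=None):
    """Keep triples that match the pattern and the optional space/time limits."""
    return [
        t for t in graph
        if matches(t, pattern)
        and (box is None or in_box(t.location, box))
        and (interval is None or overlaps(t.valid, interval))
    ]


graph = [
    STTriple("ex:typhoon1", "ex:locatedIn", "ex:regionA", (120.0, 23.5), (2015, 2016)),
    STTriple("ex:typhoon1", "ex:locatedIn", "ex:regionB", (125.0, 30.0), (2017, 2018)),
    STTriple("ex:typhoon2", "ex:locatedIn", "ex:regionA", (121.0, 24.0), (2019, 2020)),
]

result = st_filter(
    graph,
    (None, "ex:locatedIn", "ex:regionA"),
    box=(119.0, 22.0, 122.0, 25.0),
    interval=(2016, 2018),
)
print(result)
```

Here only the first triple survives: the second fails the graph pattern and the third falls outside the time interval.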

AGGREGATE QUERY PROCESSING FOR SEMANTIC WEB DATABASES: AN ALGEBRAIC APPROACH

International Journal of Semantic Computing ◽

10.1142/s1793351x07000226 ◽

2007 ◽

Vol 01 (04) ◽

pp. 479-495

Author(s):

DAWIT SEID ◽

SHARAD MEHROTRA

Keyword(s):

Query Language ◽

Algebraic Approach ◽

Query Languages ◽

Graph Pattern Matching ◽

Graph Pattern ◽

Aggregate Queries ◽

Aggregate Query ◽

RDF Graphs ◽

Resource Description

As a growing number of applications represent data as semantic graphs, such as RDF (Resource Description Framework) and the many entity-attribute-value formats, query languages for such data are required to support operations beyond graph pattern matching and inference queries. Specifically, the ability to express aggregate queries is an important feature that is either lacking or implemented with little attention to the peculiarities of the data model. In this paper, we study the meaning and implementation of grouping and aggregate queries over RDF graphs. We first define grouping and aggregate operators algebraically and then show how the SPARQL query language can be extended to express grouping and aggregate queries.
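
As a rough illustration of grouping and aggregation over triples, the sketch below groups subjects by the value of one property and aggregates the values of another. It is not the operator definition from the paper, which the abstract does not reproduce; the sample data, the `ex:` names, and the functions `objects_by_subject`, `group_aggregate`, and `mean` are hypothetical. It does show two peculiarities of the data model: a property may be multi-valued, and it may be missing altogether.

```python
from collections import defaultdict

# Hypothetical sample graph as (subject, predicate, object) tuples.
triples = [
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
    ("ex:carol", "ex:worksFor", "ex:globex"),
    ("ex:dave", "ex:worksFor", "ex:initech"),  # dave has no salary triple
    ("ex:alice", "ex:salary", 50),
    ("ex:bob", "ex:salary", 70),
    ("ex:carol", "ex:salary", 60),
]


def objects_by_subject(triples, predicate):
    """Map each subject to the list of objects it has for one predicate."""
    out = defaultdict(list)
    for s, p, o in triples:
        if p == predicate:
            out[s].append(o)
    return out


def mean(values):
    return sum(values) / len(values)


def group_aggregate(triples, group_pred, value_pred, agg):
    """Group subjects by group_pred values and aggregate their value_pred values."""
    groups = objects_by_subject(triples, group_pred)
    values = objects_by_subject(triples, value_pred)
    buckets = defaultdict(list)
    for subject, keys in groups.items():
        for key in keys:  # a subject may fall into several groups
            buckets[key].extend(values.get(subject, []))
    # Groups without any value are dropped rather than aggregated over nothing.
    return {key: agg(vals) for key, vals in buckets.items() if vals}


result = group_aggregate(triples, "ex:worksFor", "ex:salary", mean)
print(result)
```

Dropping empty groups is one design choice among several; an implementation could equally report them with an unbound aggregate value.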

Provenance Description of Metadata Vocabularies for the Long-term Maintenance of Metadata

Journal of Data and Information Science ◽

10.1515/jdis-2017-0007 ◽

2017 ◽

Vol 2 (2) ◽

pp. 41-55 ◽

Cited By ~ 2

Author(s):

Chunqiu Li ◽

Shigeo Sugimoto

Keyword(s):

Formal Scheme ◽

Proposed Model ◽

Crucial Information ◽

Description Framework ◽

RDF Graphs ◽

Resource Description ◽

Term Maintenance ◽

Practical Implications

Purpose: The purpose of this paper is to discuss the provenance description of metadata terms and of metadata vocabularies as sets of metadata terms. Provenance is crucial information for keeping track of changes to metadata terms and metadata vocabularies so that they can be maintained consistently.

Design/methodology/approach: The W3C PROV standard for general provenance description and the Resource Description Framework (RDF) are adopted as the base models to formally define provenance description for metadata vocabularies.

Findings: This paper defines a few primitive change types of metadata terms, and a provenance description model of the metadata terms based on the primitive change types. We also provide examples of provenance description in RDF graphs to show the proposed model.

Research limitations: The model proposed in this paper is defined based on a few primitive relationships (e.g. addition, deletion, and replacement) between the pre-version and post-version of a metadata term. The model is simplified, and practical changes to metadata terms can be more complicated than the primitive relationships discussed in the model.

Practical implications: Formal provenance description of metadata vocabularies can improve the maintainability of metadata vocabularies over time. Conventional maintenance of metadata terms is the maintenance of documents describing the terms. The proposed model enables effective and automated tracking of the change history of metadata vocabularies using a simple formal description scheme defined on the basis of widely used standards.

Originality/value: Changes in metadata vocabularies may cause inconsistencies in the long-term use of metadata. This paper proposes a simple and formal scheme of provenance description of metadata vocabularies. The proposed model works as the basis of automated maintenance of metadata terms and their vocabularies and is applicable to various types of changes.
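
The three primitive relationships named in the abstract (addition, deletion, and replacement) can be sketched as a comparison of two vocabulary versions. This is only an illustration under stated assumptions: the function `primitive_changes`, the `replaced_by` mapping, and the `ex:` term names are hypothetical, and the output is a plain list of tuples rather than the PROV-based RDF description the paper defines.

```python
def primitive_changes(old_terms, new_terms, replaced_by=None):
    """Classify differences between two versions of a vocabulary.

    old_terms and new_terms are sets of term identifiers.
    replaced_by maps an old term to the new term that supersedes it;
    a curator would have to supply it, since a set comparison alone
    cannot tell a replacement from an unrelated deletion plus addition.
    """
    replaced_by = replaced_by or {}
    changes = []
    for old, new in sorted(replaced_by.items()):
        changes.append(("replacement", old, new))
    replaced_old = set(replaced_by)
    replaced_new = set(replaced_by.values())
    for term in sorted(old_terms - new_terms - replaced_old):
        changes.append(("deletion", term, None))
    for term in sorted(new_terms - old_terms - replaced_new):
        changes.append(("addition", None, term))
    return changes


version_1 = {"ex:creator", "ex:title", "ex:date"}
version_2 = {"ex:author", "ex:title", "ex:subject"}

changes = primitive_changes(
    version_1, version_2, replaced_by={"ex:creator": "ex:author"}
)
for change in changes:
    print(change)
```

Each tuple could then be turned into a provenance statement about the affected term; that mapping is left out here because the abstract does not specify it.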

Dynamic Partitioning Supporting Load Balancing for Distributed RDF Graph Stores

Symmetry ◽

10.3390/sym11070926 ◽

2019 ◽

Vol 11 (7) ◽

pp. 926

Author(s):

Kyoungsoo Bok ◽

Junwon Kim ◽

Jaesoo Yoo

Keyword(s):

Load Balancing ◽

Distributed Processing ◽

Data Partitioning ◽

RDF Graph ◽

Dynamic Partitioning ◽

Usage Frequency ◽

Partitioning Methods ◽

Description Framework ◽

RDF Graphs ◽

Distributed Server

Various Resource Description Framework (RDF) partitioning methods have been studied for the efficient distributed processing of a large RDF graph. The RDF graph has symmetrical characteristics because the subject and object can be used interchangeably if the predicate is changed. This paper proposes a dynamic partitioning method for RDF graphs that supports load balancing in distributed environments where data insertions and changes continue to occur. The proposed method generates clusters and subclusters, using the usage frequency of the RDF subgraphs accessed by queries as the criterion for graph partitioning. It creates a cluster by grouping RDF subgraphs with higher usage frequency and a subcluster from those with lower usage frequency. Load balancing of these clusters and subclusters uses the mean query frequency of each distributed server, and graph data partitioning considers the size of the data stored on each distributed server. The method also minimizes the number of edge-cuts connected to clusters and subclusters to reduce communication costs between servers. This solves the problem of data concentrating on specific servers as data are changed and added, and allows efficient load balancing among servers. The performance results show that the proposed method significantly outperforms existing partitioning methods in terms of query processing time on distributed servers.
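
The idea of separating subgraphs by usage frequency and spreading query load across servers can be sketched as follows. This is not the authors' algorithm: it substitutes a simple mean-frequency split and a greedy least-loaded placement, and it ignores edge-cuts and stored data size, both of which the abstract says the real method considers. The names `split_by_usage` and `assign_greedy` and the frequency figures are hypothetical.

```python
def split_by_usage(freq):
    """Split subgraph ids into higher- and lower-frequency groups around the mean."""
    mean = sum(freq.values()) / len(freq)
    hot = sorted(g for g, f in freq.items() if f >= mean)
    cold = sorted(g for g, f in freq.items() if f < mean)
    return hot, cold


def assign_greedy(freq, n_servers):
    """Place each subgraph on the server with the lowest accumulated frequency.

    Subgraphs are visited from most to least frequently used, which is a
    common greedy heuristic for balancing load (an assumption, not the
    method from the paper).
    """
    loads = [0] * n_servers
    placement = {}
    for g in sorted(freq, key=lambda k: (-freq[k], k)):
        target = loads.index(min(loads))
        placement[g] = target
        loads[target] += freq[g]
    return placement, loads


# Hypothetical query counts per RDF subgraph.
freq = {"g1": 90, "g2": 10, "g3": 50, "g4": 30, "g5": 20}

hot, cold = split_by_usage(freq)
placement, loads = assign_greedy(freq, n_servers=2)
print(hot, cold)
print(placement, loads)
```

With these figures the mean frequency is 40, so `g1` and `g3` form the higher-frequency group, and the greedy placement ends with both servers carrying a load of 100.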

Natural Language Generation from Graphs

International Journal of Semantic Computing ◽

10.1142/s1793351x14500068 ◽

2014 ◽

Vol 08 (03) ◽

pp. 335-384 ◽

Cited By ~ 2

Author(s):

Ngan T. Dong ◽

Lawrence B. Holder

Keyword(s):

Semantic Web ◽

Natural Language ◽

Web Search ◽

Open Data ◽

Natural Language Generation ◽

Language Generation ◽

Expert User ◽

Computer Representation ◽

Description Framework ◽

RDF Graphs

The Resource Description Framework (RDF) is the primary language to describe information on the Semantic Web. The deployment of semantic web search by Google and Microsoft, the Linked Open Data community project, and the announcement of schema.org by Yahoo, Bing and Google have significantly fostered the generation of data available in RDF format. Yet RDF is a computer representation of data and is thus hard for non-expert users to understand. We propose a Natural Language Generation (NLG) engine to generate English text from a small RDF graph. The Natural Language Generation from Graphs (NLGG) system uses an ontology skeleton, which contains hierarchies of concepts, relationships and attributes, along with handcrafted template information as the knowledge base. We performed two experiments to evaluate NLGG. First, NLGG is tested with RDF graphs extracted from four ontologies in different domains. A Simple Verbalizer is used to compare the results. NLGG consistently outperforms the Simple Verbalizer in all the test cases. In the second experiment, we compare the effort spent to make NLGG and NaturalOWL work with the M-PIRO ontology. Results show that NLGG generates acceptable text with much less effort.
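
Template-based generation from triples can be shown with a very small sketch. It does not reproduce NLGG or its ontology skeleton; the `templates` table, the `label` and `verbalize` helpers, and the `ex:` names are hypothetical. Predicates with a handcrafted template produce a fluent sentence, while the fallback simply strings the three labels together.

```python
# Hypothetical handcrafted templates keyed by predicate.
templates = {
    "ex:bornIn": "{s} was born in {o}.",
    "ex:worksFor": "{s} works for {o}.",
}


def label(name):
    """Turn a prefixed name such as 'ex:Marie_Curie' into 'Marie Curie'."""
    return name.split(":")[-1].replace("_", " ")


def verbalize(triple, templates):
    """Render one (subject, predicate, object) triple as an English sentence."""
    s, p, o = triple
    if p in templates:
        return templates[p].format(s=label(s), o=label(o))
    # Fallback without a template: concatenate the labels.
    return f"{label(s)} {label(p)} {label(o)}."


sentence = verbalize(("ex:Marie_Curie", "ex:bornIn", "ex:Warsaw"), templates)
fallback = verbalize(("ex:Marie_Curie", "ex:field", "ex:Physics"), templates)
print(sentence)
print(fallback)
```

The contrast between the two outputs hints at why handcrafted template information matters for readability.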

A Novel Adaptive Cuckoo Search for Optimal Query Plan Generation

The Scientific World Journal ◽

10.1155/2014/727658 ◽

2014 ◽

Vol 2014 ◽

pp. 1-7 ◽

Cited By ~ 5

Author(s):

Ramalingam Gomathi ◽

Dhandapani Sharmila

Keyword(s):

Semantic Web ◽

Query Optimization ◽

Execution Time ◽

Cuckoo Search ◽

Optimization Methods ◽

Web Data ◽

Query Plan ◽

Plan Generation ◽

Description Framework ◽

RDF Graphs

The daily emergence of new web pages has driven the development of semantic web technology. The Resource Description Framework (RDF) is a World Wide Web Consortium (W3C) standard for storing semantic web data. To reduce the execution time of queries over large RDF graphs, evolving metaheuristic algorithms have become an alternative to traditional query optimization methods. This paper focuses on the problem of query optimization for semantic web data. An efficient algorithm called adaptive Cuckoo search (ACS) for querying and generating optimal query plans for large RDF graphs is designed in this research. Experiments were conducted on different datasets with varying numbers of predicates. The experimental results show that the proposed approach provides significant results in terms of query execution time. The extent to which the algorithm is efficient is tested and the results are documented.

Analysis of RDF Syntaxes for Semantic Web Development

Applied Computer Systems ◽

10.1515/acss-2015-0017 ◽

2015 ◽

Vol 18 (1) ◽

pp. 33-42 ◽

Cited By ~ 1

Author(s):

Yevgeny Gryaznov ◽

Pavel Rusakov

Keyword(s):

Semantic Web ◽

Directed Graph ◽

Resource Description Framework ◽

Formal Model ◽

Information Representation ◽

Web Development ◽

RDF Data ◽

Description Framework ◽

RDF Graphs ◽

Resource Description

In this paper, the authors investigate the use of RDF (Resource Description Framework) syntaxes for information representation in the Semantic Web. The paper describes why pure XML cannot be used effectively for this purpose and how RDF solves this problem. Information is represented in the form of a directed graph. RDF is only an abstract formal model for information representation, and additional tools are required to write that information down. Such tools are RDF syntaxes: concrete text or binary formats that prescribe rules for RDF data serialization. Text-based RDF syntaxes can be built on an existing format (XML, JSON) or can be RDF-specific, designed from scratch for the sole purpose of serializing RDF graphs. The authors briefly describe some of the RDF syntaxes (both XML and non-XML) and compare them to identify the strengths and weaknesses of each. Serialization and deserialization speed tests are run using the Jena library. The results of both the analytical and experimental parts of this research are used to develop recommendations for RDF syntax usage and to design an RDF/XML syntax subset intended to simplify development and improve the compatibility of information serialized with this syntax.
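
To make the distinction between the abstract graph model and a concrete syntax tangible, the sketch below writes a triple in an N-Triples style, one of the simplest RDF-specific text formats. It is a hand-rolled illustration in Python, not the Jena library used in the paper, and it covers only IRIs and plain string literals (no blank nodes, datatypes, or language tags). The tagged-tuple representation and the function names `escape_literal`, `term`, and `to_ntriples` are assumptions.

```python
def escape_literal(text):
    """Escape the characters that may not appear raw inside a quoted literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def term(value):
    """Serialize one term given as ('iri', text) or ('lit', text)."""
    kind, text = value
    if kind == "iri":
        return f"<{text}>"
    if kind == "lit":
        return '"' + escape_literal(text) + '"'
    raise ValueError(f"unsupported term kind: {kind}")


def to_ntriples(triples):
    """Write each triple on its own line, terminated by ' .'."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


triples = [
    (
        ("iri", "http://example.org/book1"),
        ("iri", "http://purl.org/dc/terms/title"),
        ("lit", 'RDF "in" brief'),
    ),
]

output = to_ntriples(triples)
print(output)
```

The same in-memory triple could be written in RDF/XML or a JSON-based syntax without changing the graph it denotes, which is the point the abstract makes about syntaxes being separate from the model.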

Towards Interactive Analytics over RDF Graphs

Algorithms ◽

10.3390/a14020034 ◽

2021 ◽

Vol 14 (2) ◽

pp. 34 ◽

Cited By ~ 1

Author(s):

Maria-Evangelia Papadaki ◽

Nicolas Spyratos ◽

Yannis Tzitzikas

Keyword(s):

Query Language ◽

Proposed Model ◽

Aggregate Queries ◽

RDF Data ◽

Description Framework ◽

RDF Graphs ◽

High Level ◽

Resource Description ◽

Functional Algebra ◽

Continuous Accumulation

The continuous accumulation of multi-dimensional data and the development of Semantic Web and Linked Data published in the Resource Description Framework (RDF) bring new requirements for data analytics tools. Such tools should take into account the special features of RDF graphs, exploit the semantics of RDF and support flexible aggregate queries. In this paper, we present an approach for applying analytics to RDF data based on a high-level functional query language, called HIFUN. According to that language, each analytical query is considered to be a well-formed expression of a functional algebra and its definition is independent of the nature and structure of the data. We investigate how HIFUN can be used to ease the formulation of analytic queries over RDF data. We detail the applicability of HIFUN over RDF, as well as the transformations of data that may be required; we introduce the translation rules of HIFUN queries to SPARQL; and we describe a first implementation of the proposed model.
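
The abstract does not list the translation rules, so the following is only a sketch of what translating a high-level analytic query into SPARQL might look like. It assumes a query can be described by a grouping property, a measured property, and an aggregate operation; the function name `analytic_query_to_sparql`, the variable names, and the example IRIs are hypothetical. The generated text uses standard SPARQL 1.1 aggregation (`GROUP BY` with an aggregate in the `SELECT` clause).

```python
ALLOWED_OPERATIONS = {"SUM", "AVG", "MIN", "MAX", "COUNT"}


def analytic_query_to_sparql(grouping, measuring, operation):
    """Build a SPARQL aggregate query from (grouping, measuring, operation).

    grouping and measuring are property IRIs; operation is an aggregate name.
    This covers only direct properties of the same item, which is a
    simplifying assumption.
    """
    op = operation.upper()
    if op not in ALLOWED_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    return (
        f"SELECT ?g ({op}(?m) AS ?result)\n"
        "WHERE {\n"
        f"  ?item <{grouping}> ?g .\n"
        f"  ?item <{measuring}> ?m .\n"
        "}\n"
        "GROUP BY ?g"
    )


query = analytic_query_to_sparql(
    "http://example.org/branch",
    "http://example.org/quantity",
    "sum",
)
print(query)
```

Keeping the query description independent of the data layout, and generating the SPARQL text from it, mirrors the separation the abstract describes between the functional expression and the underlying data.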

An integration approach of multi-source heterogeneous fuzzy spatiotemporal data based on RDF

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201258 ◽

2021 ◽

Vol 40 (1) ◽

pp. 1065-1082

Author(s):

Luyi Bai ◽

Nan Li ◽

Huilei Bai

Keyword(s):

World Wide ◽

Integration Method ◽

Graph Model ◽

Spatiotemporal Data ◽

RDF Graph ◽

Integration Approach ◽

Description Framework ◽

Resource Description ◽

Significant Superiority

With the growing importance of fuzzy spatiotemporal data in information applications, there is an increasing need for research on integration methods for multi-source heterogeneous fuzzy spatiotemporal data. In this paper, we first propose a fuzzy spatiotemporal RDF graph model based on RDF (Resource Description Framework), which was proposed by the World Wide Web Consortium (W3C) to represent data in triples (subject, predicate, object). Second, we analyze and classify the heterogeneity problems of multi-source heterogeneous fuzzy spatiotemporal data, and use the fuzzy spatiotemporal RDF graph model to define corresponding rules to solve these problems. In addition, based on the characteristics of RDF triples, we analyze the heterogeneity problem of multi-source heterogeneous fuzzy spatiotemporal data integration in RDF triples, and provide the integration method FRDFG. Finally, we report our experimental results to validate our approach and show its significant superiority.

RDF graph mining for cluster-based theme identification

International Journal of Web Information Systems ◽

10.1108/ijwis-10-2019-0048 ◽

2020 ◽

Vol 16 (2) ◽

pp. 223-247

Author(s):

Siham Eddamiri ◽

Asmaa Benghabrit ◽

Elmoukhtar Zemmouri

Keyword(s):

Graph Mining ◽

Language Models ◽

Data Sets ◽

Content Type ◽

Feature Vectors ◽

RDF Graph ◽

Data Process ◽

Description Framework ◽

Theme Identification ◽

RDF Graphs

Purpose: The purpose of this paper is to present a generic pipeline for Resource Description Framework (RDF) graph mining to provide a comprehensive review of each step in the knowledge discovery from data process. The authors also investigate different approaches and combinations to extract feature vectors from RDF graphs to apply the clustering and theme identification tasks.

Design/methodology/approach: The proposed methodology comprises four steps. First, the authors generate several graph substructures (Walks, Set of Walks, Walks with backward and Set of Walks with backward). Second, the authors build neural language models to extract numerical vectors of the generated sequences by using word embedding techniques (Word2Vec and Doc2Vec) combined with term frequency-inverse document frequency (TF-IDF). Third, the authors use the well-known K-means algorithm to cluster the RDF graph. Finally, the authors extract the most relevant rdf:type from the grouped vertices to describe the semantics of each theme by generating the labels.

Findings: The experimental evaluation on state-of-the-art data sets (AIFB, BGS and Conference) shows that the combination of Set of Walks with backward with the TF-IDF and Doc2Vec techniques gives excellent results. In fact, the clustering results reach more than 97% and 90% in terms of purity and F-measure, respectively. Concerning theme identification, the results show that by using the same combination, the purity and F-measure criteria reach more than 90% for all the considered data sets.

Originality/value: The originality of this paper lies in two aspects: first, a new machine learning pipeline for RDF data is presented; second, an efficient process to identify and extract relevant graph substructures from an RDF graph is proposed. The proposed techniques were combined with different neural language models to improve the accuracy and relevance of the obtained feature vectors that will be fed to the clustering mechanism.
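
The first and last steps of the pipeline (generating walks and labelling a group of vertices by its most frequent rdf:type) can be sketched with the standard library alone. The embedding and K-means steps are left out because they need external libraries. The sketch assumes plain forward walks of bounded length, which may differ from the paper's substructure definitions; the function names `walks` and `theme_label` and the sample triples are hypothetical.

```python
from collections import Counter, defaultdict


def walks(triples, start, depth):
    """Enumerate forward walks from start that follow at most `depth` edges."""
    out_edges = defaultdict(list)
    for s, p, o in triples:
        out_edges[s].append((p, o))

    results = []

    def extend(path, node, remaining):
        # Stop when the edge budget is used up or the node has no out-edges.
        if remaining == 0 or not out_edges.get(node):
            results.append(tuple(path))
            return
        for p, o in out_edges[node]:
            extend(path + [p, o], o, remaining - 1)

    extend([start], start, depth)
    return results


def theme_label(cluster, triples):
    """Return the most frequent rdf:type among the vertices of a cluster."""
    types = Counter(
        o for s, p, o in triples if p == "rdf:type" and s in cluster
    )
    return types.most_common(1)[0][0] if types else None


triples = [
    ("ex:a", "ex:knows", "ex:b"),
    ("ex:b", "ex:worksAt", "ex:org"),
    ("ex:a", "rdf:type", "ex:Person"),
    ("ex:b", "rdf:type", "ex:Person"),
    ("ex:org", "rdf:type", "ex:Organization"),
]

found = walks(triples, "ex:a", depth=2)
label = theme_label({"ex:a", "ex:b"}, triples)
print(found)
print(label)
```

Each walk is a sequence of alternating vertices and predicates, which is the kind of token sequence that word embedding models can consume in the second step.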

A Constraint Framework for Uncertain Spatiotemporal Data in RDF Graphs

Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-030-32456-8_79 ◽

2019 ◽

pp. 727-735

Author(s):

Jinyao Wang ◽

Xiaofeng Di ◽

Jiemin Liu ◽

Luyi Bai

Keyword(s):

Spatiotemporal Data ◽

RDF Graphs
