Link maintenance for integrity in linked open data evolution: Literature survey and open challenges

Semantic Web ◽

10.3233/sw-200398 ◽

2020 ◽

pp. 1-25

Author(s):

Andre Gomes Regino ◽

Julio Cesar dos Reis ◽

Rodrigo Bonacin ◽

Ahsan Morshed ◽

Timos Sellis

Keyword(s):

Semantic Web ◽

Linked Data ◽

Open Data ◽

Linked Open Data ◽

Evolutionary Characteristic ◽

Rdf Data ◽

Link Maintenance ◽

Data Elements ◽

Data Evolution ◽

Over Time

RDF data has been extensively deployed to describe various types of resources in a structured way. Links between data elements described by RDF models form the core of the Semantic Web. The rising amount of structured data published in public RDF repositories, also known as Linked Open Data, demonstrates the success of the global and unified dataset proposed by the vision of the Semantic Web. Nowadays, semi-automatic algorithms build connections among these datasets by exploring a variety of methods. Interconnected open data demands automatic methods and tools to maintain its consistency over time. Updating linked data is considered a key process because such structured datasets evolve continually. However, data-changing operations can affect well-formed links, which makes it difficult to keep connections consistent over time. In this article, we present a thorough survey that provides a systematic review of the state of the art in link maintenance in the linked open data evolution scenario. We conduct a detailed analysis of the literature to characterise and understand the methods and algorithms responsible for detecting, fixing and updating links between RDF data. Our investigation provides a categorisation of existing approaches and describes and discusses existing studies. The results reveal an absence of comprehensive solutions suited to fully detect, warn about and automatically maintain the consistency of linked data over time.

Improving Open Science Using Linked Open Data: CONICET Digital Use Case

Journal of Computer Science and Technology ◽

10.24215/16666038.19.e05 ◽

2019 ◽

Vol 19 (01) ◽

pp. e05

Author(s):

Marcos Daniel Zarate ◽

Carlos Buckle ◽

Renato Mazzanti ◽

Gustavo Samec

Keyword(s):

Semantic Web ◽

Linked Data ◽

Open Data ◽

Scientific Publication ◽

Open Science ◽

Linked Open Data ◽

Scientific Publications ◽

Semantic Web Technologies ◽

Web Standards ◽

Methodological Guidelines

Scientific publication services are changing drastically: researchers demand intelligent search services to discover and relate scientific publications. Publishers need to incorporate semantic information to better organize their digital assets and make publications more discoverable. In this paper, we present the ongoing work to publish a subset of the scientific publications of CONICET Digital as Linked Open Data. The objective of this work is to improve the retrieval and reuse of data through Semantic Web technologies and Linked Data in the domain of scientific publications. To achieve these goals, Semantic Web standards and reference RDF schemas have been taken into account (Dublin Core, FOAF, VoID, etc.). The conversion and publication process is guided by the methodological guidelines for publishing government linked data. We also outline how these data can be linked to other datasets on the web of data, namely DBLP, Wikidata and DBpedia. Finally, we show some example queries that answer questions CONICET Digital does not currently support.

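Europeana in Linked Open Data: Semantic Web concepts in the applied dimension of the digital humanities (original title: Europeana no Linked Open Data: conceitos de Web Semântica na dimensão aplicada das humanidades digitais)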

Pesquisa Brasileira em Ciência da Informação e Biblioteconomia ◽

10.22478/ufpb.1981-0695.2017v12n2.36529 ◽

2017 ◽

Vol 12 (2) ◽

Author(s):

Caio Saraiva Coneglian ◽

José Eduardo Santarem Segundo

Keyword(s):

Linked Data ◽

Open Data ◽

Linked Open Data

The emergence of new technologies has introduced more efficient means of disseminating information and making it available. One initiative, called Europeana, has been promoting this adaptation of information objects within the Web, and more specifically within Linked Data. This study therefore aims to discuss the relationship between the Digital Humanities and Linked Open Data, as embodied by Europeana. To this end, we use an exploratory methodology that investigates questions related to the Europeana data model, EDM, by means of SPARQL. As a result, we come to understand the characteristics of the EDM through the use of SPARQL. We also identify the importance of the concept of Digital Humanities within the context of Europeana.

Keywords: Semantic Web. Linked open data. Digital humanities. Europeana. EDM.

Link: https://periodicos.ufsc.br/index.php/eb/article/view/1518-2924.2017v22n48p88/33031

The read–write Linked Data Web

Philosophical Transactions of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rsta.2012.0513 ◽

2013 ◽

Vol 371 (1987) ◽

pp. 20120513 ◽

Cited By ~ 15

Author(s):

Tim Berners-Lee ◽

Kieron O’Hara

Keyword(s):

Future Development ◽

Linked Data ◽

Open Data ◽

Linked Open Data ◽

The Future ◽

The Web

This paper discusses issues that will affect the future development of the Web, either increasing its power and utility, or alternatively suppressing its development. It argues for the importance of the continued development of the Linked Data Web, and describes the use of linked open data as an important component of that. Second, the paper defends the Web as a read–write medium, and goes on to consider how the read–write Linked Data Web could be achieved.

A metadata model for mixed collections (original title: Ein Metadatenmodell für gemischte Sammlungen)

Bibliotheksdienst ◽

10.1515/bd-2018-0066 ◽

2018 ◽

Vol 52 (7) ◽

pp. 548-564

Author(s):

Susanne Al-Eryani ◽

Gudrun Bucher ◽

Stefanie Rühle

Keyword(s):

Semantic Web ◽

Open Data ◽

Linked Open Data

Abstract: As part of the DFG-funded project "Development of interoperable standards for the contextualisation of heterogeneous objects, using the Asch provenance as an example" (Entwicklung von interoperablen Standards für die Kontextualisierung heterogener Objekte am Beispiel der Provenienz Asch), a metadata model compatible with the Semantic Web and Linked Open Data was developed that makes it possible to contextualise cultural heritage and its provenance across institutions.

The Pensoft Data Publishing Workflow: The FAIRway from articles to Linked Open Data

Biodiversity Information Science and Standards ◽

10.3897/biss.3.35902 ◽

2019 ◽

Vol 3 ◽

Author(s):

Lyubomir Penev ◽

Teodor Georgiev ◽

Viktor Senderov ◽

Mariya Dimitrova ◽

Pavel Stoev

Keyword(s):

Open Data ◽

Structured Data ◽

Linked Open Data ◽

Data Publishing ◽

Knowledge Graph ◽

Supplementary File ◽

Biodiversity Data ◽

Text Format ◽

Biodiversity Knowledge ◽

Data Elements

As one of the first advocates of open access and open data in the field of biodiversity publishing, Pensoft has adopted a multiple data publishing model, resulting in the ARPHA-BioDiv toolbox (Penev et al. 2017). ARPHA-BioDiv consists of several data publishing workflows and tools described in the Strategies and Guidelines for Publishing of Biodiversity Data and elsewhere:
1. Data underlying research results are deposited in an external repository and/or published as supplementary file(s) to the article and then linked/cited in the article text; supplementary files are published under their own DOIs and bear their own citation details.
2. Data deposited in trusted repositories and/or supplementary files and described in data papers; data papers may be submitted in text format or converted into manuscripts from Ecological Metadata Language (EML) metadata.
3. Integrated narrative and data publishing realised by the Biodiversity Data Journal, where structured data are imported into the article text from tables or via web services and downloaded/distributed from the published article.
4. Data published in structured, semantically enriched, full-text XMLs, so that several data elements can thereafter easily be harvested by machines.
5. Linked Open Data (LOD) extracted from literature, converted into interoperable RDF triples in accordance with the OpenBiodiv-O ontology (Senderov et al. 2018) and stored in the OpenBiodiv Biodiversity Knowledge Graph.
The above-mentioned approaches are supported by a whole ecosystem of additional workflows and tools, for example: (1) pre-publication data auditing, involving both human and machine data quality checks (workflow 2); (2) web-service integration with data repositories and data centres, such as Global Biodiversity Information Facility (GBIF), Barcode of Life Data Systems (BOLD), Integrated Digitized Biocollections (iDigBio), Data Observation Network for Earth (DataONE), Long Term Ecological Research (LTER), PlutoF, Dryad, and others (workflows 1, 2); (3) semantic markup of the article texts in the TaxPub format, facilitating further extraction, distribution and re-use of sub-article elements and data (workflows 3, 4); (4) server-to-server import of specimen data from GBIF, BOLD, iDigBio and PlutoF into manuscript text (workflow 3); (5) automated conversion of EML metadata into data paper manuscripts (workflow 2); (6) export of Darwin Core Archive and automated deposition in GBIF (workflow 3); (7) submission of individual images and supplementary data under their own DOIs to the Biodiversity Literature Repository, BLR (workflows 1-3); (8) conversion of key data elements from TaxPub articles and taxonomic treatments extracted by Plazi into RDF handled by OpenBiodiv (workflow 5).
These approaches represent different aspects of the prospective scholarly publishing of biodiversity data, which, in combination with text and data mining (TDM) technologies for legacy literature (PDF) developed by Plazi, lay the groundwork for an entire data publishing ecosystem for biodiversity, supplying FAIR (Findable, Accessible, Interoperable and Reusable) data to several interoperable overarching infrastructures, such as GBIF, BLR, Plazi TreatmentBank and OpenBiodiv, and to various end users.

Awareness of Linked Open Data Among the Employees of Polish Libraries, Archives, and Museums

Zagadnienia Informacji Naukowej - Studia Informacyjne ◽

10.36702/zin.826 ◽

2022 ◽

Vol 59 (2(118)) ◽

pp. 7-25

Author(s):

Dorota Siwecka

Keyword(s):

Linked Data ◽

Open Data ◽

Linked Open Data ◽

Survey Method ◽

Doctorate Degree ◽

Research Libraries ◽

The People ◽

Central Statistical ◽

Central Statistical Office ◽

The Subject

Purpose/Thesis: This article presents the results of a survey conducted in January 2021 among employees of Polish libraries, museums, and archives, examining their awareness of open linked data technologies. The research had a pilot character and its results will be used to improve the questionnaire and to conduct research on a wider scale. Approach/Methods: The survey method was used in the study. Results and conclusions: On the basis of answers received, it can be concluded that open linked data is not yet very well-known among employees of Polish libraries, museums, and archives. Those most aware of technologies allowing for machine understanding of content shared on the Web are doctorate degree-holders employed in research libraries. Furthermore, awareness of the projects using LOD technologies does not correlate with awareness of these technological solutions. Research limitations: The number of respondents (415) constitutes 1% of all the people employed in libraries, archives, and museums in Poland (based on data provided by the Central Statistical Office of Poland). This is not a large number, but considering the variety among the respondents, the sample can be considered representative. Originality/Value: The awareness of Linked Open Data among employees of Polish libraries, archives, and museums has not been the subject of any study so far. In fact, this type of research has not been conducted in other countries either.

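Possibilities for extending the digital humanities in the Semantic Web: a case study of Croatian medieval manuscripts, incunabula and their fragments (original title: SKAITMENINĖS HUMANITARIKOS IŠPLĖTIMO SEMANTINIAME ŽINIATINKLYJE GALIMYBĖS: KROATIJOS VIDURAMŽIŲ RANKRAŠČIŲ, INKUNABULŲ IR JŲ FRAGMENTŲ ATVEJO ANALIZĖ)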

Knygotyra ◽

10.15388/kn.v61i0.1954 ◽

2013 ◽

Vol 61 ◽

pp. 254-277

Author(s):

Marijana Tomić ◽

Mirna Willer

Keyword(s):

Semantic Web ◽

Open Data ◽

Linked Open Data

Manuscript collections comprise manuscripts of a very varied nature, usually defined as "a text or document written by hand on paper or parchment" (Peter Beal). These may be family or personal papers, diaries, letters, archival collections, and so on. Medieval manuscripts (codices, maps, musical works, or fragments of these) form a special category of manuscripts. Like incunabula, manuscript collections are the most valuable part of library heritage; thanks to them, a great deal of information reaches us about medieval history, culture, literature, social history and ways of life. Without these sources, that information would have been lost. Research on old and rare manuscripts is important for the cultural and social history of both the country and Europe as a whole. From the perspective of the humanities, several factors must be singled out that have brought about significant changes in the study of manuscripts and the earliest printed books. The most important is considered to be the impact of information technology on almost all areas of research. These changes also gave rise to a new discipline, the digital humanities. According to Toby Burrows, medievalists are "the most advanced practitioners in applying digital technologies to humanities research". Nevertheless, Burrows also identifies several difficulties related to web and digital library services. He points to "a lack of integration and interoperability among the many different websites" and to inconsistent terminology in the application of descriptive standards. This in turn creates a problematic situation, because "researchers around the world face many difficulties in finding, using and sharing knowledge about medieval manuscript collections". We fully agree with Burrows's view that this problem can be solved by creating an international collaborative infrastructure that would make it possible to manage content and interrelated knowledge.
In our opinion, this infrastructure can be implemented in the technological environment of the Semantic Web and Linked Open Data. The article discusses research on medieval manuscripts, incunabula and their fragments, and the description of these sources as part of a digital humanities project that applies this new technology. It examines a research project in this field carried out by the Faculty of Information Sciences at the University of Zadar, Croatia. The aim of the project is to select the data elements needed for the precise description of these sources and for their standardisation, using ontologies of bibliography, codicology, palaeography and typography developed by researchers of old and rare books. The article also gives a brief introduction to the technological infrastructure of the Semantic Web and its standards. It describes in detail a methodology for publishing a selected vocabulary as one of the services of a metadata registry. An example of publishing linked open data is given: a graph depicting the description of a fragment of a partially reconstructed manuscript. Since all of these disciplines use their own vocabularies and ontologies, the authors propose focusing not on the use of a single common vocabulary but on designing mappings between the corresponding terms in accordance with SKOS rules. This would lay the foundations for a future international collaborative structure.

Publishing Statistical Data following the Linked Open Data Principles

Cases on Open-Linked Data and Semantic Web Applications ◽

10.4018/978-1-4666-2827-4.ch011 ◽

2013 ◽

pp. 199-226 ◽

Cited By ~ 5

Author(s):

Jose María Alvarez Rodríguez ◽

Jules Clement ◽

José Emilio Labra Gayo ◽

Hania Farhan ◽

Patricia Ordoñez de Pablos

Keyword(s):

Linked Data ◽

Statistical Data ◽

Open Data ◽

Linked Open Data ◽

Dimensional Measure ◽

The Web

This chapter introduces the promotion of statistical data to the Linked Open Data initiative in the context of the Web Index project. A framework for the publication of raw statistics and a method to convert them to Linked Data are also presented, following the W3C standards RDF, SKOS, and OWL. This case study is focused on the Web Index project; launched by the Web Foundation, the Index is the first multi-dimensional measure of the growth, utility, and impact of the Web on people and nations. Finally, an evaluation of the advantages of using Linked Data to publish statistics is presented, together with a discussion and future steps.

Enabling the Matchmaking of Organizations and Public Procurement Notices by Means of Linked Open Data

Cases on Open-Linked Data and Semantic Web Applications ◽

10.4018/978-1-4666-2827-4.ch006 ◽

2013 ◽

pp. 105-131 ◽

Cited By ~ 3

Author(s):

Jose María Alvarez Rodríguez ◽

José Emilio Labra Gayo ◽

Patricia Ordoñez de Pablos

Keyword(s):

Semantic Web ◽

Linked Data ◽

Public Procurement ◽

Open Data ◽

Specific Information ◽

Semantic Web Technologies ◽

Web Technologies ◽

Domain Specific ◽

The Status ◽

Financial Transactions

The aim of this chapter is to present a proposal and a case study for describing information about organizations in a standard way using the Linked Data approach. Several models and ontologies have been provided in order to formalize the data, structure and behaviour of organizations. Nevertheless, these attempts have not been fully accepted due to several factors: (1) missing pieces to define the status of the organization; (2) tangled parts to specify the structure (concepts and relations) between the elements of the organization; (3) lack of text properties, and other factors. These divergences result in a set of incomplete approaches to formalizing data and information about organizations. Taking into account the current trends of applying semantic web technologies and linked data to formalize, aggregate, and share domain-specific information, a new model for organizations that takes advantage of these initiatives is required in order to overcome existing barriers and exploit corporate information in a standard way. This work is especially relevant in that it aims to: (1) unify existing models to provide a common specification; (2) apply semantic web technologies and the Linked Data approach; (3) provide access to the information via standard protocols; and (4) offer new services that can exploit this information to trace the evolution and behaviour of the organization over time. Finally, this work can help improve the clarity and transparency of scenarios in which organizations play a key role, such as e-procurement, e-health, or financial transactions.

Semantic Web Standards for Publishing and Integrating Open Data

Advances in Electronic Government, Digital Divide, and Regional Development - Handbook of Research on Advanced ICT Integration for Governance and Policy Modeling ◽

10.4018/978-1-4666-6236-0.ch003 ◽

2014 ◽

pp. 28-47

Author(s):

Axel Polleres ◽

Simon Steyskal

Keyword(s):

Semantic Web ◽

World Wide ◽

Open Data ◽

Standard Data ◽

Web Standards ◽

Web Of Data ◽

Structured Information ◽

Potential Risks ◽

Rdf Data ◽

The Web

The World Wide Web Consortium (W3C), as the main standardization body for Web standards, has set a particular focus on publishing and integrating Open Data. In this chapter, the authors explain various standards from the W3C's Semantic Web activity and the potential role they play in the context of Open Data: RDF, as a standard data format for publishing and consuming structured information on the Web; the Linked Data principles, for interlinking RDF data published across the Web and leveraging a Web of Data; and RDFS and OWL, for describing vocabularies used in RDF and mappings between such vocabularies. The authors conclude with a review of current deployments of these standards on the Web, particularly within public Open Data initiatives, and discuss potential risks and challenges.
