An RDF Data Set Quality Assessment Mechanism for Decentralized Systems

Li Huang; Zhenzhen Liu; Fangfang Xu; Jinguang Gu

doi:10.1162/dint_a_00059

An RDF Data Set Quality Assessment Mechanism for Decentralized Systems

Data Intelligence ◽

10.1162/dint_a_00059 ◽

2020 ◽

Vol 2 (4) ◽

pp. 529-553

Author(s):

Li Huang ◽

Zhenzhen Liu ◽

Fangfang Xu ◽

Jinguang Gu

Keyword(s):

Data Quality ◽

Quality Assessment ◽

Linked Data ◽

Cost Effective ◽

Assessment Model ◽

Maintenance Cost ◽

Data Set ◽

Data Quality Assessment ◽

RDF Data ◽

Assessment Results

With the rapid growth of linked data on the Web, quality assessment of RDF data sets has become particularly important, especially for the quality and accessibility of the linked data. In most cases, RDF data sets are shared online, which leads to a high maintenance cost for quality assessment and can also pollute Internet data. Recently, blockchain technology has shown potential in many applications. Storing quality assessment results on a blockchain can reduce reliance on a centralized authority, and the stored results are tamper-resistant. To this end, we propose an RDF data quality assessment model for a decentralized environment and point out a new dimension of RDF data quality. We use the blockchain to record the data quality assessment results and design a detailed update strategy for them. We have implemented a system, DCQA, to test and verify the feasibility of the quality assessment model. The proposed method can provide users with more cost-effective results when knowledge is independently protected.
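The idea of recording assessment results so that later tampering is detectable can be illustrated with a plain hash chain. This is only a minimal sketch, not the DCQA design described in the paper: it has no consensus or networking, and the names `block_hash`, `append_result`, `verify_chain`, the record fields, and the example dataset URI are all hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def block_hash(index, dataset, score, prev):
    """Hash a record's contents together with the previous record's hash."""
    payload = json.dumps(
        {"index": index, "dataset": dataset, "score": score, "prev": prev},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_result(chain, dataset, score):
    """Append a new quality assessment result; updates are new records, not edits."""
    index = len(chain)
    prev = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append(
        {
            "index": index,
            "dataset": dataset,
            "score": score,
            "prev": prev,
            "hash": block_hash(index, dataset, score, prev),
        }
    )
    return chain


def verify_chain(chain):
    """Return True only if every record still matches its hash and its predecessor."""
    prev = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev:
            return False
        expected = block_hash(block["index"], block["dataset"], block["score"], block["prev"])
        if block["hash"] != expected:
            return False
        prev = block["hash"]
    return True


# Hypothetical usage: two assessments of the same (made-up) data set URI.
chain = []
append_result(chain, "http://example.org/dataset/1", 0.82)
append_result(chain, "http://example.org/dataset/1", 0.91)
print(verify_chain(chain))
```

Because each record's hash covers the previous record's hash, changing an earlier score invalidates verification of the chain from that point on, which is the tamper-evidence property the abstract refers to.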

Reporting Data Quality Assessment Results: Identifying Individual and Organizational Barriers and Solutions

eGEMs (Generating Evidence & Methods to improve patient outcomes) ◽

10.5334/egems.214 ◽

2017 ◽

Vol 5 (1) ◽

pp. 16 ◽

Cited By ~ 3

Author(s):

Tiffany J. Callahan ◽

Juliana G. Barnard ◽

Laura J. Helmkamp ◽

Julie A. Maertens ◽

Michael G. Kahn

Keyword(s):

Data Quality ◽

Quality Assessment ◽

Organizational Barriers ◽

Data Quality Assessment ◽

Reporting Data ◽

Assessment Results

Data Quality Assessment And Calibration Of The Maestro-1 Freiburg Data Set

[Proceedings] IGARSS'91 Remote Sensing: Global Monitoring for Earth Management ◽

10.1109/igarss.1991.579146 ◽

2005 ◽

Cited By ~ 1

Author(s):

G.F. de Grandi ◽

H. de Groof ◽

C. Lavalle ◽

G.G. Lemoine ◽

A.J. Sieber

Keyword(s):

Data Quality ◽

Quality Assessment ◽

Data Set ◽

Data Quality Assessment

Luzzu—A Methodology and Framework for Linked Data Quality Assessment

Journal of Data and Information Quality ◽

10.1145/2992786 ◽

2016 ◽

Vol 8 (1) ◽

pp. 1-32 ◽

Cited By ~ 22

Author(s):

Jeremy Debattista ◽

Sören Auer ◽

Christoph Lange

Keyword(s):

Data Quality ◽

Quality Assessment ◽

Linked Data ◽

Data Quality Assessment

Importance of Continued Data Quality Assessment of Syndromic Production Data

Online Journal of Public Health Informatics ◽

10.5210/ojphi.v9i1.7619 ◽

2017 ◽

Vol 9 (1) ◽

Author(s):

Sophia Crossen

Keyword(s):

Data Quality ◽

Quality Assessment ◽

Production Data ◽

Quality Of Data ◽

Data Set ◽

Quality Degradation ◽

Periodic Data ◽

Data Quality Assessment

Objective: To explore the quality of data submitted once a facility is moved into an ongoing submission status and address the importance of continuing data quality assessments.

Introduction: Once a facility meets data quality standards and is approved for production, an assumption is made that the quality of data received remains at the same level. When looking at production data quality reports from various states generated using a SAS data quality program, a need for production data quality assessment was identified. By implementing a periodic data quality update on all production facilities, data quality has improved for production data as a whole and for individual facility data. Through this activity several root causes of data quality degradation have been identified, allowing processes to be implemented in order to mitigate impact on data quality.

Methods: Many jurisdictions work with facilities during the onboarding process to improve data quality. Once a certain level of data quality is achieved, the facility is moved into production. At this point the jurisdiction generally assumes that the quality of the data being submitted will remain fairly constant. To check this assumption in Kansas, a SAS Production Report program was developed specifically to look at production data quality.

A legacy data set is downloaded from BioSense production servers by Earliest Date in order to capture all records for visits which occurred within a specified time frame. This data set is then run through a SAS data quality program which checks specific fields for completeness and validity and prints a report on counts and percentages of null and invalid values, outdated records, and timeliness of record submission, as well as examples of records from visits containing these errors. A report is created for the state as a whole, each facility, EHR vendor, and HIE sending data to the production servers, with examples provided only by facility. The facility, vendor, and HIE reports include state percentages of errors for comparison.

The Production Report was initially run on Kansas data for the first quarter of 2016 followed by consultations with facilities on the findings. Monthly checks were made of data quality before and after facilities implemented changes. An examination of Kansas' results showed a marked decrease in data quality for many facilities. Every facility had at least one area in need of improvement.

The data quality reports and examples were sent to every facility sending production data during the first quarter attached to an email requesting a 30-60 minute call with each to go over the report. This call was deemed crucial to the process since it had been over a year, and in a few cases over two years, since some of the facilities had looked at data quality and would need a review of the findings and all requirements, new and old. Ultimately, over half of all production facilities scheduled a follow-up call.

While some facilities expressed some degree of trepidation, most facilities were open to revisiting data quality and to making requested improvements. Reasons for data quality degradation included updates to EHR products, change of EHR product, work flow issues, engine updates, new requirements, and personnel turnover.

A request was made of other jurisdictions (including Arizona, Nevada, and Illinois) to look at their production data using the same program and compare quality. Data was pulled for at least one week of July 2016 by Earliest Date.

Results: Monthly reports have been run on Kansas Production data both before and after the consultation meetings which indicate a marked improvement in both completeness of required fields and validity of values in those fields. Data for these monthly reports was again selected by Earliest Date.

Conclusions: In order to ensure production data continues to be of value for syndromic surveillance purposes, periodic data quality assessments should continue after a facility reaches ongoing submission status. Alterations in process include a review of production data at least twice per year with a follow up data review one month later to confirm adjustments have been correctly implemented.
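The completeness and validity check described in the methods (counts and percentages of null and invalid values per field) can be sketched roughly as follows. The original program was written in SAS and is not reproduced here; this Python version is an illustration only, and the function name `field_quality_report`, the field names, the allowed values, and the sample records are all hypothetical.

```python
def field_quality_report(records, validators):
    """Count null and invalid values per field and express them as percentages.

    records:    list of dicts, one per visit record
    validators: dict mapping a field name to a function that returns True
                when a non-null value is considered valid
    """
    total = len(records)
    report = {}
    for field, is_valid in validators.items():
        null_count = 0
        invalid_count = 0
        for record in records:
            value = record.get(field)
            if value is None or value == "":
                null_count += 1      # completeness problem
            elif not is_valid(value):
                invalid_count += 1   # validity problem
        report[field] = {
            "null": null_count,
            "invalid": invalid_count,
            "null_pct": 100.0 * null_count / total if total else 0.0,
            "invalid_pct": 100.0 * invalid_count / total if total else 0.0,
        }
    return report


# Hypothetical sample data and validity rules, for illustration only.
records = [
    {"patient_class": "E", "age": "34"},
    {"patient_class": "", "age": "abc"},
    {"patient_class": "X", "age": "51"},
    {"patient_class": "I", "age": None},
]
validators = {
    "patient_class": lambda v: v in {"E", "I", "O"},
    "age": lambda v: v.isdigit(),
}

report = field_quality_report(records, validators)
for field, stats in report.items():
    print(field, stats)
```

Running the same function once over all records and once per facility would give the state-level percentages alongside the facility-level ones for comparison, in the spirit of the reports described above.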

Luzzu -- A Framework for Linked Data Quality Assessment

2016 IEEE Tenth International Conference on Semantic Computing (ICSC) ◽

10.1109/icsc.2016.48 ◽

2016 ◽

Cited By ~ 14

Author(s):

Jeremy Debattista ◽

Sören Auer ◽

Christoph Lange

Keyword(s):

Data Quality ◽

Quality Assessment ◽

Linked Data ◽

Data Quality Assessment

Crowdsourcing Linked Data Quality Assessment

Advanced Information Systems Engineering - Lecture Notes in Computer Science ◽

10.1007/978-3-642-41338-4_17 ◽

2013 ◽

pp. 260-276 ◽

Cited By ~ 62

Author(s):

Maribel Acosta ◽

Amrapali Zaveri ◽

Elena Simperl ◽

Dimitris Kontokostas ◽

Sören Auer ◽

...

Keyword(s):

Data Quality ◽

Quality Assessment ◽

Linked Data ◽

Data Quality Assessment

A Data Quality Assessment Model and Its Application to Cybersecurity Data Sources

13th International Conference on Computational Intelligence in Security for Information Systems (CISIS 2020) - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-030-57805-3_25 ◽

2020 ◽

pp. 263-272

Author(s):

Noemí DeCastro-García ◽

Enrique Pinto

Keyword(s):

Data Quality ◽

Quality Assessment ◽

Assessment Model ◽

Data Sources ◽

Data Quality Assessment

Medical Data Quality Assessment Model Based on Credibility Analysis

2018 IEEE 4th Information Technology and Mechatronics Engineering Conference (ITOEC) ◽

10.1109/itoec.2018.8740576 ◽

2018 ◽

Cited By ~ 2

Author(s):

Songting Zan ◽

Xu Zhang

Keyword(s):

Data Quality ◽

Quality Assessment ◽

Medical Data ◽

Assessment Model ◽

Model Based ◽

Data Quality Assessment

A Linked Data Quality Assessment Framework for Network Data

Proceedings of the 2nd Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA) - GRADES-NDA'19 ◽

10.1145/3327964.3328493 ◽

2019 ◽

Cited By ~ 1

Author(s):

Alex To ◽

Rouzbeh Meymandpour ◽

Joseph G. Davis ◽

Guillaume Jourjon ◽

Jonathan Chan

Keyword(s):

Data Quality ◽

Quality Assessment ◽

Linked Data ◽

Network Data ◽

Assessment Framework ◽

Data Quality Assessment

Big Data Quality Assessment Model for Unstructured Data

2018 International Conference on Innovations in Information Technology (IIT) ◽

10.1109/innovations.2018.8605945 ◽

2018 ◽

Cited By ~ 1

Author(s):

Ikbal Taleb ◽

Mohamed Adel Serhani ◽

Rachida Dssouli

Keyword(s):

Big Data ◽

Data Quality ◽

Quality Assessment ◽

Assessment Model ◽

Unstructured Data ◽

Data Quality Assessment
