Data quality assessment in hydrological information systems

2015 ◽  
Vol 17 (4) ◽  
pp. 640-661
Author(s):  
Li Chao ◽  
Zhou Hui ◽  
Zhou Xiaofeng

The hydrological data fed to hydrological decision support systems may be untimely, incomplete, inconsistent or illogical due to network congestion, low server performance, instrument failures, human errors, etc. It is therefore imperative to assess, monitor and even control the quality of hydrological data residing in, or acquired from, each link of a hydrological data supply chain. Traditional quality management of hydrological data, however, has focused mainly on intrinsic quality problems such as outlier detection, null-value interpolation, consistency and completeness, and cannot assess the quality of the data actually consumed by applications, viewed along the data supply chain and at the granularity of tasks. To achieve these objectives, we first present a methodology that derives quality dimensions from a hydrological information system by questionnaire and reveals cognitive differences in the perceived importance of those dimensions; we then analyze the correlations between tasks, classify them into five categories, and construct a quality assessment model with time limits over the data supply chain. Exploratory experiments suggest that the assessment system can provide data quality (DQ) indicators to DQ assessors, and enable authorized consumers to monitor and even control the quality of the data used in an application at the granularity of tasks.
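Task-level indicators of the kind described above can be sketched in a few lines. This is a minimal illustration only: the field names, time limit, and gauge records below are invented, and the paper's actual model is considerably richer.

```python
from datetime import datetime, timedelta

def task_dq_indicators(records, expected_fields, time_limit_hours):
    """Compute simple completeness and timeliness indicators for one task's records."""
    total = len(records)
    complete = sum(1 for r in records
                   if all(r.get(f) is not None for f in expected_fields))
    deadline = timedelta(hours=time_limit_hours)
    timely = sum(1 for r in records
                 if r["received_at"] - r["observed_at"] <= deadline)
    return {"completeness": complete / total, "timeliness": timely / total}

# Hypothetical gauge readings flowing through one link of the supply chain
now = datetime(2015, 6, 1, 12, 0)
records = [
    {"stage": 3.2, "flow": 14.1, "observed_at": now, "received_at": now + timedelta(minutes=30)},
    {"stage": 3.4, "flow": None, "observed_at": now, "received_at": now + timedelta(hours=5)},
]
print(task_dq_indicators(records, ["stage", "flow"], time_limit_hours=2))
```

In a supply-chain setting, indicators like these would be computed per task and per link, then rolled up for DQ assessors.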

2017 ◽  
Vol 9 (1) ◽  
Author(s):  
Sophia Crossen

Objective: To explore the quality of data submitted once a facility is moved into an ongoing submission status, and to address the importance of continuing data quality assessments.

Introduction: Once a facility meets data quality standards and is approved for production, an assumption is made that the quality of data received remains at the same level. When looking at production data quality reports from various states generated using a SAS data quality program, a need for production data quality assessment was identified. By implementing a periodic data quality update on all production facilities, data quality has improved for production data as a whole and for individual facility data. Through this activity several root causes of data quality degradation have been identified, allowing processes to be implemented to mitigate the impact on data quality.

Methods: Many jurisdictions work with facilities during the onboarding process to improve data quality. Once a certain level of data quality is achieved, the facility is moved into production, and at that point the jurisdiction generally assumes that the quality of the submitted data will remain fairly constant. To check this assumption in Kansas, a SAS Production Report program was developed specifically to look at production data quality. A legacy data set is downloaded from BioSense production servers by Earliest Date in order to capture all records for visits that occurred within a specified time frame. This data set is then run through a SAS data quality program that checks specific fields for completeness and validity and prints a report on counts and percentages of null and invalid values, outdated records, and timeliness of record submission, as well as examples of records from visits containing these errors. A report is created for the state as a whole and for each facility, EHR vendor, and HIE sending data to the production servers, with examples provided only by facility. The facility, vendor, and HIE reports include state percentages of errors for comparison. The Production Report was initially run on Kansas data for the first quarter of 2016, followed by consultations with facilities on the findings. Monthly checks were made of data quality before and after facilities implemented changes. An examination of Kansas' results showed a marked decrease in data quality for many facilities; every facility had at least one area in need of improvement. The data quality reports and examples were sent to every facility sending production data during the first quarter, attached to an email requesting a 30-60 minute call to go over the report. This call was deemed crucial to the process since it had been over a year, and in a few cases over two years, since some of the facilities had looked at data quality, and they would need a review of the findings and of all requirements, new and old. Ultimately, over half of all production facilities scheduled a follow-up call. While some facilities expressed a degree of trepidation, most were open to revisiting data quality and making the requested improvements. Reasons for data quality degradation included updates to EHR products, change of EHR product, workflow issues, engine updates, new requirements, and personnel turnover. A request was made of other jurisdictions (including Arizona, Nevada, and Illinois) to look at their production data using the same program and compare quality; data was pulled for at least one week of July 2016 by Earliest Date.

Results: Monthly reports run on Kansas production data both before and after the consultation meetings indicate a marked improvement in both the completeness of required fields and the validity of values in those fields. Data for these monthly reports was again selected by Earliest Date.

Conclusions: To ensure production data continues to be of value for syndromic surveillance purposes, periodic data quality assessments should continue after a facility reaches ongoing submission status. Alterations in process include a review of production data at least twice per year, with a follow-up data review one month later to confirm that adjustments have been correctly implemented.
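The core checks such a report performs, counting null and invalid values per field and reporting percentages, can be sketched in Python. The field names, valid-value sets, and records below are assumptions for illustration, not the actual BioSense or PHIN rules:

```python
# Illustrative per-field completeness/validity report; value sets are assumed.
VALID = {
    "sex": {"M", "F", "U"},                      # assumed value set
    "discharge_disposition": {"01", "02", "09"}, # assumed subset of codes
}

def field_report(records, fields):
    report = {}
    for field in fields:
        values = [r.get(field) for r in records]
        null = sum(v is None or v == "" for v in values)
        invalid = sum(v not in VALID[field]
                      for v in values if v not in (None, ""))
        n = len(values)
        report[field] = {"pct_null": 100 * null / n,
                         "pct_invalid": 100 * invalid / n}
    return report

records = [
    {"sex": "M", "discharge_disposition": "01"},
    {"sex": "", "discharge_disposition": "99"},   # null sex, invalid code
    {"sex": "X", "discharge_disposition": "02"},  # invalid sex
]
print(field_report(records, ["sex", "discharge_disposition"]))
```

A production version would add timeliness and outdated-record checks and emit example records per facility, as the abstract describes.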


2017 ◽  
Vol 5 (1) ◽  
pp. 47-54
Author(s):  
Puguh Ika Listyorini ◽  
Mursid Raharjo ◽  
Farid Agushybana

Data are the basis for making decisions and policy, and higher-quality data produce better policy. Current quality assessment methods do not include all indicators of data quality; the more complete the indicators or assessment criteria, the stronger the assessment method. The purpose of this study is to develop a method for the independent assessment of routine data quality at the Surakarta Health Department, where assessment was previously performed using the PMKDR and HMN data quality assessment methods. The design of this study is research and development (R&D), modified into seven steps: formulating potential problems, collecting the data, designing the product, validating the design, fixing the design, testing the product, and fixing the product. The subjects were 19 respondents who manage data at the Surakarta Health Department. The data analysis method used is content analysis. The pilot phase shows that the developed data quality assessment method is basically successful and usable. Assessed with the developed method, data collection quality is very adequate; data accuracy is poor; data consistency exists but is inadequate; data actuality is very adequate; data periodicity is inadequate; data representation is very adequate; and data sorting is very adequate. A commitment is needed from the Surakarta Health Department to take advantage of the developed method to assess data quality in support of information availability, decision-making, and the planning of health programs. Further development of this research should carry out all stages of the R&D process so that the final method will be better.


2020 ◽  
Vol 10 (6) ◽  
pp. 2181
Author(s):  
Muhammad Aslam Jarwar ◽  
Ilyoung Chong

Due to the convergence of advanced technologies such as the Internet of Things, Artificial Intelligence, and Big Data, a healthcare platform accumulates huge quantities of data from several heterogeneous sources. Adequate usage of this data may increase the impact and quality of healthcare services; however, the quality of the data itself may be questionable. Assessing the quality of the data for the task at hand can reduce the associated risks and increase confidence in the data's usability. To overcome these challenges, this paper presents a web-objects-based contextual data quality assessment model with enhanced classification metric parameters. A semantic ontology of virtual objects, composite virtual objects, and services is also proposed for parameterizing the contextual data quality assessment of web objects data. The novelty of this article is the provision of contextual data quality assessment mechanisms at the data acquisition, assessment, and service levels for web-objects-enabled semantic data applications. To evaluate the proposed mechanism, web-objects-enabled affective stress and teens' mood care semantic data applications are designed, and a deep data quality learning model is developed. The findings reveal that, once a data quality assessment model is trained on web-objects-enabled healthcare semantic data, it can classify incoming data quality along various contextual data quality metric parameters. Moreover, the mechanism presented in this paper can be applied to other domains by incorporating a data quality analysis requirements ontology.


2017 ◽  
Vol 4 (1) ◽  
pp. 25-31 ◽  
Author(s):  
Diana Effendi

The Information Product Approach (IP Approach) is an information management approach that treats information as a product and supports data quality analysis. An IP-Map can be used by organizations to organize how data are collected, stored, maintained, and used. The data management process for academic activities at X University has not yet used the IP Approach, and the university has paid little attention to the quality of its information; so far it has concerned itself only with the system applications used to automate data management in academic activities. The IP-Map constructed in this paper can serve as a basis for analyzing the quality of data and information. With the IP-Map, X University is expected to learn which parts of the process need improvement in data and information quality management. Index terms: IP Approach, IP-Map, information quality, data quality.


2021 ◽  
pp. 004912412199553
Author(s):  
Jan-Lucas Schanze

An increasing age of respondents and cognitive impairment are the usual suspects for growing difficulties in survey interviews and decreasing data quality. This is why survey researchers tend to label residents of retirement and nursing homes as hard-to-interview and exclude them from most social surveys. In this article, I examine to what extent this label is justified and whether the quality of data collected among residents of institutions for the elderly really differs from data collected within private households. For this purpose, I analyze response behavior and quality indicators in three waves of the Survey of Health, Ageing and Retirement in Europe. To control for confounding variables, I use propensity score matching to identify respondents in private households who share similar characteristics with institutionalized residents. My results confirm that most indicators of response behavior and data quality are worse in institutions than in private households. However, when controlling for sociodemographic and health-related variables, the differences become very small. These results suggest that it is health, irrespective of the housing situation, that matters for data quality.
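The matching step can be sketched minimally, assuming propensity scores have already been estimated (e.g., by a logistic regression on age and health covariates): each institutionalized respondent is paired with the private-household respondent whose score is closest. All IDs and scores below are invented.

```python
# Greedy 1:1 nearest-neighbor matching on pre-estimated propensity scores,
# without replacement. Not the article's exact procedure, just the core idea.
def nearest_neighbor_match(treated, controls):
    pool = dict(controls)          # id -> score, remaining candidates
    pairs = {}
    for tid, score in sorted(treated.items(), key=lambda kv: kv[1]):
        best = min(pool, key=lambda cid: abs(pool[cid] - score))
        pairs[tid] = best
        del pool[best]             # match without replacement
    return pairs

institution = {"I1": 0.81, "I2": 0.64}              # residents in institutions
households = {"H1": 0.20, "H2": 0.66, "H3": 0.79}   # private households
print(nearest_neighbor_match(institution, households))
```

With matched pairs in hand, quality indicators can be compared between the two groups while holding the covariates behind the scores roughly constant.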


2015 ◽  
Vol 31 (2) ◽  
pp. 231-247 ◽  
Author(s):  
Matthias Schnetzer ◽  
Franz Astleithner ◽  
Predrag Cetkovic ◽  
Stefan Humer ◽  
Manuela Lenk ◽  
...  

This article contributes a framework for the quality assessment of imputations within a broader structure for evaluating the quality of register-based data. Four quality-related hyperdimensions examine the data processing from the raw-data level to the final statistics. Our focus lies on the quality assessment of different imputation steps and their influence on overall data quality. We suggest classification rates as a measure of imputation accuracy and derive several computational approaches.
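A minimal sketch of the suggested accuracy measure, assuming a simulation design in which known categorical values are deleted and then re-imputed so that imputed and true values can be compared (the example data are invented):

```python
# Classification rate: the share of imputed categorical values that agree
# with the known true values.
def classification_rate(true_values, imputed_values):
    assert len(true_values) == len(imputed_values)
    hits = sum(t == i for t, i in zip(true_values, imputed_values))
    return hits / len(true_values)

true_vals    = ["employed", "retired", "employed", "student", "retired"]
imputed_vals = ["employed", "retired", "retired",  "student", "retired"]
print(classification_rate(true_vals, imputed_vals))  # 4 of 5 agree
```

In a register-based setting this rate would be computed per imputation step, so each step's contribution to overall data quality can be tracked.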


Author(s):  
Syed Mustafa Ali ◽  
Farah Naureen ◽  
Arif Noor ◽  
Maged Kamel N. Boulos ◽  
Javariya Aamir ◽  
...  

Background: Increasingly, healthcare organizations are using technology for the efficient management of data. The aim of this study was to compare the quality of digital records with that of the corresponding paper-based records using a data quality assessment framework. Methodology: We conducted a desk review of paper-based and digital records over the study duration, from April 2016 to July 2016, at six enrolled TB clinics. We entered all data fields of the patient treatment (TB01) card into a spreadsheet-based template to undertake a field-to-field comparison of the fields shared between the TB01 card and the digital data. Findings: A total of 117 TB01 cards were prepared at the six enrolled sites, but only 50% of the records (n=59 of 117) were digitized. There were 1,239 comparable data fields, of which 65% (n=803) matched correctly between paper-based and digital records, while 35% (n=436) had anomalies in either the paper-based or the digital record. On average, 1.9 data quality issues were found per digital patient record versus 2.1 per paper-based record. An analysis of valid data quality issues found more issues in paper-based records (n=123) than in digital records (n=110). Conclusion: There were fewer data quality issues in digital records than in the corresponding paper-based records. Greater use of mobile data capture and continued use of the data quality assessment framework can deliver more meaningful information for decision making.
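The study's field-to-field comparison can be sketched as follows; the TB01 field names and values below are hypothetical, not taken from the actual card:

```python
# Compare the shared fields of a paper record and its digital counterpart,
# returning counts of matched fields and anomalies.
def compare_records(paper, digital, shared_fields):
    matched = sum(paper.get(f) == digital.get(f) for f in shared_fields)
    return matched, len(shared_fields) - matched

paper   = {"name": "A. Khan", "age": "34", "regimen": "Cat I"}
digital = {"name": "A. Khan", "age": "43", "regimen": "Cat I"}  # transposed age
matched, anomalies = compare_records(paper, digital, ["name", "age", "regimen"])
print(matched, anomalies)  # 2 matched fields, 1 anomaly
```

Summing these counts over all digitized records gives the study's aggregate figures (matched fields, anomalies, and issues per record).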


2021 ◽  
Vol 27 (3) ◽  
pp. 8-34
Author(s):  
Tatyana Cherkashina

The article presents the experience of converting non-targeted administrative data into research data, using as an example data on the income and property of deputies of local legislative bodies of the Russian Federation for 2019, collected as part of anti-corruption operations. This empirical fragment was selected for a pilot study of administrative data, which includes assessing the possibility of integrating scattered fragments of information into a single database, assessing the quality of the data, and assessing their relevance for research problems, particularly the analysis of high-income strata and the apparent trend toward individualization of private property. The system of indicators for assessing data quality includes timeliness, availability, interpretability, reliability, comparability, coherence, errors of representation and measurement, and relevance. In the data set in question, measurement errors are more common than representation errors. Overall, the article emphasizes that introducing new non-targeted data into circulation requires preliminary testing, while data quality assessment becomes distributed both in time and between different subjects. The transition from "created" to "obtained" data shifts the function of evaluating quality from the researcher-creator to the researcher-user. And though in this case data quality is partly ensured by the legal support for its production, the transformation of administrative data into research data involves assessing a variety of quality dimensions, from availability to uniformity and accuracy.

