Tank 241-C-106 sampling data requirements developed through the data quality objectives (DQO) process

1994 ◽  
Author(s):  
O.S. Wang ◽  
K.E. Bell ◽  
C.M. Anderson ◽  
M.S. Peffers ◽  
B.A. Pulsipher ◽  
...  
Author(s):  
Marcel von Lucadou ◽  
Thomas Ganslandt ◽  
Hans-Ulrich Prokosch ◽  
Dennis Toddenroth

Abstract Background The secondary use of electronic health records (EHRs) promises to facilitate medical research. We reviewed general data requirements in observational studies and analyzed the feasibility of conducting observational studies with structured EHR data, in particular diagnosis and procedure codes. Methods After reviewing published observational studies from the University Hospital of Erlangen for general data requirements, we identified three study populations for the feasibility analysis, using eligibility criteria from three exemplary observational studies. For each study population, we evaluated the availability of relevant patient characteristics in our EHR, including outcome and exposure variables. To assess data quality, we computed distributions of relevant patient characteristics from the available structured EHR data and compared them to those of the original studies. We implemented computed phenotypes for patient characteristics where necessary. In random samples, we evaluated how well structured patient characteristics agreed with a gold standard derived from manually interpreted free texts. We categorized our findings using the four data quality dimensions "completeness", "correctness", "currency" and "granularity". Results Reviewing general data requirements, we found that some investigators supplement routine data with questionnaires, interviews and follow-up examinations. We included 847 subjects in the feasibility analysis (Study 1 n = 411, Study 2 n = 423, Study 3 n = 13). All eligibility criteria from two studies were available in structured data, while one study required computed phenotypes for its eligibility criteria. In one study, all necessary patient characteristics were documented at least once in either structured or unstructured data. In another study, all exposure and outcome variables were available in structured data, while in the third study unstructured data had to be consulted. Comparing the distributions of patient characteristics computed from structured data with those from the original studies yielded similar distributions as well as indications of underreporting. We observed violations in all four data quality dimensions. Conclusions While we found relevant patient characteristics available in structured EHR data, data quality problems mean that it remains a case-by-case decision whether diagnosis and procedure codes are sufficient to underpin observational studies. Free-text data or subsequently collected supplementary study data may be important to complement a comprehensive patient history.
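The computed-phenotype approach mentioned in the abstract can be sketched as a simple rule over structured diagnosis codes. The ICD-10 prefixes and patient records below are illustrative assumptions, not the phenotypes or data used in the study:

```python
# Hypothetical sketch of a computed phenotype: flag a patient when the
# structured EHR data contain at least one matching ICD-10 code.
# The code prefixes below are illustrative, not those used in the study.

DIABETES_ICD10_PREFIXES = ("E10", "E11")  # type 1 / type 2 diabetes mellitus

def has_phenotype(diagnosis_codes, prefixes=DIABETES_ICD10_PREFIXES):
    """Return True if any recorded diagnosis code matches a prefix."""
    return any(code.startswith(prefixes) for code in diagnosis_codes)

# Toy patient records: patient id -> list of ICD-10 codes
patients = {
    "p1": ["E11.9", "I10"],  # type 2 diabetes + hypertension
    "p2": ["J45.0"],         # asthma only
}
flagged = {pid: has_phenotype(codes) for pid, codes in patients.items()}
```

In the study's validation step, such computed flags would then be compared against a manually reviewed gold standard to estimate agreement.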


1993 ◽  
Author(s):  
J.W. Buck ◽  
C.M. Anderson ◽  
B.A. Pulsipher ◽  
J.J. Toth ◽  
P.J. Turner ◽  
...  

1994 ◽  
Author(s):  
J.E. Meacham ◽  
R.J. Cash ◽  
G.T. Dukelow ◽  
H. Babad ◽  
J.W. Buck ◽  
...  

Author(s):  
Ms. Latha S S ◽  
Pavan Kumar S

Data required for a new application frequently come from other existing application systems. If the data required for the new application are available from existing systems and the volume of data is large, the necessary data should be migrated from the existing systems (source systems) to the new application (target system) instead of being recreated for the target system. Transformation of data is generally a necessary step in data migration because the data requirements and architecture of the target system differ from those of the source systems. This paper surveys data migration techniques that focus on improving data quality across different types of databases.
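The transformation step described above can be illustrated with a minimal sketch: a source record is remapped onto a target schema whose field names and formats differ. The schemas below are assumptions for illustration, not from the paper:

```python
# Minimal sketch of a transformation step in data migration: the assumed
# source system stores a full name in one column and dates as "dd/mm/yyyy"
# strings, while the assumed target schema expects split name fields and
# ISO-formatted dates.
from datetime import datetime

def transform(source_row):
    """Map one source-system record onto the assumed target schema."""
    first, _, last = source_row["full_name"].partition(" ")
    return {
        "first_name": first,
        "last_name": last,
        "created_at": datetime.strptime(
            source_row["created"], "%d/%m/%Y"
        ).date().isoformat(),
    }

row = {"full_name": "Ada Lovelace", "created": "10/12/1815"}
target_row = transform(row)
```

In a real migration this mapping would be one of many rules applied in bulk, typically with validation and error handling around records the rules cannot parse.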


Author(s):  
Hamid Naceur Benkhaled ◽  
Djamel Berrabah ◽  
Faouzi Boufares

Before the arrival of the Big Data era, data warehouse (DW) systems were considered the best decision support systems (DSS). DW systems have always helped organizations around the world to analyze their stored data and use it in making critical decisions. However, analyzing and mining data of poor quality can lead to wrong conclusions. Several data quality (DQ) problems can appear during a data warehouse project, such as missing values, duplicate values, integrity constraint violations and more. As a result, organizations around the world are more aware of the importance of data quality and invest heavily in managing data quality in DW systems. On the other hand, with the arrival of Big Data, new challenges have to be considered, such as the need to collect the most recent data and the ability to make real-time decisions. This article provides a survey of the existing techniques to control the quality of the data stored in DW systems and the new solutions proposed in the literature to meet the new Big Data requirements.
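The data quality problems the paragraph names can be checked with straightforward rules during warehouse loading. The record layout below ("id", "amount") and the non-negativity constraint are assumptions for the sketch:

```python
# Illustrative checks for the data quality problems named above:
# missing values, duplicate records, and a simple integrity constraint.
# The record layout ("id", "amount") is assumed for this sketch.

records = [
    {"id": 1, "amount": 100.0},
    {"id": 2, "amount": None},   # missing value
    {"id": 1, "amount": 100.0},  # duplicate key
    {"id": 3, "amount": -5.0},   # violates the amount >= 0 constraint
]

missing = [r for r in records if r["amount"] is None]

seen, duplicates = set(), []
for r in records:
    if r["id"] in seen:
        duplicates.append(r)
    seen.add(r["id"])

violations = [
    r for r in records
    if r["amount"] is not None and r["amount"] < 0
]
```

In DW practice such rules usually run in the ETL pipeline, quarantining or repairing offending rows before they reach the warehouse tables.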


2012 ◽  
Author(s):  
Nurul A. Emran ◽  
Noraswaliza Abdullah ◽  
Nuzaimah Mustafa

1999 ◽  
Vol 38 (04/05) ◽  
pp. 339-344 ◽  
Author(s):  
J. van der Lei ◽  
B. M. Th. Mosseveld ◽  
M. A. M. van Wijk ◽  
P. D. van der Linden ◽  
M. C. J. M. Sturkenboom ◽  
...  

Abstract Researchers claim that data in electronic patient records can be used for a variety of purposes including individual patient care, management, and resource planning for scientific research. Our objective in the project Integrated Primary Care Information (IPCI) was to assess whether the electronic patient records of Dutch general practitioners contain sufficient data to perform studies in the area of postmarketing surveillance. We determined the data requirements for postmarketing surveillance studies, implemented additional software in the electronic patient records of the general practitioner, developed an organization to monitor the use of data, and performed validation studies to test the quality of the data. Analysis of the data requirements showed that additional software had to be installed to collect data not recorded in routine practice. To avoid having to obtain informed consent from each enrolled patient, we developed IPCI as a semianonymous system: both patients and participating general practitioners are anonymous to the researchers. Under specific circumstances, the researcher can contact indirectly (through a trusted third party) the physician who made the data available. Only the treating general practitioner is able to decode the identity of his patients. A Board of Supervisors, predominantly consisting of participating general practitioners, monitors the use of data. Validation studies show the data can be used for postmarketing surveillance. With additional software to collect data not normally recorded in routine practice, data from electronic patient records of general practitioners can be used for postmarketing surveillance.
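The semianonymous design described in the abstract can be illustrated with a pseudonymization sketch: researchers see only opaque identifiers, while the ability to reverse the mapping stays with the treating practice. The keyed hashing below is an illustrative assumption, not the actual IPCI implementation:

```python
# Illustrative pseudonymization: researchers receive only an opaque
# identifier; the secret key (and thus the reverse mapping) stays with
# the practice or a trusted third party. Not the IPCI implementation.
import hashlib
import hmac

PRACTICE_SECRET = b"kept-only-by-the-practice"  # assumed local secret

def pseudonym(patient_id: str) -> str:
    """Derive a stable, non-reversible research identifier."""
    digest = hmac.new(PRACTICE_SECRET, patient_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]

research_id = pseudonym("patient-0042")
```

Because the same input always yields the same pseudonym, records can be linked across extracts for longitudinal analysis without revealing the patient's identity to researchers.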

