linkage evaluation
Recently Published Documents



Author(s):  
Nicole Pratt ◽  
Danielle Chun ◽  
Kourtney Davis ◽  
Brad Hammill ◽  
Christian Hampp ◽  
...  

Introduction Increasingly in pharmacoepidemiology, linking is required to enrich analytic data to more accurately define study populations, enable adjustment for confounding, and improve capture of health outcomes. When creating such novel linked datasets, researchers should consider their suitability to meet research objectives, assess source data completeness and population coverage, and ensure well-defined data governance standards and protections exist. Additionally, while the RECORD-PE guidelines assist in the reporting of studies using observational health data specific to pharmacoepidemiology, they do not address the unique requirements for transparent evaluation and reporting of the data linkage process. Objectives and Approach We aimed to 1) provide guidance on data linkage appropriateness and feasibility to plan purposeful and sustainable new linkages that advance pharmacoepidemiological research and 2) generate a checklist with specific recommendations to assist researchers in providing clear and transparent assessment of the linkage process. To develop these guidelines, a working group comprising members of the International Society of Pharmacoepidemiology was formed. Recommendations were open for comment by Society members and endorsed by the Society. Results Guidance for feasibility assessment was categorized into five domains: (1) research objectives and justification; (2) data quality and completeness; (3) the linkage process; (4) data ownership and governance; and (5) overall value added by linkage. A checklist for evaluation and reporting of data-linkage processes covered five domains: (1) data sources; (2) linkage variables; (3) linkage methods; (4) linkage results; and (5) linkage evaluation, including validation and verification of the resulting linked data.
Conclusion/Implications Our guidelines for data linkage feasibility assessment and reporting can be used to inform the design of sustainable linked data resources and for transparent communication of linkage processes. Together, these guidelines will help various stakeholders to critically assess the potential for bias in research based on linked data and help generate actionable evidence.


Author(s):  
Charini Nanayakkara ◽  
Peter Christen ◽  
Thilina Ranbaduge ◽  
Eilidh Garrett

Introduction The robustness of record linkage evaluation measures is of high importance because linkage techniques are assessed using them. However, minimal research has been conducted to evaluate the suitability of existing evaluation measures in the context of linking groups of records. Linkage quality is generally evaluated based on traditional measures such as precision and recall. As we show, these traditional evaluation measures are not suitable for evaluating groups of linked records because they evaluate the quality of individual record pairs rather than the quality of records grouped into clusters. Objectives We highlight the shortcomings of traditional evaluation measures and then propose a novel method to evaluate clustering quality in the context of group-based record linkage. Methods The proposed linkage evaluation method assesses how well individual records have been allocated into predicted groups/clusters with respect to ground-truth data. We first identify the best representative predicted cluster for each ground-truth cluster and, based on the resulting mapping, each record in a ground-truth cluster is assigned to one of seven categories. These categories reflect how well the linkage technique assigned records into groups. Results We empirically evaluate our proposed method using real-world data and show that it better reflects the quality of clusters generated by three group-based record linkage techniques. We also show that traditional measures such as precision and recall can produce ambiguous results whereas our method does not. Conclusions The proposed evaluation method provides unambiguous results regarding the assessed group-based record linkage approaches. The method comprises seven categories which reflect how each record was predicted, providing more detailed information about the quality of the linkage result. This will help to make better-informed decisions about which linkage technique is best suited for a given linkage application.
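
As a rough illustration of the pair-based versus cluster-based distinction drawn in the abstract above, the following minimal Python sketch computes pairwise precision and recall from cluster assignments and maps each ground-truth cluster to a candidate "best representative" predicted cluster. The function names, the toy data, and the largest-overlap rule for choosing the representative are all assumptions made for illustration; the abstract does not state the mapping criterion or define the seven categories, so neither is reproduced here.

```python
from itertools import combinations


def pairs(clusters):
    """Return the set of unordered record pairs that share a cluster."""
    result = set()
    for cluster in clusters:
        for a, b in combinations(sorted(cluster), 2):
            result.add((a, b))
    return result


def pairwise_precision_recall(truth, predicted):
    """Traditional pair-based precision and recall for two clusterings."""
    true_pairs = pairs(truth)
    pred_pairs = pairs(predicted)
    tp = len(true_pairs & pred_pairs)
    precision = tp / len(pred_pairs) if pred_pairs else 0.0
    recall = tp / len(true_pairs) if true_pairs else 0.0
    return precision, recall


def best_representative(truth, predicted):
    """Map each ground-truth cluster index to the index of the predicted
    cluster sharing the most records with it (assumed criterion; ties go
    to the first such cluster, and None means no overlap at all)."""
    mapping = {}
    for i, t in enumerate(truth):
        best, best_overlap = None, 0
        for j, p in enumerate(predicted):
            overlap = len(t & p)
            if overlap > best_overlap:
                best, best_overlap = j, overlap
        mapping[i] = best
    return mapping


# Hypothetical toy data: two true groups of four records each.
truth = [{1, 2, 3, 4}, {5, 6, 7, 8}]
# Prediction A: both groups almost recovered, one record split off each.
pred_a = [{1, 2, 3}, {4}, {5, 6, 7}, {8}]
# Prediction B: one group recovered exactly, the other fully fragmented.
pred_b = [{1, 2, 3, 4}, {5}, {6}, {7}, {8}]

print(pairwise_precision_recall(truth, pred_a))  # (1.0, 0.5)
print(pairwise_precision_recall(truth, pred_b))  # (1.0, 0.5)
print(best_representative(truth, pred_a))        # {0: 0, 1: 2}
print(best_representative(truth, pred_b))        # {0: 0, 1: 1}
```

In this toy case the two predictions receive identical pairwise precision and recall even though their group structure differs markedly, which is one way pair-based measures can be ambiguous for group-based linkage. A per-record view built on the cluster mapping can tell the two apart.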


2019 ◽  
Vol 204 ◽  
pp. 275-284.e3 ◽  
Author(s):  
Ifrah Abdullahi ◽  
Kingsley Wong ◽  
Raewyn Mutch ◽  
Emma J. Glasson ◽  
Nicholas de Klerk ◽  
...  

Author(s):  
Tom Dalton ◽  
Graham Kirby ◽  
Alan Dearle ◽  
Özgür Akgün ◽  
Monique Mackenzie

Background ‘Gold-standard’ data to evaluate linkage algorithms are rare. Synthetic data have the advantage that all the true links are known. In the domain of population reconstruction, the ability to synthesize populations on demand, with varying characteristics, allows a linkage approach to be evaluated across a wide range of data. We have implemented ValiPop, a microsimulation model, for this purpose. Approach ValiPop can create many varied populations based upon sets of desired population statistics, thus allowing linkage algorithms to be evaluated across many populations, rather than across a limited number of real-world ‘gold-standard’ data sets. Given the potential interactions between different desired population statistics, the creation of a population does not necessarily imply that all desired population statistics have been met. To address this, we have developed a statistical approach to validate the adherence of created populations to the desired statistics, using a generalized linear model. This talk will discuss the benefits of synthetic data for data linkage evaluation, the approach to validating created populations, and present the results of some initial linkage experiments using our synthetic data.

