Domain-specific Common Data Elements for Rare Disease Registration: A Conceptual Approach of a European Joint Initiative towards Semantic Interoperability in Rare Disease Research (Preprint)

Semantic modelling of Common Data Elements for Rare Disease registries, and a prototype workflow for their deployment over registry data

10.1101/2021.07.27.21261169 ◽

2021 ◽

Author(s):

Rajaram Kaliyaperumal ◽

Mark D Wilkinson ◽

Pablo Alarcon Moreno ◽

Nirupama Benis ◽

Ronald Cornet ◽

...

Keyword(s):

Rare Disease ◽

Data Repositories ◽

Common Data Elements ◽

Human Phenotype ◽

Domain Experts ◽

Semantic Modelling ◽

Disease Registries ◽

Data Elements ◽

The Eu ◽

Core Framework

Background: The European Platform on Rare Disease Registration (EU RD Platform) aims to address the fragmentation of European rare disease (RD) patient data, scattered among hundreds of independent and non-coordinating registries, by establishing standards for integration and interoperability. The first practical output of this effort was a set of 16 Common Data Elements (CDEs) that should be implemented by all RD registries. Interoperability, however, requires decisions beyond data elements, including data models, formats, and semantics. Within the European Joint Programme on Rare Disease (EJP RD), we aim to further the goals of the EU RD Platform by generating reusable RD semantic model templates that follow the FAIR Data Principles. Results: Through a team-based iterative approach, we created semantically grounded models to represent each of the CDEs, using the SemanticScience Integrated Ontology as the core framework for representing the entities and their relationships. Within that framework, we mapped the concepts represented in the CDEs, and their possible values, into domain ontologies such as the Orphanet Rare Disease Ontology, Human Phenotype Ontology and National Cancer Institute Thesaurus. Finally, we created an exemplar, reusable ETL pipeline that we will be deploying over these non-coordinating data repositories to assist them in creating model-compliant FAIR data without requiring site-specific coding or expertise in Linked Data or FAIR. Conclusions: Within the EJP RD project, we determined that creating reusable, expert-designed templates reduced or eliminated the requirement for our participating biomedical domain experts and rare disease data hosts to understand OWL semantics. This enabled them to publish highly expressive FAIR data using tools and approaches that were already familiar to them.
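The template idea described in this abstract can be illustrated with a minimal sketch. It is not the EJP RD pipeline: the column names (`patient_id`, `diagnosis_iri`), the `example.org` IRIs, and the two predicates are all hypothetical placeholders, not actual SIO, ORDO, or HPO terms. The sketch only shows how an expert-written template lets a registry turn a plain CSV row into RDF-style triples without writing any semantic modelling code itself.

```python
import csv
import io

EX = "https://example.org/"  # placeholder namespace, assumed for illustration

# Hypothetical template for a single "diagnosis" CDE. Each entry is a
# (subject, predicate, object) pattern; the predicates are stand-ins and
# are NOT real SIO property IRIs.
DIAGNOSIS_TEMPLATE = [
    ("{base}patient/{pid}", EX + "vocab/hasAttribute", "{base}diagnosis/{pid}"),
    ("{base}diagnosis/{pid}", EX + "vocab/hasValue", "{ontology_iri}"),
]


def row_to_triples(row, base=EX + "registry/"):
    """Fill the template with the values of one CSV row."""
    fields = {
        "base": base,
        "pid": row["patient_id"],
        "ontology_iri": row["diagnosis_iri"],
    }
    return [
        (s.format(**fields), p, o.format(**fields))
        for s, p, o in DIAGNOSIS_TEMPLATE
    ]


def transform(csv_text):
    """Extract rows from CSV text and transform each one into triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        triples.extend(row_to_triples(row))
    return triples


def to_ntriples(triples):
    """Serialize IRI-only triples in N-Triples syntax."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)
```

In this arrangement the data host only maintains the CSV; the modelling decisions live in the template, which is the division of labour the abstract reports as useful.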

A methodology for a minimum data set for rare diseases to support national centers of excellence for healthcare and research

Journal of the American Medical Informatics Association ◽

10.1136/amiajnl-2014-002794 ◽

2014 ◽

Vol 22 (1) ◽

pp. 76-85 ◽

Cited By ~ 19

Author(s):

Rémy Choquet ◽

Meriem Maaroufi ◽

Albane de Carrara ◽

Claude Messiaen ◽

Emmanuel Luigi ◽

...

Keyword(s):

Data Collection ◽

Rare Disease ◽

Rare Diseases ◽

Care Coordination ◽

Minimum Data Set ◽

National Plan ◽

Data Set ◽

Common Data Elements ◽

Minimum Data ◽

Data Elements

Background: Although rare disease patients make up approximately 6–8% of all patients in Europe, it is often difficult to find the necessary expertise for diagnosis and care and the patient numbers needed for rare disease research. The second French National Plan for Rare Diseases highlighted the necessity for better care coordination and epidemiology for rare diseases. A clinical data standard for normalization and exchange of rare disease patient data was proposed. The original methodology used to build the French national minimum data set (F-MDS-RD) common to the 131 expert rare disease centers is presented. Methods: To encourage consensus at a national level for homogeneous data collection at the point of care for rare disease patients, we first identified four national expert groups. We reviewed the scientific literature for rare disease common data elements (CDEs) in order to build the first version of the F-MDS-RD. The French rare disease expert centers validated the data elements (DEs). The resulting F-MDS-RD was reviewed and approved by the National Plan Strategic Committee. It was then represented in an HL7 electronic format to maximize interoperability with electronic health records. Results: The F-MDS-RD is composed of 58 DEs in six categories: patient, family history, encounter, condition, medication, and questionnaire. It is HL7 compatible and can use various ontologies for diagnosis or sign encoding. The F-MDS-RD was aligned with other CDE initiatives for rare diseases, thus facilitating potential interconnections between rare disease registries. Conclusions: The French F-MDS-RD was defined through national consensus. It can foster better care coordination and facilitate determining rare disease patients’ eligibility for research studies, trials, or cohorts. Since other countries will need to develop their own standards for rare disease data collection, they might benefit from the methods presented here.

Importance of common data elements (CDE) for rare disease clinical trials

Molecular Genetics and Metabolism ◽

10.1016/j.ymgme.2015.12.436 ◽

2016 ◽

Vol 117 (2) ◽

pp. S105

Author(s):

Elsa G. Shapiro ◽

Kathleen A. Delaney

Keyword(s):

Clinical Trials ◽

Rare Disease ◽

Common Data Elements ◽

Data Elements

The EPIRARE proposal of a set of indicators and common data elements for the European platform for rare disease registration

Archives of Public Health ◽

10.1186/2049-3258-72-35 ◽

2014 ◽

Vol 72 (1) ◽

Cited By ~ 16

Author(s):

Domenica Taruscio ◽

Emanuela Mollo ◽

Sabina Gainotti ◽

Manuel Posada de la Paz ◽

Fabrizio Bianchi ◽

...

Keyword(s):

Rare Disease ◽

Common Data Elements ◽

Data Elements

Technical advance articles Composite CDE: modeling composite relationships between common data elements for representing complex clinical data

10.21203/rs.2.11646/v2 ◽

2020 ◽

Author(s):

Hye Hyeon Kim ◽

Yu Rang Park ◽

Ju Han Kim

Keyword(s):

Clinical Data ◽

Medical Information ◽

Bulk Sample ◽

Semantic Interoperability ◽

Teaching Hospitals ◽

Complex Data ◽

Common Data Elements ◽

Sample Data ◽

Semantic Types ◽

Data Elements

Background: Semantic interoperability is essential for improving data quality and sharing. The ISO/IEC 11179 Metadata Registry (MDR) standard has been highlighted as a solution for standardizing and registering clinical data elements (DEs). However, the standard model has both structural and semantic limitations, and the number of DEs continues to increase due to poor term reusability. Semantic types and constraints are lacking for comprehensively describing and evaluating DEs on real-world clinical documents. Methods: We addressed these limitations by defining three new types of semantic relationship (dependency, composite, and variable) in our previous studies. The present study created new and further extended existing semantic types (hybrid atomic and repeated and dictionary composite common data elements [CDEs]) with four constraints: ordered, operated, required, and dependent. For evaluation, we extracted all atomic and composite CDEs from five major clinical documents from five teaching hospitals in Korea, 14 Fast Healthcare Interoperability Resources (FHIR) resources from FHIR bulk sample data, and the MIMIC-III (Medical Information Mart for Intensive Care) demo dataset. Metadata reusability and semantic interoperability in real clinical settings were comprehensively evaluated by applying the CDEs with our extended semantic types and constraints. Results: All of the CDEs (n=1142) extracted from the 25 clinical documents were successfully integrated with a very high CDE reuse ratio (46.9%) into 586 CDEs (259 atomic and 20 unique composite CDEs), and all of the CDEs (n=238) extracted from the 14 FHIR resources of FHIR bulk sample data were successfully integrated with a high CDE reuse ratio (59.7%) into 96 CDEs (21 atomic and 28 unique composite CDEs), which improved the semantic integrity and interoperability without any semantic loss. Moreover, the most complex data structures from two CDE projects were successfully encoded with rich semantics and semantic integrity. Conclusion: MDR-based extended semantic types and constraints can facilitate comprehensive representation of clinical documents with rich semantics, and improved semantic interoperability without semantic loss.

Technical advance articles Composite CDE: modeling composite relationships between common data elements for representing complex clinical data

10.21203/rs.2.11646/v1 ◽

2019 ◽

Author(s):

Hye Hyeon Kim ◽

Yu Rang Park ◽

Ju Han Kim

Keyword(s):

Clinical Data ◽

Semantic Interoperability ◽

Teaching Hospitals ◽

Clinical Settings ◽

Complex Data ◽

Common Data Elements ◽

The Standard Model ◽

Semantic Types ◽

Data Elements ◽

Very High

Background: Semantic interoperability is essential for improving data quality and sharing. The ISO/IEC 11179 Metadata Registry (MDR) standard has been highlighted as a solution for standardizing and registering clinical data elements (DEs). However, the standard model has both structural and semantic limitations, and the number of DEs continues to increase due to poor term reusability. Semantic types and constraints are lacking for comprehensively describing and evaluating DEs on real-world clinical documents. Methods: We addressed these limitations by defining three new types of semantic relationship (dependency, composite, and variable) in our previous studies. The present study further extended semantic types (hybrid atomic and repeated and dictionary composite common data elements [CDEs]) with four constraints: ordered, operated, required, and dependent. For evaluation, we extracted all atomic and composite CDEs from five major clinical documents from five teaching hospitals in Korea. Metadata reusability and semantic interoperability in real clinical settings were comprehensively evaluated by applying the CDEs with our extended semantic types and constraints. Results: All of the CDEs (n=1142) extracted from the 25 clinical documents were successfully integrated with a very high CDE reuse ratio (46.9%) into 606 CDEs (259 atomic and 20 unique composite CDEs), which improved the semantic integrity and interoperability without any semantic loss. Moreover, the most complex data structures from two CDE projects were successfully encoded with rich semantics and semantic integrity. Conclusion: MDR-based extended semantic types and constraints can facilitate comprehensive representation of clinical documents with rich semantics and improved semantic interoperability without semantic loss.
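The abstract does not define how the reuse ratio is computed. One reading that reproduces the reported figure is the share of extracted CDEs that did not need a new entry after integration, i.e. 1 minus integrated over extracted. The short sketch below works through that assumed definition with the numbers given above (1142 extracted, 606 after integration); the function name is hypothetical.

```python
def reuse_ratio(extracted: int, integrated: int) -> float:
    """Assumed definition: fraction of extracted CDEs absorbed by reuse.

    extracted  -- number of CDEs extracted from the source documents
    integrated -- number of distinct CDEs remaining after integration
    """
    if extracted <= 0:
        raise ValueError("extracted must be positive")
    return 1 - integrated / extracted


# Figures reported in the abstract: 1142 CDEs integrated into 606.
percent = round(100 * reuse_ratio(1142, 606), 1)
print(percent)  # 46.9
```

Under this assumption the result agrees with the 46.9% stated in the abstract.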

TBI surveillance using the common data elements for traumatic brain injury: a population study

International Journal of Emergency Medicine ◽

10.1186/1865-1380-6-5 ◽

2013 ◽

Vol 6 (1) ◽

Cited By ~ 11

Author(s):

Latha Ganti Stead ◽

Aakash N Bodhit ◽

Pratik Shashikant Patel ◽

Yasamin Daneshvar ◽

...

Keyword(s):

Traumatic Brain Injury ◽

Brain Injury ◽

Population Study ◽

Common Data Elements ◽

The Common ◽

Data Elements

The development and deployment of Common Data Elements for tissue banks for translational research in cancer – An emerging standard based approach for the Mesothelioma Virtual Tissue Bank

BMC Cancer ◽

10.1186/1471-2407-8-91 ◽

2008 ◽

Vol 8 (1) ◽

Cited By ~ 27

Author(s):

Sambit K Mohanty ◽

Amita T Mistry ◽

Waqas Amin ◽

Anil V Parwani ◽

Andrew K Pople ◽

...

Keyword(s):

Translational Research ◽

Tissue Bank ◽

Tissue Banks ◽

Common Data Elements ◽

Data Elements

The involvement of patient organisations in rare disease research: a mixed methods study in Australia

Orphanet Journal of Rare Diseases ◽

10.1186/s13023-016-0382-6 ◽

2016 ◽

Vol 11 (1) ◽

Cited By ~ 21

Author(s):

Deirdre Pinto ◽

Dominique Martin ◽

Richard Chenhall

Keyword(s):

Mixed Methods ◽

Rare Disease ◽

Mixed Methods Study ◽

Disease Research ◽

Patient Organisations

Abstract TP374: Data Linkage is Effective for Improving the Available Data for Stroke: An Example from the Australian Stroke Clinical Registry.

Stroke ◽

10.1161/str.44.suppl_1.atp374 ◽

2013 ◽

Vol 44 (suppl_1) ◽

Author(s):

Monique F Kilkenny ◽

Helen M Dewey ◽

Natasha A Lannin ◽

Vijaya Sundararajan ◽

Joyce Lim ◽

...

Keyword(s):

Hospital Discharge ◽

Hospital Discharge Data ◽

Hospital Data ◽

High Quality ◽

Clinical Registry ◽

Large Hospital ◽

Common Data Elements ◽

Discharge Data ◽

Data Elements ◽

Data Collections

Introduction: Multiple data collections can be a burden for clinicians. In 2009, the Australian Stroke Clinical Registry (AuSCR) was established by non-government and research organizations to provide quality of care data unavailable for acute stroke admissions. We show here the reliability of linking complementary registry data with routinely collected hospital discharge data submitted to governmental bodies. Hypothesis: A high quality linkage with a >90% rate is possible, but requires multiple personal identifiers common to each dataset. Methods: AuSCR identifying variables included date of birth (DoB), Medicare number, first name, surname, postcode, gender, hospital record number, hospital name and admission date. The Victorian Department of Health emergency department (ED) and hospital discharge linked dataset has most of these, with first name truncated to the first 3 characters, but no surname. Common data elements of AuSCR patients registered at a large hospital in Melbourne, Victoria (Australia) between 15 June 2009 and 31 December 2010 were submitted to undergo stepwise deterministic linkage. Results: The Victorian AuSCR sample had 818 records from 788 individuals. Three steps with 1) Medicare number, postcode, gender and DoB (80% matched); 2) hospital number/admit date; and 3) ED number/visit date were required to link AuSCR data with the ED and hospital discharge data. These led to an overall high quality linkage of >99% (782/788) of AuSCR patients, including 731/788 for ED records and 736/788 for hospital records. Conclusion: Multiple personal identifiers from registries are required to achieve reliable linkage to routinely collected hospital data. Benefits of these linked data include the ability to investigate a broader range of research questions than with a single dataset.
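Stepwise deterministic linkage, as named in the Methods, can be sketched as follows. This is an illustration under assumptions, not the AuSCR procedure: the field names are hypothetical, a single target dataset stands in for the combined ED and hospital discharge data, and the rule that a step accepts only an exact, unique match is an assumption of this sketch. Records left unmatched by one step are passed to the next, less specific step.

```python
# Hypothetical field names; the order mirrors the three steps in the abstract.
STEPS = [
    ("medicare", "postcode", "gender", "dob"),
    ("hospital_number", "admit_date"),
    ("ed_number", "visit_date"),
]


def stepwise_link(registry, target, steps=STEPS):
    """Link registry records to target records one step at a time.

    Returns (links, unmatched_ids), where links maps a registry id to
    (target id, number of the step that produced the match).
    """
    links = {}
    unmatched = list(registry)
    for step_no, keys in enumerate(steps, start=1):
        # Index the target dataset on this step's identifiers.
        index = {}
        for rec in target:
            key = tuple(rec.get(k) for k in keys)
            if None not in key:  # skip records missing an identifier
                index.setdefault(key, []).append(rec)

        remaining = []
        for rec in unmatched:
            key = tuple(rec.get(k) for k in keys)
            candidates = index.get(key, []) if None not in key else []
            if len(candidates) == 1:  # assumed rule: exact and unique
                links[rec["id"]] = (candidates[0]["id"], step_no)
            else:
                remaining.append(rec)
        unmatched = remaining
    return links, [rec["id"] for rec in unmatched]
```

Recording the step number alongside each link makes it possible to report per-step match rates, as the abstract does for the first step.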
