A CF data model and implementation

2017 ◽  
Vol 10 (12) ◽  
pp. 4619-4646 ◽  
Author(s):  
David Hassell ◽  
Jonathan Gregory ◽  
Jon Blower ◽  
Bryan N. Lawrence ◽  
Karl E. Taylor

Abstract. The CF (Climate and Forecast) metadata conventions are designed to promote the creation, processing, and sharing of climate and forecasting data using Network Common Data Form (netCDF) files and libraries. The CF conventions provide a description of the physical meaning of data and of their spatial and temporal properties, but they depend on the netCDF file encoding, which can currently only be fully understood and interpreted by someone familiar with the rules and relationships specified in the conventions documentation. To aid in the development of CF-compliant software and to capture, with a minimal set of elements, all of the information contained in the CF conventions, we propose a formal data model for CF which is independent of netCDF and describes all possible CF-compliant data. Because such data will often be analysed and visualised using software based on other data models, we compare our CF data model with the ISO 19123 coverage model, the Open Geospatial Consortium CF netCDF standard, and the Unidata Common Data Model. To demonstrate that this CF data model can in fact be implemented, we present cf-python, a Python software library that conforms to the model and can manipulate any CF-compliant dataset.
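The central idea of the proposed data model is that a single "field" bundles an n-dimensional data array with the coordinate constructs that give its axes physical meaning. A minimal sketch of that idea in plain Python follows; the class and attribute names here are illustrative only, not the cf-python API or the model's formal element names.

```python
from dataclasses import dataclass, field

# Illustrative sketch: a field construct pairs a data array with coordinate
# constructs describing its axes. Names are hypothetical, not cf-python's API.

@dataclass
class DimensionCoordinate:
    standard_name: str  # CF standard name, e.g. "latitude"
    units: str          # e.g. "degrees_north"
    values: list        # 1-d coordinate values along the axis

@dataclass
class Field:
    standard_name: str
    units: str
    data: list                                       # the data array
    coordinates: dict = field(default_factory=dict)  # axis -> coordinate

# A minimal field: air temperature sampled at two latitudes
lat = DimensionCoordinate("latitude", "degrees_north", [-30.0, 30.0])
tas = Field("air_temperature", "K", [280.1, 295.6], {"latitude": lat})

print(tas.coordinates["latitude"].units)  # degrees_north
```

The point of such a model is exactly what the abstract argues: the physical description travels with the data, independently of how a netCDF file happens to encode it.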


2020 ◽  
Author(s):  
Stephany N Duda ◽  
Beverly S Musick ◽  
Mary-Ann Davies ◽  
Annette H Sohn ◽  
Bruno Ledergerber ◽  
...  

Objective To describe content domains and applications of the IeDEA Data Exchange Standard, its development history, governance structure, and relationships to other established data models, as well as to share open source, reusable, scalable, and adaptable implementation tools with the informatics community. Methods In 2012, the International Epidemiology Databases to Evaluate AIDS (IeDEA) collaboration began development of a data exchange standard, the IeDEA DES, to support collaborative global HIV epidemiology research. With the HIV Cohorts Data Exchange Protocol as a template, a global group of data managers, statisticians, clinicians, informaticians, and epidemiologists reviewed existing data schemas and clinic data procedures to develop the HIV data exchange model. The model received a substantial update in 2017, with annual updates thereafter. Findings The resulting IeDEA DES is a patient-centric common data model designed for HIV research that has been informed by established data models from US-based electronic health records, broad experience in data collection in resource-limited settings, and informatics best practices. The IeDEA DES is inherently flexible and continues to grow based on the ongoing stewardship of the IeDEA Data Harmonization Working Group with input from external collaborators. Use of the IeDEA DES has improved multiregional collaboration within and beyond IeDEA, expediting over 95 multiregional research projects using data from more than 400 HIV care and treatment sites across seven global regions. A detailed data model specification and REDCap data entry templates that implement the IeDEA DES are publicly available on GitHub. Conclusions The IeDEA common data model and related resources are powerful tools to foster collaboration and accelerate science across research networks. While currently directed towards observational HIV research and data from resource-limited settings, this model is flexible and extendable to other areas of health research.
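A "patient-centric" common data model of this kind typically organizes longitudinal clinical data into tables joined on a shared patient identifier. The sketch below illustrates that layout; the table and column names are hypothetical examples, not the published IeDEA DES specification.

```python
# Illustrative patient-centric layout in the spirit of a data exchange
# standard: a one-row-per-patient table plus longitudinal tables, all keyed
# on a shared patient identifier. Names are hypothetical, not the IeDEA DES.

basic = [  # one row per patient: demographics and enrolment
    {"patient_id": "P001", "birth_date": "1980-04-12", "sex": "F",
     "enrol_date": "2015-06-01"},
]

lab = [    # many rows per patient: longitudinal laboratory results
    {"patient_id": "P001", "lab_date": "2015-06-15", "test": "CD4", "value": 412},
    {"patient_id": "P001", "lab_date": "2016-01-10", "test": "CD4", "value": 530},
]

def patient_record(pid):
    """Assemble one patient's record by joining tables on patient_id."""
    return {
        "basic": next(r for r in basic if r["patient_id"] == pid),
        "lab": [r for r in lab if r["patient_id"] == pid],
    }

rec = patient_record("P001")
print(len(rec["lab"]))  # 2
```

Keying every table on the patient identifier is what lets contributing sites exchange partial datasets that can still be merged into complete patient histories.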


10.2196/15199 ◽  
2019 ◽  
Vol 7 (4) ◽  
pp. e15199 ◽  
Author(s):  
Emily Rose Pfaff ◽  
James Champion ◽  
Robert Louis Bradford ◽  
Marshall Clark ◽  
Hao Xu ◽  
...  

Background In a multisite clinical research collaboration, institutions may or may not use the same common data model (CDM) to store clinical data. To overcome this challenge, we proposed to use Health Level 7’s Fast Healthcare Interoperability Resources (FHIR) as a meta-CDM—a single standard to represent clinical data. Objective In this study, we aimed to create an open-source application termed the Clinical Asset Mapping Program for FHIR (CAMP FHIR) to efficiently transform clinical data to FHIR for supporting source-agnostic CDM-to-FHIR mapping. Methods Mapping with CAMP FHIR involves (1) mapping each source variable to its corresponding FHIR element and (2) mapping each item in the source data’s value sets to the corresponding FHIR value set item for variables with strict value sets. To date, CAMP FHIR has been used to transform 108 variables from the Informatics for Integrating Biology & the Bedside (i2b2) and Patient-Centered Outcomes Research Network data models to fields across 7 FHIR resources. It is designed to allow input from any source data model and will support additional FHIR resources in the future. Results We have used CAMP FHIR to transform data on approximately 23,000 patients with asthma from our institution’s i2b2 database. Data quality and integrity were validated against the origin point of the data, our enterprise clinical data warehouse. Conclusions We believe that CAMP FHIR can serve as an alternative to implementing new CDMs on a project-by-project basis. Moreover, the use of FHIR as a CDM could support rare data sharing opportunities, such as collaborations between academic medical centers and community hospitals. We anticipate adoption and use of CAMP FHIR to foster sharing of clinical data across institutions for downstream applications in translational research.
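The two mapping steps described in the Methods section can be sketched as a pair of lookup tables: one from source variables to FHIR elements, and one translating each strict source value set into the corresponding FHIR value set. The mapping entries below are hypothetical examples for illustration, not CAMP FHIR's actual configuration, though the Patient.gender codes shown are FHIR's standard ones.

```python
# Sketch of the two CDM-to-FHIR mapping steps described in the abstract.
# The source variable names and mapping tables are hypothetical examples.

# Step 1: source variable -> (FHIR resource, FHIR element)
VARIABLE_MAP = {
    "PATIENT_NUM": ("Patient", "id"),
    "SEX_CD": ("Patient", "gender"),
    "BIRTH_DATE": ("Patient", "birthDate"),
}

# Step 2: for variables with strict value sets, source item -> FHIR item
VALUE_SET_MAP = {
    "SEX_CD": {"F": "female", "M": "male", "U": "unknown"},
}

def to_fhir(source_row):
    """Transform one flat source record into FHIR resource dictionaries."""
    resources = {}
    for var, value in source_row.items():
        if var not in VARIABLE_MAP:
            continue  # unmapped source variables are skipped
        resource, element = VARIABLE_MAP[var]
        if var in VALUE_SET_MAP:  # translate strict value sets
            value = VALUE_SET_MAP[var][value]
        resources.setdefault(resource, {"resourceType": resource})[element] = value
    return resources

out = to_fhir({"PATIENT_NUM": "12345", "SEX_CD": "F", "BIRTH_DATE": "1980-04-12"})
print(out["Patient"]["gender"])  # female
```

Because only the two lookup tables are specific to the source data model, the same transformation loop can in principle serve any source CDM, which is the "source-agnostic" property the abstract claims.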


Author(s):  
Hanning Wang ◽  
Weixiang Xu ◽  
Chaolong Jia

Integrating distributed railway systems requires information exchange, resource sharing, and process coordination across fields, departments, and application systems, and railway data integration is essential to achieving this. To resolve the heterogeneity of data models among the data sources of different railway operation systems, this paper presents a novel spatially structured integration data model: an XML-oriented, three-dimensional common data model. The proposed model accommodates both flexible level relationships and flexible syntax expression in data integration. In this model, a spatial data pattern describes and expresses the characteristic relationships of data items among all types of data. Because the model is based on a rooted directed graph, with level-based organization and flexible expression, it can represent mappings between different data models, including the relational model and the object-oriented model. A consistent conceptual and algebraic description of the data set serves as the metadata in data integration, so that the algebraic manipulation of data integration is standardized to support the data integration of distributed systems.
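An XML document is naturally a rooted tree, which makes it a convenient carrier for the level relationships the paper describes. The fragment below is an illustrative sketch only: the element and attribute names are hypothetical, chosen to show how a data item from a relational source system might be placed into a hierarchical (level-organized) common model.

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch of level-organized railway data in XML. Element and
# attribute names are invented for illustration, not the paper's schema.

root = ET.Element("RailwayData")
line = ET.SubElement(root, "Line", id="L1")
station = ET.SubElement(line, "Station", id="S1", name="Central")

# A data item mapped in from a relational source system, annotated with its
# origin so the mapping between models remains traceable
ET.SubElement(station, "DataItem", source="relational",
              table="stations", column="platform_count").text = "6"

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Recording the source model on each data item is one simple way such a common model can represent the mapping from both relational and object-oriented sources while keeping a single hierarchical view.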


Author(s):  
Adrian Levy ◽  
Robert Platt ◽  
Soko Setoguchi ◽  
Jeffrey Brown ◽  
Michael Paterson

Over the past decade, characterizing the safety and effectiveness of drugs has advanced through distributed networks of data repositories in which investigators implement the same procedures to address the same topic using a common data model. Distributed networks for pharmacoepidemiology have now been established in the United States (US), Europe, Canada, and Asian countries. Sentinel in the US was developed in response to legislation and is funded by the US Food and Drug Administration to address its safety queries. The Observational Medical Outcomes Partnership (OMOP) is an international collaborative with a growing European data network that developed a common data model through a public-private partnership. The Canadian Network for Observational Drug Effect Studies (CNODES) receives funding and study queries from Health Canada, and results are disseminated directly back to the regulator as well as through the peer-reviewed literature. The Asian Pharmacoepidemiology Network (AsPEN) is an investigator-initiated multinational research network formed to support the safety and effectiveness assessment of medications and other therapeutics and to facilitate the prompt identification and validation of emerging safety issues among the countries of the Asia-Pacific region. While these networks have implemented two different common data models (CNODES with Sentinel, AsPEN with OMOP), each network differs from the others in aims, stage of implementation, operational approach, data quality assurance mechanisms, funding, and dissemination. The objectives of this session are to compare and contrast the roles and goals, design principles, implementation approaches, and analytic conventions and procedures of the common data models implemented by Sentinel, OMOP, CNODES, and AsPEN. Divided into seven 15-minute segments, the session begins with an overview of distributed networks of common data models for pharmacoepidemiology. 
In four slides, each presenter then characterizes their network by describing: the number of data holders, lives covered, and records; data holdings; the data access model; network governance; the process for transforming a repository's data into the common data model; target audience(s); the process for identifying queries and the knowledge dissemination plan; and two key challenges faced by the network and the lessons learned. In the next segment, the discussant identifies similarities and meaningful differences between the networks and articulates the relative strengths of the different approaches taken. This leads into the last segment, in which the floor is opened for questions and comments from the audience. The session would benefit researchers seeking to better understand or join an existing distributed network, as well as researchers interested in broadening their understanding of global comparative effectiveness research.

