SNOMED CT for processing of free text in healthcare: a systematic scoping review (Preprint)

Author(s):  
Christophe Gaudet-Blavignac ◽  
Vasiliki Foufi ◽  
Mina Bjelogrlic ◽  
Christian Lovis
2020 ◽  

BACKGROUND Interoperability and secondary usage of data are a challenge in healthcare. Specifically, the reuse of clinical free text is an unresolved problem. SNOMED CT is growing into the universal language of healthcare and presents characteristics similar to those of a natural language. Its usage to represent clinical free text could constitute a solution to improve interoperability. OBJECTIVE Although the usage of SNOMED and SNOMED CT has already been the subject of reviews, its specific usage to process and represent unstructured data such as clinical free text has not been the focus of an evaluation. This work aims at better understanding the use of SNOMED CT for natural language processing (NLP) in medicine by reviewing its usage on clinical free text. METHODS A scoping review was performed by searching MEDLINE, Embase, and Web of Science for publications featuring free-text processing and SNOMED CT. A recursive reference review was conducted to broaden the scope of the search. The review covered the type of data processed; the language targeted; the goal of the mapping to SNOMED CT; the method used; and, finally, the specific software used. RESULTS A final set of 76 publications was selected for extensive study. The most frequent types of document are complementary exam reports (23.68%) and narrative notes (21.05%). The language focus is English in 90.79% of publications. The mapping to SNOMED CT is the final goal of the research in 21.05% of publications, part of the final goal in 32.89%, and a step toward another goal in 46.05%. The main targets of the mapping to SNOMED CT are information extraction (38.94%), use as a feature in a classification task (23.01%), and data normalization (20.35%). The method used for the mapping is rule-based in 69.74% of publications, manual in 14.47%, hybrid in 10.53%, and machine learning in 5.26%. Twelve different software tools have been used to map text to SNOMED CT concepts, the most frequent being Medtex, MCVS, and MTERMS. The full terminology was used in 64.47% of publications, while only a subset of it was used in 30.26% of publications. Post-coordination was proposed in 17.11% of publications, and only 5.26% of publications specifically mentioned the usage of the SNOMED CT compositional grammar. CONCLUSIONS SNOMED CT has been widely used to process free-text data, most frequently with rule-based approaches and in English. However, to date there is no easy solution for mapping free text into SNOMED CT concepts, especially for languages other than English or when post-coordination is needed. Most of the solutions treat SNOMED CT as a simple terminology rather than as a compositional bag of ontologies. Since 2012, the number of publications per year on this subject has been decreasing. However, the need for formal semantic representation of free text in healthcare is high, and automatic encoding into a compositional ontology could be a way to achieve interoperability.
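Rule-based mapping, the most frequently reported method in this review, can be illustrated with a minimal dictionary-lookup sketch. This is not the method of any reviewed publication or tool: the lexicon, the function name `map_to_snomed`, and the matching strategy are assumptions made for illustration, and the concept identifiers should be verified against a current SNOMED CT release before any real use.

```python
import re

# Hypothetical mini-lexicon: surface term -> SNOMED CT concept identifier.
# The identifiers are given for illustration and should be checked against
# an official SNOMED CT release.
LEXICON = {
    "myocardial infarction": "22298006",
    "heart attack": "22298006",      # synonym mapped to the same concept
    "hypertension": "38341003",
    "diabetes mellitus": "73211009",
    "asthma": "195967001",
}

# Sort longest terms first so that a multi-word term is preferred over any
# shorter term that could match at the same position.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(term) for term in sorted(LEXICON, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def map_to_snomed(text):
    """Return (matched term, concept id) pairs in order of appearance."""
    matches = []
    for match in _PATTERN.finditer(text):
        term = match.group(1).lower()
        matches.append((term, LEXICON[term]))
    return matches


if __name__ == "__main__":
    note = "Patient with Hypertension and a prior heart attack."
    for term, concept_id in map_to_snomed(note):
        print(term, concept_id)
```

A lookup of this kind ignores negation, spelling variants, and post-coordination, which is consistent with the review's observation that most solutions treat SNOMED CT as a simple terminology.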


2020 ◽  
Author(s):  
Emma Chavez ◽  
Vanessa Perez ◽  
Angélica Urrutia

BACKGROUND Hypertension is currently one of the diseases with the greatest risk of mortality in the world. In Chile in particular, 90% of the population with this disease has idiopathic or essential hypertension. Essential hypertension is characterized by high blood pressure, and its cause is unknown, which means that every patient might require a different treatment, depending on their history and symptoms. Different data, such as history, symptoms, and exams, are generated for each patient suffering from the disease. These data are presented in the patient’s medical record in no particular order, making it difficult to search for relevant information. Therefore, there is a need for a common, unified vocabulary of terms that adequately represent the disease, making searching within the domain more effective. OBJECTIVE The objective of this study is to develop a domain ontology for essential hypertension that arranges the most significant data within the domain and thereby provides a tool for medical training or for supporting physicians’ decision making. METHODS The terms used for the ontology were extracted from the medical history sections of de-identified medical records of patients with essential hypertension. The SNOMED CT collection of medical terms and clinical guidelines for controlling the disease were also used. Methontology was used for the design, the definition of classes and their hierarchy, and the relationships between concepts and instances. Three criteria were used to validate the ontology, which also helped to measure its quality. Tests were run with a dataset to verify that the tool was created according to the requirements. RESULTS An ontology of 310 instances classified into 37 classes was developed. From these, 4 superclasses and 30 relationships were obtained. In the dataset tests, 100% correct and coherent answers were obtained for the quality tests (3). CONCLUSIONS The development of this ontology provides a tool for physicians, specialists, and students, among others, that can be incorporated into clinical systems to support decision making regarding essential hypertension. Nevertheless, more instances should be incorporated into the ontology by carrying out further searches in the medical history or free-text sections of the medical records of patients with this disease.


2021 ◽  
Author(s):  
Erina Chan ◽  
Serena S Small ◽  
Maeve E Wickham ◽  
Vicki Cheng ◽  
Ellen Balka ◽  
...  

BACKGROUND Existing systems to document adverse drug events often use free-text data entry, producing non-standardized, unstructured data prone to misinterpretation. Standardized terminology may improve data quality, but it is unclear which data standard is most appropriate to document adverse drug event symptoms and diagnoses. OBJECTIVE Our objective was to compare the utility, strengths, and weaknesses of different data standards for documenting adverse drug event symptoms and diagnoses. METHODS We performed a mixed-methods sub-study of a multicenter retrospective chart review. We reviewed research records of prospectively diagnosed adverse drug events at 5 Canadian hospitals. Two pharmacy research assistants independently entered symptoms and/or diagnoses for adverse drug events using 4 standards: MedDRA, SNOMED CT, SNOMED Adverse Reaction (SNOMED ADR), and ICD-11. Disagreements between research assistants regarding the case-specific utility of data standards were discussed until consensus was reached. We used consensus ratings to determine the proportion of adverse drug events covered by each data standard, and we coded and analyzed field notes from consensus sessions. RESULTS We reviewed 573 adverse drug events and found that MedDRA and ICD-11 had excellent coverage of adverse drug event symptoms and/or diagnoses. While MedDRA had the highest number of matches between the research assistants, ICD-11 had the fewest. SNOMED ADR had the lowest proportion of adverse drug event coverage. Research assistants were most likely to encounter terminological challenges with SNOMED ADR and usability challenges with ICD-11, and least likely with MedDRA. CONCLUSIONS Usability, comprehensiveness, and accuracy are important features of data standards for documenting adverse drug event symptoms and diagnoses. Based on our results, we would recommend the use of MedDRA.


BMJ Open ◽  
2020 ◽  
Vol 10 (2) ◽  
pp. e032668 ◽  
Author(s):  
Thorvaldur Skuli Palsson ◽  
Shellie Boudreau ◽  
Morten Høgh ◽  
Pablo Herrero ◽  
Pablo Bellosta-Lopez ◽  
...  

Background Musculoskeletal (MSK) pain is the primary contributor to disability worldwide. There is a growing consensus that MSK pain is a recurrent multifactorial condition underpinned by health and lifestyle factors. Studies suggest that education on work-related pain and individualised advice could be essential and effective for managing persistent MSK pain. Objective The objective of this scoping review was to map the existing educational resources for work-related MSK (WRMSK) pain, and the effects of implementing educational strategies in the workplace on managing WRMSK pain. Methods This scoping review assessed original studies that implemented and assessed education as a strategy to manage WRMSK pain. Literature search strategies were developed using thesaurus headings (ie, MeSH and CINAHL headings) and free-text search including words related to MSK in an occupational setting. The search was carried out in PubMed, CINAHL, Cochrane Library and Web of Science in the period 12–14 February 2019. Results A total of 19 peer-reviewed articles were included and the study design, aim and outcomes were summarised. Of the 19 peer-reviewed articles, 10 randomised controlled trial (RCT) studies assessed the influence of education on WRMSK pain. Many studies provided a limited description of the education material and assessed/used different methods of delivery. A majority of studies concluded that education positively influences WRMSK pain. Further, some studies reported additive effects of physical activity or ergonomic adjustments. Conclusions There is a gap in knowledge regarding the best content and delivery of educational material in the workplace. Although beneficial outcomes were reported, more RCT studies are required to determine the effects of education material as compared with other interventions, such as exercise or behavioural therapy.


2018 ◽  
Vol 28 (1) ◽  
pp. 39-47 ◽  
Author(s):  
Karen A Monsen ◽  
Joyce M Rudenick ◽  
Nicole Kapinos ◽  
Kathryn Warmbold ◽  
Siobhan K McMahon ◽  
...  

Background: Electronic health records (EHRs) are a promising new source of population health data that may improve health outcomes. However, little is known about the extent to which social and behavioral determinants of health (SBDH) are currently documented in EHRs, including how SBDH are documented, and by whom. Standardized nursing terminologies have been developed to assess and document SBDH. Objective: We examined the documentation of SBDH in EHRs with and without standardized nursing terminologies. Methods: We carried out a review of the literature for SBDH phrases organized by topic, which were used for analyses. Key informant interviews were conducted regarding SBDH phrases. Results: In nine EHRs (six acute care, three community care), 107 SBDH phrases were documented using free text, structured text, and standardized terminologies in diverse screens and by multiple clinicians, admitting personnel, and other staff. SBDH phrases were documented using one of three standardized terminologies (N = average number of phrases per terminology per EHR): ICD-9/10 (N = 1); SNOMED CT (N = 1); Omaha System (N = 79). Most often, standardized terminology data were documented by nurses or other clinical staff versus receptionists or other non-clinical personnel. Documentation ‘unknown’ differed significantly between EHRs with and without the Omaha System (mean = 26.0 (standard deviation (SD) = 8.7) versus mean = 74.5 (SD = 16.5)) (p = .005). SBDH documentation in EHRs differed based on the presence of a nursing terminology. Conclusions: The Omaha System enabled a more comprehensive, holistic assessment and documentation of interoperable SBDH data. Further research is needed to determine SBDH data elements that are needed across settings, the uses of SBDH data in practice, and to examine patient perspectives related to SBDH assessments.


2022 ◽  
Vol 22 (1) ◽  
Author(s):  
Christine Kersting ◽  
Julia Hülsmann ◽  
Klaus Weckbecker ◽  
Achim Mortsiefer

Abstract Background To be able to make informed choices based on their individual preferences, patients need to be adequately informed about treatment options and their potential outcomes. This implies that studies measure the effects of care based on parameters that are relevant to patients. In a previous scoping review, we found a wide variety of supposedly patient-relevant parameters that equally addressed processes and outcomes of care. We were unable to identify a consistent understanding of patient relevance and therefore aimed to develop an empirically based concept including a generic set of patient-relevant parameters. As a first step we evaluated the process and outcome parameters identified in the scoping review from the patients’ perspective. Methods We conducted a cross-sectional survey among German general practice patients. Ten research practices of Witten/Herdecke University supported the study. During a two-week period in the fall of 2020, patients willing to participate self-administered a short questionnaire. It evaluated the relevance of the 32 parameters identified in the scoping review on a 5-point Likert scale and offered a free-text field for additional parameters. These free-text answers were inductively categorized by two researchers. Quantitative data were analyzed using descriptive statistics. Bivariate analyses were performed to determine whether there are any correlations between rating a parameter as highly relevant and patients’ characteristics. Results Data from 299 patients were eligible for analysis. All outcomes except ‘sexuality’ and ‘frequency of healthcare service utilization’ were rated important. ‘Confidence in therapy’ was rated most important, followed by ‘prevention of comorbidity’ and ‘mobility’. Relevance ratings of five parameters were associated with patients’ age and gender, but not with their chronic status. 
The free-text analysis revealed 15 additional parameters, 12 of which addressed processes of care, e.g., ‘enough time in physician consultation’. Conclusion Patients attach great value to parameters addressing processes of care. It appears that the way in which patients experience the care process is no less relevant than what comes of it. Relevance ratings were not associated with chronic status, but a few parameters were gender- and age-related. Trial registration Core Outcome Measures in Effectiveness Trials Initiative, registration number: 1685.


2021 ◽  
Author(s):  
Christophe Gaudet-Blavignac ◽  
Andrea Rudaz ◽  
Christian Lovis

BACKGROUND Since the creation of the Problem Oriented Medical Record, the building of problem lists has been the focus of much research. To this day, this issue is not well resolved, and building an appropriate contextualized problem list is still a challenge. OBJECTIVE This paper presents the process of building a shared multi-purpose common problem list at the University Hospitals of Geneva, a consortium of all public hospitals and 30 outpatient clinics of the state of Geneva. This list aims at bridging the gap between clinicians’ language expressed in free text and secondary usages requiring structured information. METHODS The strategy focuses on the needs of clinicians by building a list of uniquely identified expressions to support their daily activities. In a second stage, these expressions are connected to additional information, building a complex graph of information. A list of 45,946 expressions manually extracted from clinical documents has been manually curated and encoded in multiple semantic dimensions, such as ICD-10, ICPC-2, SNOMED CT, or dimensions dictated by specific usages, such as identifying expressions specific to a domain, a gender, or an intervention. The list has been progressively deployed for clinicians with an iterative process of quality control, maintenance, and improvement, including the addition of new expressions or of dimensions for specific needs. The problem management module of the electronic health record (EHR) made it possible to measure and correct the encoding based on real-world usage. RESULTS The list was deployed in production in January 2017 and was regularly updated and deployed in new divisions of the hospital. In 4 years, 684,102 problems were created using the list. The proportion of free-text entries decreased progressively from 37.47% (8,321/22,206) in December 2017 to 18.38% (4,547/24,738) in December 2020. In the latest version of the list, over 14 dimensions were mapped to expressions, among them 5 international classifications and 8 other classifications for specific usages. The list became a central axis in the EHR, being used for many different purposes linked to care, such as surgical planning or emergency wards, or in research, for various predictions using machine learning techniques. CONCLUSIONS This work breaks with common approaches, first by focusing on clinicians’ real language when expressing patients’ problems and second by mapping whatever is required, including controlled vocabularies, to answer specific needs. This approach improves the quality of the expression of patients’ problems, while allowing as many structured dimensions as needed to be built to convey semantics according to specific contexts. The method is shown to be scalable, sustainable, and efficient at hiding from clinicians the complexity of semantics and the burden of constrained, structured problem list entry. Ongoing work is analyzing the impact of this approach on how clinicians express patients’ problems.
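The free-text proportions quoted in the results can be checked directly from the counts given in parentheses. The short sketch below recomputes both percentages and derives the absolute and relative reductions; the derived reductions are not stated in the abstract and are shown only as arithmetic on the reported counts. The function name `percentage` is an illustrative choice.

```python
def percentage(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Counts reported in the abstract: free-text entries / all problem entries.
dec_2017 = percentage(8_321, 22_206)
dec_2020 = percentage(4_547, 24_738)

# Derived values (not stated in the abstract): drop in percentage points
# and reduction relative to the December 2017 proportion.
absolute_drop = round(dec_2017 - dec_2020, 2)
relative_drop = round(100 * (dec_2017 - dec_2020) / dec_2017, 1)

if __name__ == "__main__":
    print(f"December 2017: {dec_2017}%")
    print(f"December 2020: {dec_2020}%")
    print(f"Absolute drop: {absolute_drop} percentage points")
    print(f"Relative drop: {relative_drop}%")
```

Both recomputed percentages agree with the values reported in the abstract.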


2014 ◽  
Vol 05 (02) ◽  
pp. 349-367 ◽  
Author(s):  
Y. Lu ◽  
C.J. Vitale ◽  
P.L. Mar ◽  
F. Chang ◽  
N. Dhopeshwarkar ◽  
...  

Summary Background: The ability to manage and leverage family history information in the electronic health record (EHR) is crucial to delivering high-quality clinical care. Objectives: We aimed to evaluate existing standards in representing relative information, examine this information documented in EHRs, and develop a natural language processing (NLP) application to extract relative information from free-text clinical documents. Methods: We reviewed a random sample of 100 admission notes and 100 discharge summaries of 198 patients, and also reviewed the structured entries for these patients in an EHR system’s family history module. We investigated the two standards used by Stage 2 of Meaningful Use (SNOMED CT and HL7 Family History Standard) and identified coverage gaps of each standard in coding relative information. Finally, we evaluated the performance of the MTERMS NLP system in identifying relative information from free-text documents. Results: The structure and content of SNOMED CT and HL7 for representing relative information are different in several ways. Both terminologies have high coverage to represent local relative concepts built in an ambulatory EHR system, but gaps in key concept coverage were detected; coverage rates for relative information in free-text clinical documents were 95.2% and 98.6%, respectively. Compared to structured entries, richer family history information was only available in free-text documents. Using a comprehensive lexicon that included concepts and terms of relative information from different sources, we expanded the MTERMS NLP system to extract and encode relative information in clinical documents and achieved a corresponding precision of 100% and recall of 97.4%. Conclusions: Comprehensive assessment and user guidance are critical to adopting standards into EHR systems in a meaningful way. A significant portion of patients’ family history information is only documented in free-text clinical documents and NLP can be used to extract this information. Citation: Zhou L, Lu Y, Vitale CJ, Mar PL, Chang F, Dhopeshwarkar N, Rocha RA. Representation of information about family relatives as structured data in electronic health records. Appl Clin Inf 2014; 5: 349–367 http://dx.doi.org/10.4338/ACI-2013-10-RA-0080
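For readers less familiar with the evaluation metrics quoted above, the following sketch shows how precision, recall, and the F1 score are computed. The abstract does not report the underlying counts, so the true-positive, false-positive, and false-negative numbers used here are hypothetical values chosen only so that they reproduce a precision of 100% and a recall of 97.4%; the F1 score is derived and is not a figure reported by the authors.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall, and F1 from confusion counts."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical counts (NOT from the paper): 974 correct extractions,
    # no spurious extractions, 26 missed mentions.
    p, r, f1 = precision_recall_f1(true_pos=974, false_pos=0, false_neg=26)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

With perfect precision, the F1 score is limited only by recall, so a system of this kind improves mainly by reducing missed mentions.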


2020 ◽  
Vol 134 ◽  
pp. 104035 ◽  
Author(s):  
Junglyun Kim ◽  
Tamara G.R. Macieira ◽  
Sarah L. Meyer ◽  
Margaret Ansell (Maggie) ◽  
Ragnhildur I. Bjarnadottir (Raga) ◽  
...  

Author(s):  
Iuliia D. Lenivtceva ◽  
Georgy Kopanitsa

Abstract Background The larger part of essential medical knowledge is stored as free text, which is difficult to process. Standardization of medical narratives is an important task for data exchange, integration, and semantic interoperability. Objectives The article aims to develop an end-to-end pipeline for structuring Russian free-text allergy anamnesis using international standards. Methods The pipeline for free-text data standardization is based on FHIR (Fast Healthcare Interoperability Resources) and SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms) to ensure semantic interoperability. The pipeline solves common tasks such as data preprocessing, classification, categorization, entity extraction, and semantic code assignment. Machine learning, rule-based, and dictionary-based approaches were used to compose the pipeline. The pipeline was evaluated on 166 randomly chosen medical records. Results The FHIR AllergyIntolerance resource was used to represent allergy anamnesis. The module for data preprocessing included a dictionary with over 90,000 words, including specific medication terms, and more than 20 regular expressions for error correction. The classification and categorization modules resulted in four dictionaries with allergy terms (2,675 terms in total), which were mapped to SNOMED CT concepts. F-scores for the different steps are 0.945 for filtering, 0.90 to 0.96 for allergy categorization, and 0.90 and 0.93 for allergen and reaction extraction, respectively. The allergy terminology coverage is more than 95%. Conclusion The proposed pipeline is a step toward ensuring semantic interoperability of Russian free-text medical records and could be effective in standardization systems for further data exchange and integration.
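The final step of such a pipeline, turning recognized terms into a coded FHIR resource, might look roughly like the sketch below. It is not the authors' implementation: the two tiny dictionaries, the function name `build_allergy_intolerance`, and the substring matching are assumptions for illustration, the resource layout follows FHIR R4 as an assumed version, and the SNOMED CT identifiers should be verified against an official release.

```python
import json

SNOMED_SYSTEM = "http://snomed.info/sct"

# Hypothetical dictionaries: Russian term -> (concept id, display[, category]).
# Identifiers are illustrative and should be verified before real use.
ALLERGY_LEXICON = {
    "пенициллин": ("91936005", "Allergy to penicillin", "medication"),
}
REACTION_LEXICON = {
    "крапивница": ("126485001", "Urticaria"),
}


def _concept(code, display):
    """Build a FHIR CodeableConcept with a single SNOMED CT coding."""
    return {"coding": [{"system": SNOMED_SYSTEM, "code": code, "display": display}]}


def build_allergy_intolerance(text, patient_ref):
    """Return AllergyIntolerance resources (as dicts) for terms found in text."""
    lowered = text.lower()
    manifestations = [
        _concept(code, display)
        for term, (code, display) in REACTION_LEXICON.items()
        if term in lowered
    ]
    resources = []
    for term, (code, display, category) in ALLERGY_LEXICON.items():
        if term not in lowered:
            continue
        resource = {
            "resourceType": "AllergyIntolerance",
            "category": [category],
            "code": _concept(code, display),
            "patient": {"reference": patient_ref},
        }
        if manifestations:
            # Simplification: every reaction found is attached to every allergy.
            resource["reaction"] = [{"manifestation": manifestations}]
        resources.append(resource)
    return resources


if __name__ == "__main__":
    note = "Аллергия на пенициллин: крапивница."
    found = build_allergy_intolerance(note, "Patient/example")
    print(json.dumps(found, ensure_ascii=False, indent=2))
```

Plain substring matching will miss inflected Russian word forms and misspellings, which is why the pipeline described in the abstract relies on preprocessing, error-correcting regular expressions, and much larger dictionaries.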

