SNOMED CT for processing of free-text in healthcare: a systematic scoping review (Preprint)

Mapping Intimacies ◽

10.2196/preprints.24594 ◽

2020 ◽

Author(s):

Christophe Gaudet-Blavignac ◽

Vasiliki Foufi ◽

Mina Bjelogrlic ◽

Christian Lovis

Keyword(s):

Scoping Review ◽

Text Processing ◽

Semantic Representation ◽

Extensive Study ◽

Free Text ◽

Snomed Ct ◽

Rule Based ◽

Universal Language ◽

Final Goal ◽

Secondary Usage

BACKGROUND Interoperability and secondary use of data are a challenge in healthcare. Specifically, reuse of clinical free-text is an unresolved problem. SNOMED CT is growing into the universal language of healthcare and presents characteristics similar to a natural language. Its usage to represent clinical free-text could constitute a solution to improve interoperability. OBJECTIVE Although the usage of SNOMED and SNOMED CT has already been the subject of reviews, its specific usage to process and represent unstructured data such as clinical free-text has not been the focus of an evaluation. This work aims at better understanding the use of SNOMED CT for NLP in medicine by reviewing its usage on clinical free-text. METHODS A scoping review was performed on the topic by searching MedLine, Embase and Web of Science for publications featuring free-text processing and SNOMED CT. A recursive reference review was made to broaden the scope of the research. The review covered the type of data processed; the language targeted; the goal of the mapping to SNOMED CT; the method used; and finally, the specific software used. RESULTS A final set of 76 publications was selected for extensive study. The most frequent types of document are complementary exam reports (23.68%) and narrative notes (21.05%). The language focus is English in 90.79% of publications. The mapping to SNOMED CT is the final goal of the research in 21.05% of publications, part of the final goal in 32.89% and a step toward another goal in 46.05%. The main targets of the mapping to SNOMED CT are information extraction (38.94%), feature in a classification task (23.01%) and data normalization (20.35%). The method used for the mapping is rule-based in 69.74% of publications, manual in 14.47%, hybrid in 10.53%, and machine learning in 5.26%. Twelve different software tools were used to map text to SNOMED CT concepts, the most frequent being Medtex, MCVS and MTERMS. The full terminology was used in 64.47% of publications, while only a subset of it was used in 30.26% of publications. Post-coordination was proposed in 17.11% of publications and only 5.26% of publications specifically mentioned the usage of the SNOMED CT compositional grammar. CONCLUSIONS SNOMED CT has been largely used to process free-text data, most frequently with rule-based approaches, in English. However, to date there is no easy solution for mapping free-text into SNOMED CT concepts, especially for languages other than English or if post-coordination is needed. Most of the solutions conceive SNOMED CT as a simple terminology rather than as a compositional bag of ontologies. Since 2012, the number of publications on this subject per year has been decreasing. However, the need for formal semantic representation of free-text in healthcare is high and automatic encoding into a compositional ontology could be a way to achieve interoperability.
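
The percentages in the RESULTS section above can be converted back into approximate publication counts. The following is a minimal sketch, assuming every percentage was computed over the 76 selected publications; the abstract does not state the denominators, and the helper names `implied_count` and `is_consistent` are hypothetical, introduced only for illustration.

```python
TOTAL = 76  # publications selected for extensive study, per the abstract

# Percentages as reported in the abstract, grouped by review dimension.
reported = {
    "goal of mapping": {
        "final goal": 21.05,
        "part of final goal": 32.89,
        "step toward another goal": 46.05,
    },
    "method": {
        "rule-based": 69.74,
        "manual": 14.47,
        "hybrid": 10.53,
        "machine learning": 5.26,
    },
    "mapping target": {
        "information extraction": 38.94,
        "classification feature": 23.01,
        "data normalization": 20.35,
    },
}


def implied_count(percent, total=TOTAL):
    """Nearest whole number of publications for a reported percentage."""
    return round(percent * total / 100)


def is_consistent(percent, total=TOTAL):
    """True if the nearest whole count reproduces the percentage at 2 decimals."""
    return round(implied_count(percent, total) * 100 / total, 2) == percent


for dimension, shares in reported.items():
    for label, pct in shares.items():
        n = implied_count(pct)
        flag = "ok" if is_consistent(pct) else "does not fit n=76"
        print(f"{dimension:16} {label:26} {pct:6.2f}%  ~{n:2d}/{TOTAL}  {flag}")
```

Under this assumption the goal and method percentages each correspond to whole counts that sum to 76, whereas the mapping-target percentages do not fit a denominator of 76. One possible reading, not confirmed by the abstract, is that a publication could be counted under more than one target.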

SNOMED CT for processing of free text in healthcare: a systematic scoping review (Preprint)

Journal of Medical Internet Research ◽

10.2196/24594 ◽

2020 ◽

Author(s):

Christophe Gaudet-Blavignac ◽

Vasiliki Foufi ◽

Mina Bjelogrlic ◽

Christian Lovis

Keyword(s):

Scoping Review ◽

Free Text ◽

Snomed Ct

Automated analysis of free-text comments and dashboard representations in patient experience surveys: a multimethod co-design study

Health Services and Delivery Research ◽

10.3310/hsdr07230 ◽

2019 ◽

Vol 7 (23) ◽

pp. 1-160

Author(s):

Carol Rivas ◽

Daria Tkacz ◽

Laurence Antao ◽

Emmanouil Mentzakis ◽

Margaret Gordon ◽

...

Keyword(s):

Health Care ◽

Patient Experience ◽

Scoping Review ◽

Health Care Professionals ◽

Technology Development ◽

Cost Benefit ◽

Rapid Review ◽

Free Text ◽

Data Sets ◽

Rule Based

Background: Patient experience surveys (PESs) often include informative free-text comments, but there is often no way of systematically, efficiently and usefully analysing and reporting them. The National Cancer Patient Experience Survey (CPES), used to model the approach reported here, generates > 70,000 free-text comments annually. Main aim: To improve the use and usefulness of PES free-text comments in driving health service changes that improve the patient experience. Secondary aims: (1) To structure CPES free-text comments using rule-based information retrieval (IR) (‘text engineering’), drawing on health-care domain-specific gazetteers of terms, with in-built transferability to other surveys and conditions; (2) to display the results usefully for health-care professionals, in a digital toolkit dashboard display that drills down to the original free text; (3) to explore the usefulness of interdisciplinary mixed stakeholder co-design and consensus-forming approaches in technology development, ensuring that outputs have meaning for all; and (4) to explore the usefulness of Normalisation Process Theory (NPT) in structuring outputs for implementation and sustainability. Design: A scoping review, rapid review and surveys with stakeholders in health care (patients, carers, health-care providers, commissioners, policy-makers and charities) explored clinical dashboard design/patient experience themes. The findings informed the rules for the draft rule-based IR [developed using half of the 2013 Wales CPES (WCPES) data set] and prototype toolkit dashboards summarising PES data. These were refined following mixed stakeholder, concept-mapping workshops and interviews, which were structured to enable consensus-forming ‘co-design’ work. IR validation used the second half of the WCPES, with comparison against its manual analysis; transferability was tested using further health-care data sets. A discrete choice experiment (DCE) explored which toolkit features were preferred by health-care professionals, with a simple cost–benefit analysis. Structured walk-throughs with NHS managers in Wessex, London and Leeds explored usability and general implementation into practice. Key outcomes: A taxonomy of ranked PES themes, a checklist of key features recommended for digital clinical toolkits, rule-based IR validation and transferability scores, usability, and goal-oriented, cost–benefit and marketability results. The secondary outputs were a survey, scoping and rapid review findings, and concordance and discordance between stakeholders and methods. Results: (1) The surveys, rapid review and workshops showed that stakeholders differed in their understandings of the patient experience and priorities for change, but that they reached consensus on a shortlist of 19 themes; six were considered to be core; (2) the scoping review and one survey explored the clinical toolkit design, emphasising that such toolkits should be quick and easy to use, and embedded in workflows; the workshop discussions, the DCE and the walk-throughs confirmed this and foregrounded other features to form the toolkit design checklist; and (3) the rule-based IR, developed using noun and verb phrases and lookup gazetteers, was 86% accurate on the WCPES, but needs modification to improve this and to be accurate with other data sets. The DCE and the walk-through suggest that the toolkit would be well accepted, with a favourable cost–benefit ratio, if implemented into practice with appropriate infrastructure support. Limitations: Small participant numbers and sampling bias across component studies. The scoping review studies mostly used top-down approaches and focused on professional dashboards. The rapid review of themes had limited scope, with no second reviewer. The IR needs further refinement, especially for transferability. New governance restrictions further limit immediate use. Conclusions: Using multidisciplinary, mixed stakeholder co-design, proof of concept was shown for an automated display of patient experience free-text comments in a way that could drive health-care improvements in real time. The approach is easily modified for transferable application. Future work: Further exploration is needed of implementation into practice, transferable uses and technology development co-design approaches. Funding: The National Institute for Health Research Health Services and Delivery Research programme.

The use of a Domain Ontology for the Management of Essential Hypertension (Preprint)

10.2196/preprints.25427 ◽

2020 ◽

Author(s):

Emma Chavez ◽

Vanessa Perez ◽

Angélica Urrutia

Keyword(s):

Decision Making ◽

Essential Hypertension ◽

Medical History ◽

Medical Records ◽

Medical Training ◽

Relevant Information ◽

Domain Ontology ◽

Free Text ◽

Snomed Ct ◽

History Of

BACKGROUND Currently, hypertension is one of the diseases with the highest risk of mortality in the world. Particularly in Chile, 90% of the population with this disease has idiopathic or essential hypertension. Essential hypertension is characterized by high blood pressure and its cause is unknown, which means that every patient might require a different treatment, depending on their history and symptoms. Different data, such as history, symptoms, exams, etc., are generated for each patient suffering from the disease. These data are presented in the patient’s medical record in no particular order, making it difficult to search for relevant information. Therefore, there is a need for a common, unified vocabulary of the terms that adequately represent the disease, making searching within the domain more effective. OBJECTIVE The objective of this study is to develop a domain ontology for essential hypertension, thereby organizing the most significant data within the domain into a tool for medical training or for supporting physicians’ decision making. METHODS The terms used for the ontology were extracted from the medical history of de-identified medical records of patients with essential hypertension. The SNOMED CT collection of medical terms and clinical guidelines for controlling the disease were also used. Methontology was used for the design, the definition of classes and their hierarchy, as well as the relationships between concepts and instances. Three criteria were used to validate the ontology, which also helped to measure its quality. Tests were run with a dataset to verify that the tool was created according to the requirements. RESULTS An ontology of 310 instances classified into 37 classes was developed. From these, 4 super classes and 30 relationships were obtained. In the dataset tests, 100% correct and coherent answers were obtained for quality tests (3). CONCLUSIONS The development of this ontology provides a tool for physicians, specialists, and students, among others, that can be incorporated into clinical systems to support decision making regarding essential hypertension. Nevertheless, more instances should be incorporated into the ontology by carrying out further searches in the medical history or free text sections of the medical records of patients with this disease.

Trialstreamer: a living, automatically updated database of clinical trial reports

10.1101/2020.05.15.20103044 ◽

2020 ◽

Author(s):

Iain J Marshall ◽

Benjamin Nye ◽

Joël Kuiper ◽

Anna Noel-Storr ◽

Rachel Marshall ◽

...

Keyword(s):

Automated System ◽

Free Text ◽

Gold Standard Method ◽

Rule Based ◽

Randomized Controlled ◽

Structured Information ◽

Combine Machine ◽

Clinical Queries ◽

Clinical Trials Registry ◽

Extract Information

Objective Randomized controlled trials (RCTs) are the gold standard method for evaluating whether a treatment works in healthcare, but can be difficult to find and make use of. We describe the development and evaluation of a system to automatically find and categorize all new RCT reports. Materials and Methods Trialstreamer continuously monitors PubMed and the WHO International Clinical Trials Registry Platform (ICTRP), looking for new RCTs in humans using a validated classifier. We combine machine learning and rule-based methods to extract information from the RCT abstracts, including free-text descriptions of trial populations, interventions and outcomes (the 'PICO'), and map these snippets to normalised MeSH vocabulary terms. We additionally identify sample sizes, predict the risk of bias, and extract text conveying key findings. We store all extracted data in a database, which we make freely available for download and via a search portal that allows users to enter structured clinical queries. Results are ranked automatically to prioritize larger and higher-quality studies. Results As of May 2020, we have indexed 669,895 publications of RCTs, of which 18,485 were published in the first four months of 2020 (144/day). We additionally include 303,319 trial registrations from ICTRP. The median trial sample size in the RCTs was 66. Conclusions We present an automated system for finding and categorising RCTs. This yields a novel resource: a database of structured information automatically extracted for all published RCTs in humans. We make daily updates of this database available on our website (trialstreamer.robotreviewer.net).

Utility of Different Data Standards to Document Adverse Drug Event Symptoms and Diagnoses: A Mixed Methods Study (Preprint)

10.2196/preprints.27188 ◽

2021 ◽

Author(s):

Erina Chan ◽

Serena S Small ◽

Maeve E Wickham ◽

Vicki Cheng ◽

Ellen Balka ◽

...

Keyword(s):

Mixed Methods ◽

Adverse Drug Event ◽

Adverse Drug Events ◽

Retrospective Chart Review ◽

Drug Event ◽

Free Text ◽

Snomed Ct ◽

Data Standards ◽

Data Standard ◽

Research Assistants

BACKGROUND Existing systems to document adverse drug events often use free text data entry, producing non-standardized, unstructured data prone to misinterpretation. Standardized terminology may improve data quality, but it is unclear which data standard is most appropriate to document adverse drug event symptoms and diagnoses. OBJECTIVE Our objective was to compare the utility, strengths, and weaknesses of different data standards for documenting adverse drug event symptoms and diagnoses. METHODS We performed a mixed-methods sub-study of a multicenter retrospective chart review. We reviewed research records of prospectively diagnosed adverse drug events at 5 Canadian hospitals. Two pharmacy research assistants independently entered symptoms and/or diagnoses for adverse drug events using 4 standards: MedDRA, SNOMED CT, SNOMED Adverse Reaction (SNOMED ADR), and ICD-11. Disagreements between research assistants regarding case-specific utility of data standards were discussed until reaching consensus. We used consensus ratings to determine the proportion of adverse drug events covered by a data standard, and coded and analyzed field notes from consensus sessions. RESULTS We reviewed 573 adverse drug events and found MedDRA and ICD-11 had excellent coverage of adverse drug event symptoms and/or diagnoses. While MedDRA had the highest number of matches between the research assistants, ICD-11 had the fewest. SNOMED ADR had the lowest proportion of adverse drug event coverage. Research assistants were most likely to encounter terminological challenges with SNOMED ADR and usability challenges with ICD-11, and least likely with MedDRA. CONCLUSIONS Usability, comprehensiveness, and accuracy are important features of data standards for documenting adverse drug event (ADE) symptoms and diagnoses. Based on our results, we would recommend the use of MedDRA.

Semantic Representation and Rule Based Patterns Discovery and Verification in eProcurement Business Processes for eGovernment

Complex, Intelligent and Software Intensive Systems - Lecture Notes in Networks and Systems ◽

10.1007/978-3-030-79725-6_67 ◽

2021 ◽

pp. 667-676

Author(s):

Beniamino Di Martino ◽

Datiana Cascone ◽

Luigi Colucci Cante ◽

Antonio Esposito

Keyword(s):

Business Processes ◽

Semantic Representation ◽

Rule Based

Free-Text Processing To Enhance Surveillance of Acute Respiratory Infections.

10.1164/ajrccm-conference.2009.179.1_meetingabstracts.a1724 ◽

2009 ◽

Author(s):

BS Kim ◽

B South ◽

M Samore ◽

S DeLisle

Keyword(s):

Respiratory Infections ◽

Text Processing ◽

Free Text ◽

Acute Respiratory Infections

Education as a strategy for managing occupational-related musculoskeletal pain: a scoping review

BMJ Open ◽

10.1136/bmjopen-2019-032668 ◽

2020 ◽

Vol 10 (2) ◽

pp. e032668 ◽

Cited By ~ 3

Author(s):

Thorvaldur Skuli Palsson ◽

Shellie Boudreau ◽

Morten Høgh ◽

Pablo Herrero ◽

Pablo Bellosta-Lopez ◽

...

Keyword(s):

Scoping Review ◽

Controlled Trial ◽

Behavioural Therapy ◽

Cochrane Library ◽

Free Text ◽

Work Related ◽

Education Material ◽

Occupational Setting ◽

Free Text Search ◽

Randomised Controlled

Background: Musculoskeletal (MSK) pain is the primary contributor to disability worldwide. There is a growing consensus that MSK pain is a recurrent multifactorial condition underpinned by health and lifestyle factors. Studies suggest that education on work-related pain and individualised advice could be essential and effective for managing persistent MSK pain. Objective: The objective of this scoping review was to map the existing educational resources for work-related MSK (WRMSK) pain, and the effects of implementing educational strategies in the workplace on managing WRMSK pain. Methods: This scoping review assessed original studies that implemented and assessed education as a strategy to manage WRMSK pain. Literature search strategies were developed using thesaurus headings (ie, MeSH and CINAHL headings) and free-text search including words related to MSK in an occupational setting. The search was carried out in PubMed, CINAHL, Cochrane Library and Web of Science in the period 12–14 February 2019. Results: A total of 19 peer-reviewed articles were included and the study design, aim and outcomes were summarised. Of the 19 peer-reviewed articles, 10 randomised controlled trial (RCT) studies assessed the influence of education on work-related MSK pain. Many studies provided a limited description of the education material and assessed/used different methods of delivery. A majority of studies concluded education positively influences work-related MSK pain. Further, some studies reported additive effects of physical activity or ergonomic adjustments. Conclusions: There is a gap in knowledge regarding the best content and delivery of education material in the workplace. Although beneficial outcomes were reported, more RCT studies are required to determine the effects of education material as compared with other interventions, such as exercise or behavioural therapy.

Extracting Structured Genotype Information from Free-Text HLA Reports Using a Rule-Based Approach

Journal of Korean Medical Science ◽

10.3346/jkms.2020.35.e78 ◽

2020 ◽

Vol 35 (12) ◽

Author(s):

Kye Hwa Lee ◽

Hyo Jung Kim ◽

Yi-Jun Kim ◽

Ju Han Kim ◽

Eun Young Song

Keyword(s):

Free Text ◽

Rule Based ◽

Genotype Information ◽

Rule Based Approach

Documentation of social determinants in electronic health records with and without standardized terminologies: A comparative study

Proceedings of Singapore Healthcare ◽

10.1177/2010105818785641 ◽

2018 ◽

Vol 28 (1) ◽

pp. 39-47 ◽

Cited By ~ 1

Author(s):

Karen A Monsen ◽

Joyce M Rudenick ◽

Nicole Kapinos ◽

Kathryn Warmbold ◽

Siobhan K McMahon ◽

...

Keyword(s):

Electronic Health Records ◽

Free Text ◽

Snomed Ct ◽

Health Records ◽

Behavioral Determinants ◽

Omaha System ◽

Standardized Terminology ◽

Electronic Health ◽

Data Elements ◽

Improve Health

Background: Electronic health records (EHRs) are a promising new source of population health data that may improve health outcomes. However, little is known about the extent to which social and behavioral determinants of health (SBDH) are currently documented in EHRs, including how SBDH are documented, and by whom. Standardized nursing terminologies have been developed to assess and document SBDH. Objective: We examined the documentation of SBDH in EHRs with and without standardized nursing terminologies. Methods: We carried out a review of the literature for SBDH phrases organized by topic, which were used for analyses. Key informant interviews were conducted regarding SBDH phrases. Results: In nine EHRs (six acute care, three community care) 107 SBDH phrases were documented using free text, structured text, and standardized terminologies in diverse screens and by multiple clinicians, admitting personnel, and other staff. SBDH phrases were documented using one of three standardized terminologies (N = average number of phrases per terminology per EHR): ICD-9/10 (N = 1); SNOMED CT (N = 1); Omaha System (N = 79). Most often, standardized terminology data were documented by nurses or other clinical staff versus receptionists or other non-clinical personnel. Documentation ‘unknown’ differed significantly between EHRs with and without the Omaha System (mean = 26.0 (standard deviation (SD) = 8.7) versus mean = 74.5 (SD = 16.5)) (p = .005). SBDH documentation in EHRs differed based on the presence of a nursing terminology. Conclusions: The Omaha System enabled a more comprehensive, holistic assessment and documentation of interoperable SBDH data. Further research is needed to determine SBDH data elements that are needed across settings, the uses of SBDH data in practice, and to examine patient perspectives related to SBDH assessments.
