A Case-Based Retrieval System Using Natural Language Processing and Population-Based Visualization

P311 Detection and characterisation of extra-intestinal manifestations of IBD in clinical office notes using natural language processing

Journal of Crohn's and Colitis ◽

10.1093/ecco-jcc/jjz203.440 ◽

2020 ◽

Vol 14 (Supplement_1) ◽

pp. S309-S310

Author(s):

R Stidham ◽

D Yu ◽

S Lahiri ◽

V Vydiswaran

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Past History ◽

Model Development ◽

Population Based ◽

Equal Weight ◽

Disease Experience ◽

Status Classification ◽

Therapeutic Decision Making

Background: Extra-intestinal manifestations (EIMs) occur in nearly 40% of patients with IBD and impact both disease experience and therapeutic decision-making, but are not well captured by administrative codes. We aimed to pilot computational natural language processing (NLP) methods to characterise EIMs using consultant notes. Methods: Subjects with a diagnosis of IBD were identified in a single-centre retrospective review of electronic health records (EHR) between 2014 and 2017. Gastroenterology (GI) notes were annotated by two reviewers for the presence and activity of EIMs. EIM concepts were identified using NLP methods leveraging UMLS libraries and hand-crafted features. EIM characterisation occurred within a ±25-word window around identified EIMs, with classifications including inactive concepts (negated, historical, resolved) and active concepts (improved, worsened, active but unchanged). When an EIM was referenced repeatedly in a document, status inference used section-based weighting, with sections ranked from greatest to least weight as assessment/plan, subjective, past history, exam, and other. EIM status was classified as ambiguous when multiple conflicting references of approximately equal weight were present within the same document. Model development and testing used an 80/20 dataset split. Results: In 4108 unique IBD patients, 1640 (39.9%) had at least 1 EIM identified. The mean age was 41.9 years, 47.2% were male, and 27.0% had biologic exposure. A total of 1240 manually annotated documents (first GI notes) comprised 51.1% arthritis, 16.5% ocular, and 16.2% psoriasis, with erythema nodosum (EN), pyoderma gangrenosum (PG), and hidradenitis suppurativa (HS) together comprising 16.2% of the cohort. NLP models performed well for correctly classifying both EIM presence and status in a testing set, with overall accuracy, sensitivity, and specificity of 91.2%, 92.9% and 81.8%, respectively, across all EIMs in notes automatically classified as non-ambiguous (Table 1). NLP methods identified EIM status classification as ambiguous in 38.9% of cases. Conclusion: NLP methods can detect and classify EIMs with reasonable performance and efficiency compared with traditional manual chart review. Though source document variation and ambiguity present challenges, NLP offers exciting possibilities for population-based research and decision support.
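
The windowing and section-weighted status inference described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives only the ranking of sections, so the numeric weights, the ambiguity margin, the choice to sum weights per status, and the names `SECTION_WEIGHTS`, `AMBIGUITY_MARGIN`, `context_window` and `infer_status` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical weights: the abstract states only the ordering
# (assessment/plan > subjective > past history > exam > other).
SECTION_WEIGHTS = {
    "assessment_plan": 5.0,
    "subjective": 4.0,
    "past_history": 3.0,
    "exam": 2.0,
    "other": 1.0,
}

# Hypothetical tolerance for "approximately equal weight".
AMBIGUITY_MARGIN = 0.5


def context_window(tokens, index, radius=25):
    """Return the tokens within +/- `radius` words of the mention at `index`."""
    start = max(0, index - radius)
    return tokens[start:index + radius + 1]


def infer_status(mentions):
    """Combine (section, status) pairs for one EIM in one document.

    Weights are summed per status (an assumption); the result is
    "ambiguous" when the two best-supported statuses are nearly tied.
    """
    if not mentions:
        return "absent"
    totals = defaultdict(float)
    for section, status in mentions:
        totals[status] += SECTION_WEIGHTS.get(section, SECTION_WEIGHTS["other"])
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] <= AMBIGUITY_MARGIN:
        return "ambiguous"
    return ranked[0][0]
```

Under these assumed weights, a mention marked active in the assessment/plan section outweighs a historical mention in past history, while two conflicting mentions in the same section type are reported as ambiguous.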

Using natural language processing for identification of herpes zoster ophthalmicus cases to support population-based study

Clinical and Experimental Ophthalmology ◽

10.1111/ceo.13340 ◽

2018 ◽

Vol 47 (1) ◽

pp. 7-14 ◽

Cited By ~ 2

Author(s):

Chengyi Zheng ◽

Yi Luo ◽

Cheryl Mercado ◽

Lina Sy ◽

Steven J Jacobsen ◽

...

Keyword(s):

Natural Language Processing ◽

Herpes Zoster ◽

Natural Language ◽

Language Processing ◽

Population Based ◽

Herpes Zoster Ophthalmicus ◽

Population Based Study

Identifying Cases of Shoulder Injury Related to Vaccine Administration (SIRVA) Using Natural Language Processing

10.1101/2021.05.05.21256555 ◽

2021 ◽

Author(s):

Chengyi Zheng ◽

Jonathan Duffy ◽

In-Lu Amy Liu ◽

Lina S. Sy ◽

Ronald A. Navarro ◽

...

Keyword(s):

Health Care ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Chart Review ◽

Population Based ◽

Care Organization ◽

Reference Standard ◽

Shoulder Injury ◽

Vaccine Administration

Background: Shoulder injury related to vaccine administration (SIRVA) accounts for more than half of all claims received by the National Vaccine Injury Compensation Program. However, there is a lack of population-based studies due to the challenge of identifying SIRVA cases in large health care databases. Objective: To develop a natural language processing (NLP) method to identify SIRVA cases from clinical notes. Methods: We conducted the study among members of a large integrated health care organization who were vaccinated between 04/1/2016 and 12/31/2017 and had subsequent diagnosis codes indicative of shoulder injury. Based on a training dataset with a chart review reference standard of 164 individuals, we developed an NLP algorithm to extract shoulder disorder information, including prior vaccination, anatomic location, temporality and causality. The algorithm identified three groups of positive SIRVA cases (definite, probable and possible) based on the strength of evidence. We compared NLP results to a chart review reference standard of 100 vaccinated individuals. We then applied the final automated NLP algorithm to a broader cohort of vaccinated individuals with a shoulder injury diagnosis code and performed manual chart confirmation on a random sample of NLP-identified definite cases and all NLP-identified probable and possible cases. Results: In the validation sample, the NLP algorithm had 100% accuracy for identifying 4 SIRVA cases and 96 individuals without SIRVA. In the broader cohort of 53,585 individuals, the NLP algorithm identified 291 definite, 124 probable, and 52 possible SIRVA cases. The chart-confirmation rates for these groups were 95.3%, 67.7% and 18.9%, respectively. Conclusions: The algorithm performed with high sensitivity and reasonable specificity in identifying positive SIRVA cases. The NLP algorithm can potentially be used in future population-based studies to identify this rare adverse event, avoiding labor-intensive chart review validation.

P3.07-013 Determining EGFR and ALK Status in a Population-Based Cancer Registry: A Natural Language Processing Validation Study

Journal of Thoracic Oncology ◽

10.1016/j.jtho.2016.11.2204 ◽

2017 ◽

Vol 12 (1) ◽

pp. S1438 ◽

Cited By ~ 1

Author(s):

Bernardo Goulart ◽

Emily Silgard ◽

Christina Baik ◽

Aasthaa Bansal ◽

Mikael Greenwood-Hickman ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Cancer Registry ◽

Validation Study ◽

Population Based

Using Case-Based Reasoning in Natural Language Processing

10.21236/ada273538 ◽

1993 ◽

Author(s):

Wendy G. Lehnert

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Case Based Reasoning ◽

Case Based

Natural language processing and information retrieval system based on BP neural network

10.1109/iciscae52414.2021.9590737 ◽

2021 ◽

Author(s):

Zeyang Zheng

Keyword(s):

Neural Network ◽

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Bp Neural Network ◽

Retrieval System ◽

Information Retrieval System

Population-Based Analysis of Histologically Confirmed Melanocytic Proliferations Using Natural Language Processing

JAMA Dermatology ◽

10.1001/jamadermatol.2017.4060 ◽

2018 ◽

Vol 154 (1) ◽

pp. 24 ◽

Cited By ~ 20

Author(s):

Jason P. Lott ◽

Denise M. Boudreau ◽

Ray L. Barnhill ◽

Martin A. Weinstock ◽

Eleanor Knopp ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Population Based

Identifying Cases of Shoulder Injury Related to Vaccine Administration (SIRVA) Using Natural Language Processing (Preprint)

10.2196/preprints.30426 ◽

2021 ◽

Author(s):

Chengyi Zheng ◽

Jonathan Duffy ◽

In-Lu Amy Liu ◽

Lina S. Sy ◽

Ronald A. Navarro ◽

...

Keyword(s):

Health Care ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Chart Review ◽

Population Based ◽

Care Organization ◽

Reference Standard ◽

Shoulder Injury ◽

Vaccine Administration

BACKGROUND Shoulder injury related to vaccine administration (SIRVA) accounts for more than half of all claims received by the National Vaccine Injury Compensation Program. However, there is a lack of population-based studies due to the challenge of identifying SIRVA cases in large health care databases. OBJECTIVE To develop a natural language processing (NLP) method to identify SIRVA cases from clinical notes. METHODS We conducted the study among members of a large integrated health care organization who were vaccinated between 04/1/2016 and 12/31/2017 and had subsequent diagnosis codes indicative of shoulder injury. Based on a training dataset with a chart review reference standard of 164 individuals, we developed an NLP algorithm to extract shoulder disorder information, including prior vaccination, anatomic location, temporality and causality. The algorithm identified three groups of positive SIRVA cases (definite, probable and possible) based on the strength of evidence. We compared NLP results to a chart review reference standard of 100 vaccinated individuals. We then applied the final automated NLP algorithm to a broader cohort of vaccinated individuals with a shoulder injury diagnosis code and performed manual chart confirmation on a random sample of NLP-identified definite cases and all NLP-identified probable and possible cases. RESULTS In the validation sample, the NLP algorithm had 100% accuracy for identifying 4 SIRVA cases and 96 individuals without SIRVA. In the broader cohort of 53,585 individuals, the NLP algorithm identified 291 definite, 124 probable, and 52 possible SIRVA cases. The chart-confirmation rates for these groups were 95.3%, 67.7% and 18.9%, respectively. CONCLUSIONS The algorithm performed with high sensitivity and reasonable specificity in identifying positive SIRVA cases. The NLP algorithm can potentially be used in future population-based studies to identify this rare adverse event, avoiding labor-intensive chart review validation.

COVID19: A Natural Language Processing and Ontology Oriented Temporal Case-Based Framework for Early Detection and Diagnosis of Novel Coronavirus

10.20944/preprints202005.0171.v1 ◽

2020 ◽

Author(s):

Olaide Nathaniel Oyelade ◽

Absalom E. Ezugwu

Keyword(s):

Natural Language Processing ◽

Early Detection ◽

Natural Language ◽

Language Processing ◽

World Health ◽

Case Based Reasoning ◽

The World ◽

Detection And Diagnosis ◽

Early Detection And Diagnosis ◽

Case Based

Coronavirus, also known as COVID-19, has been declared a pandemic by the World Health Organization (WHO). At the time of conducting this study, it had recorded over 1.6 million cases while more than 105,000 have died due to it, with these figures rising on a daily basis across the globe. The burden of this highly contagious respiratory disease is that it presents itself in both symptomatic and asymptomatic patterns in those already infected, thereby leading to an exponential rise in the number of contractions of the disease and fatalities. It is therefore crucial to expedite the process of early detection and diagnosis of the disease across the world. The case-based reasoning (CBR) model is an effective paradigm that uses specific knowledge of previously experienced, concrete problem situations or specific patient cases to solve new cases. This study therefore aims to leverage the very rich database of cases of COVID-19 to interpret and solve new cases, from their early stage to the advanced stage. The approach adopted in this study employs a natural language processing (NLP) technique to parse records of cases and thereafter formalize each case, which is represented as a mini-ontology file. The formalized case is then passed into a CBR model to allow for classification of the case as positive or negative for COVID-19. Meanwhile, feature extraction for each case is done by classifying tokens extracted by the NLP approach into spatial, temporal and thematic classes before encoding them using an ontology modeling method. The CBR model then uses the formalized features to compute the similarity of the new case with similar cases extracted from the archive of the CBR model. The proposed framework was populated with 68 cases obtained from the Italian Society of Medical and Interventional Radiology (SIRM) repository. Results obtained revealed that the proposed approach uses the location (spatial) and time (temporal) of contagion to successfully detect cases even in their early stages, from two days onward, before the fourteen-day incubation period has elapsed. The proposed framework achieved an accuracy of 97.10%, sensitivity of 0.98 and specificity of .066. The study found that the proposed model can assist physicians to easily diagnose and isolate cases, thereby minimizing the rate of contagion and reducing false diagnosis as observed in some parts of the globe.
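
The retrieval step of a case-based reasoning model, as described in this abstract, can be sketched as follows. This is a minimal illustration, not the framework from the paper: the abstract does not state the similarity measure, so Jaccard overlap averaged with equal weight across the three feature classes is an assumption, as are the names `FEATURE_CLASSES`, `jaccard`, `case_similarity` and `classify` and the nearest-case labelling rule.

```python
# Feature classes named in the abstract; each case maps a class to a set of tokens.
FEATURE_CLASSES = ("spatial", "temporal", "thematic")


def jaccard(a, b):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def case_similarity(new_case, old_case):
    """Average per-class Jaccard overlap (equal class weights are an assumption)."""
    scores = [jaccard(new_case[c], old_case[c]) for c in FEATURE_CLASSES]
    return sum(scores) / len(scores)


def classify(new_case, archive):
    """Return the label of the most similar archived case and its similarity."""
    best = max(archive, key=lambda rec: case_similarity(new_case, rec["features"]))
    return best["label"], case_similarity(new_case, best["features"])


# Toy archive with made-up placeholder features, for illustration only.
archive = [
    {"features": {"spatial": {"region_a"}, "temporal": {"day_2"},
                  "thematic": {"fever", "cough"}}, "label": "positive"},
    {"features": {"spatial": {"region_b"}, "temporal": {"day_10"},
                  "thematic": {"headache"}}, "label": "negative"},
]
new_case = {"spatial": {"region_a"}, "temporal": {"day_2"}, "thematic": {"fever"}}
label, score = classify(new_case, archive)
```

A production system would typically retrieve several similar cases and weight the feature classes differently; the single-nearest-case rule here only keeps the sketch short.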

Natural Language Processing and Enhanced Clinical Decision Making Radiology and VINCI

PsycEXTRA Dataset ◽

10.1037/e615572012-015 ◽

2012 ◽

Author(s):

Eliot Siegel

Keyword(s):

Decision Making ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Clinical Decision Making ◽

Clinical Decision
