Using natural language processing for identification of herpes zoster ophthalmicus cases to support population-based study

Chengyi Zheng; Yi Luo; Cheryl Mercado; Lina Sy; Steven J Jacobsen; Brad Ackerson; Bruno Lewin; Hung Fu Tseng

doi:10.1111/ceo.13340

Text-based Identification of Herpes Zoster Ophthalmicus with Ocular Involvement in the Electronic Health Record: A Population-based Study

Open Forum Infectious Diseases, 2021

doi:10.1093/ofid/ofaa652

Author(s): Chengyi Zheng; Lina S Sy; Hilary Tanenbaum; Yun Tian; Yi Luo; ...

Keyword(s): Herpes Zoster; Language Processing; Large Population; Population Based; Ocular Involvement; Herpes Zoster Ophthalmicus; Population Based Study; Diagnosis Codes; Increased Risk; ICD Codes

Abstract. Background: Diagnosis codes are inadequate for accurately identifying herpes zoster ophthalmicus. Manual review of medical records is expensive and time-consuming, resulting in a lack of population-based data on herpes zoster ophthalmicus. Methods: We conducted a retrospective cohort study of 87,673 patients aged ≥50 years who had a new herpes zoster (HZ) diagnosis and an associated antiviral prescription between 2010 and 2018. We developed and validated an automated natural language processing (NLP) algorithm to identify herpes zoster ophthalmicus (HZO) with ocular involvement (ocular HZO). We compared the characteristics of NLP-identified ocular HZO, nonocular HZO, and non-HZO cases among HZ patients and identified the factors associated with ocular HZO among HZ patients. Results: The NLP algorithm achieved 94.9% sensitivity and 94.2% specificity in identifying ocular HZO cases. Among 87,673 incident HZ cases, the proportion identified as ocular HZO was 9.0% (n=7,853) by NLP and 2.3% (n=1,988) by ICD codes. In adjusted analyses, older age and male sex were associated with an increased risk of ocular HZO; Hispanic and Black race/ethnicity were each associated with a lower risk of ocular HZO compared with non-Hispanic White. Conclusions: The NLP algorithm achieved high accuracy and can be used in large population-based studies to identify ocular HZO, avoiding labor-intensive chart review. Age, sex, and race were strongly associated with ocular HZO among HZ patients. We should consider these risk factors when planning for zoster vaccination.
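The proportions reported in this abstract can be reproduced from the stated counts. The following is only a small arithmetic check in Python; the variable and function names are illustrative and do not come from the study.

```python
# Counts as reported in the abstract above.
total_hz_cases = 87_673   # incident HZ cases
ocular_by_nlp = 7_853     # ocular HZO identified by the NLP algorithm
ocular_by_icd = 1_988     # ocular HZO identified by ICD codes


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


nlp_pct = percent(ocular_by_nlp, total_hz_cases)
icd_pct = percent(ocular_by_icd, total_hz_cases)

# How many times more cases NLP flagged than ICD codes alone.
nlp_to_icd_ratio = ocular_by_nlp / ocular_by_icd

print(f"NLP: {nlp_pct}%  ICD: {icd_pct}%  ratio: {nlp_to_icd_ratio:.2f}")
```

On these counts, the NLP algorithm flagged roughly four times as many ocular HZO cases as ICD codes did.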


P311 Detection and characterisation of extra-intestinal manifestations of IBD in clinical office notes using natural language processing

Journal of Crohn's and Colitis, 2020, Vol 14 (Supplement_1), pp. S309-S310

doi:10.1093/ecco-jcc/jjz203.440

Author(s): R Stidham; D Yu; S Lahiri; V Vydiswaran

Keyword(s): Natural Language Processing; Natural Language; Language Processing; Past History; Model Development; Population Based; Equal Weight; Disease Experience; Status Classification; Therapeutic Decision Making

Abstract. Background: Extra-intestinal manifestations (EIMs) occur in nearly 40% of patients with IBD and impact both disease experience and therapeutic decision-making, but are not well captured by administrative codes. We aimed to pilot computational natural language processing (NLP) methods to characterise EIMs using consultant notes. Methods: Subjects with a diagnosis of IBD were identified in a single-centre retrospective review of electronic health records (EHR) between 2014 and 2017. Gastroenterology (GI) notes were annotated by two reviewers for the presence and activity of EIMs. EIM concepts were identified using NLP methods leveraging UMLS libraries and hand-crafted features. EIM characterisation occurred within a ±25-word window around identified EIMs, with classifications including inactive concepts (negated, historical, resolved) and active concepts (improved, worsened, active but unchanged). When an EIM was referenced repeatedly in a document, its status was inferred using section-based weighting, with sections ranked from greatest to least weight as assessment/plan, subjective, past history, exam, and other. EIM status was classified as ambiguous when multiple conflicting references of approximately equal weight were present within the same document. Model development and testing used an 80/20 dataset split. Results: In 4108 unique IBD patients, 1640 (39.9%) had at least 1 EIM identified. The mean age was 41.9 years, 47.2% were male, and 27.0% had biologic exposure. The 1240 manually annotated documents (first GI notes) comprised 51.1% arthritis, 16.5% ocular, and 16.2% psoriasis, with erythema nodosum (EN), pyoderma gangrenosum (PG), and hidradenitis suppurativa (HS) together comprising 16.2% of the cohort. NLP models performed well in correctly classifying both EIM presence and status in a testing set, with overall accuracy, sensitivity, and specificity of 91.2%, 92.9% and 81.8% across all EIMs in notes automatically classified as non-ambiguous (Table 1). NLP methods identified EIM status classification as ambiguous in 38.9% of cases. Conclusion: NLP methods can detect and classify EIMs with reasonable performance and efficiency compared with traditional manual chart review. Though source document variation and ambiguity present challenges, NLP offers exciting possibilities for population-based research and decision support.
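The section-based weighting described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives only the ranking of sections, so the numeric weights, the `tolerance` used to decide "approximately equal weight", and the function name `infer_status` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical numeric weights; the abstract states only the ranking
# (assessment/plan > subjective > past history > exam > other).
SECTION_WEIGHTS = {
    "assessment_plan": 5,
    "subjective": 4,
    "past_history": 3,
    "exam": 2,
    "other": 1,
}


def infer_status(mentions, tolerance=1):
    """Infer one status for an EIM from (section, status) mentions in a document.

    Returns None when there are no mentions, "ambiguous" when the two
    best-supported statuses are within `tolerance` of each other (an assumed
    rule), and otherwise the status with the highest summed section weight.
    """
    totals = defaultdict(int)
    for section, status in mentions:
        totals[status] += SECTION_WEIGHTS.get(section, SECTION_WEIGHTS["other"])
    if not totals:
        return None
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] <= tolerance:
        return "ambiguous"
    return ranked[0][0]


clear_case = infer_status([("assessment_plan", "active"), ("past_history", "inactive")])
close_case = infer_status([("subjective", "active"), ("past_history", "inactive")])
```

In this sketch a mention in the assessment/plan section outweighs a conflicting mention in past history, while two conflicting mentions in adjacent-ranked sections are reported as ambiguous.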


Increased Risk of a Cancer Diagnosis after Herpes Zoster Ophthalmicus: A Nationwide Population-Based Study

Ophthalmology, 2011, Vol 118 (6), pp. 1076-1081

doi:10.1016/j.ophtha.2010.10.008

Cited By ~ 22

Author(s): Jau-Der Ho; Sudha Xirasagar; Herng-Ching Lin

Keyword(s): Herpes Zoster; Cancer Diagnosis; Population Based; Herpes Zoster Ophthalmicus; Population Based Study; Increased Risk


Identifying Cases of Shoulder Injury Related to Vaccine Administration (SIRVA) Using Natural Language Processing

2021

doi:10.1101/2021.05.05.21256555

Author(s): Chengyi Zheng; Jonathan Duffy; In-Lu Amy Liu; Lina S. Sy; Ronald A. Navarro; ...

Keyword(s): Health Care; Natural Language Processing; Natural Language; Language Processing; Chart Review; Population Based; Care Organization; Reference Standard; Shoulder Injury; Vaccine Administration

Background: Shoulder injury related to vaccine administration (SIRVA) accounts for more than half of all claims received by the National Vaccine Injury Compensation Program. However, there is a lack of population-based studies due to the challenge of identifying SIRVA cases in large health care databases. Objective: To develop a natural language processing (NLP) method to identify SIRVA cases from clinical notes. Methods: We conducted the study among members of a large integrated health care organization who were vaccinated between 04/1/2016 and 12/31/2017 and had subsequent diagnosis codes indicative of shoulder injury. Based on a training dataset with a chart review reference standard of 164 individuals, we developed an NLP algorithm to extract shoulder disorder information, including prior vaccination, anatomic location, temporality and causality. The algorithm identified three groups of positive SIRVA cases (definite, probable and possible) based on the strength of evidence. We compared NLP results to a chart review reference standard of 100 vaccinated individuals. We then applied the final automated NLP algorithm to a broader cohort of vaccinated individuals with a shoulder injury diagnosis code and performed manual chart confirmation on a random sample of NLP-identified definite cases and all NLP-identified probable and possible cases. Results: In the validation sample, the NLP algorithm had 100% accuracy for identifying 4 SIRVA cases and 96 individuals without SIRVA. In the broader cohort of 53,585 individuals, the NLP algorithm identified 291 definite, 124 probable, and 52 possible SIRVA cases. The chart-confirmation rates for these groups were 95.3%, 67.7% and 18.9%, respectively. Conclusions: The algorithm performed with high sensitivity and reasonable specificity in identifying positive SIRVA cases. The NLP algorithm can potentially be used in future population-based studies to identify this rare adverse event, avoiding labor-intensive chart review validation.
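The validation result in this abstract maps onto a simple confusion matrix. The Python sketch below assumes that "100% accuracy" on 4 SIRVA cases and 96 non-cases means no false positives and no false negatives; the function name `confusion_metrics` is illustrative and not taken from the study.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases detected
        "specificity": tn / (tn + fp),   # share of non-cases correctly excluded
        "accuracy": (tp + tn) / total,   # share of all individuals classified correctly
    }


# Assumption: zero false positives and zero false negatives, inferred from
# the reported 100% accuracy on 4 SIRVA cases and 96 individuals without SIRVA.
validation = confusion_metrics(tp=4, fp=0, tn=96, fn=0)
print(validation)
```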


P3.07-013 Determining EGFR and ALK Status in a Population-Based Cancer Registry: A Natural Language Processing Validation Study

Journal of Thoracic Oncology, 2017, Vol 12 (1), pp. S1438

doi:10.1016/j.jtho.2016.11.2204

Cited By ~ 1

Author(s): Bernardo Goulart; Emily Silgard; Christina Baik; Aasthaa Bansal; Mikael Greenwood-Hickman; ...

Keyword(s): Natural Language Processing; Natural Language; Language Processing; Cancer Registry; Validation Study; Population Based


Increased Risk of a Cancer Diagnosis after Herpes Zoster Ophthalmicus: A Nationwide Population-Based Study

Yearbook of Ophthalmology, 2012, Vol 2012, pp. 99-100

doi:10.1016/j.yoph.2012.03.043

Author(s): K.M. Hammersmith

Keyword(s): Herpes Zoster; Cancer Diagnosis; Population Based; Herpes Zoster Ophthalmicus; Population Based Study; Increased Risk


A Case-Based Retrieval System Using Natural Language Processing and Population-Based Visualization

2011 IEEE First International Conference on Healthcare Informatics, Imaging and Systems Biology, 2011

doi:10.1109/hisb.2011.3

Cited By ~ 1

Author(s): William Hsu; Ricky K. Taira; Fernando Vinuela; Alex A.T. Bui

Keyword(s): Natural Language Processing; Natural Language; Language Processing; Retrieval System; Population Based; Case Based


Population-Based Analysis of Histologically Confirmed Melanocytic Proliferations Using Natural Language Processing

JAMA Dermatology, 2018, Vol 154 (1), pp. 24

doi:10.1001/jamadermatol.2017.4060

Cited By ~ 20

Author(s): Jason P. Lott; Denise M. Boudreau; Ray L. Barnhill; Martin A. Weinstock; Eleanor Knopp; ...

Keyword(s): Natural Language Processing; Natural Language; Language Processing; Population Based


Identifying Cases of Shoulder Injury Related to Vaccine Administration (SIRVA) Using Natural Language Processing (Preprint)

2021

doi:10.2196/preprints.30426

Author(s): Chengyi Zheng; Jonathan Duffy; In-Lu Amy Liu; Lina S. Sy; Ronald A. Navarro; ...

Keyword(s): Health Care; Natural Language Processing; Natural Language; Language Processing; Chart Review; Population Based; Care Organization; Reference Standard; Shoulder Injury; Vaccine Administration

Background: Shoulder injury related to vaccine administration (SIRVA) accounts for more than half of all claims received by the National Vaccine Injury Compensation Program. However, there is a lack of population-based studies due to the challenge of identifying SIRVA cases in large health care databases. Objective: To develop a natural language processing (NLP) method to identify SIRVA cases from clinical notes. Methods: We conducted the study among members of a large integrated health care organization who were vaccinated between 04/1/2016 and 12/31/2017 and had subsequent diagnosis codes indicative of shoulder injury. Based on a training dataset with a chart review reference standard of 164 individuals, we developed an NLP algorithm to extract shoulder disorder information, including prior vaccination, anatomic location, temporality and causality. The algorithm identified three groups of positive SIRVA cases (definite, probable and possible) based on the strength of evidence. We compared NLP results to a chart review reference standard of 100 vaccinated individuals. We then applied the final automated NLP algorithm to a broader cohort of vaccinated individuals with a shoulder injury diagnosis code and performed manual chart confirmation on a random sample of NLP-identified definite cases and all NLP-identified probable and possible cases. Results: In the validation sample, the NLP algorithm had 100% accuracy for identifying 4 SIRVA cases and 96 individuals without SIRVA. In the broader cohort of 53,585 individuals, the NLP algorithm identified 291 definite, 124 probable, and 52 possible SIRVA cases. The chart-confirmation rates for these groups were 95.3%, 67.7% and 18.9%, respectively. Conclusions: The algorithm performed with high sensitivity and reasonable specificity in identifying positive SIRVA cases. The NLP algorithm can potentially be used in future population-based studies to identify this rare adverse event, avoiding labor-intensive chart review validation.


Natural Language Processing and Enhanced Clinical Decision Making Radiology and VINCI

PsycEXTRA Dataset, 2012

doi:10.1037/e615572012-015

Author(s): Eliot Siegel

Keyword(s): Decision Making; Natural Language Processing; Natural Language; Language Processing; Clinical Decision Making; Clinical Decision
