Development of a Semi-Automated Chart Review for Assessing the Development of Radiation Pneumonitis: using Natural Language Processing (Preprint)

10.2196/preprints.29241 ◽

2021 ◽

Author(s):

Jordan McKenzie ◽

Rasika Rajapakshe ◽

Hua Shen ◽

Shan Rajapakshe ◽

Angela Lin

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Chart Review ◽

Radiation Pneumonitis ◽

Efficient Manner ◽

Time Investment ◽

Curative Radiotherapy ◽

Cancer Agency ◽

Manual Review

BACKGROUND Health research frequently requires manual chart review to identify patients for the study-specific cohort and examine their clinical outcomes. Manual chart review is a labour-intensive process requiring significant time investment for clinical researchers. OBJECTIVE This study aimed to evaluate the feasibility and accuracy of an assisted chart review program, using an in-house natural language processing (NLP) program, to identify patients who developed radiation pneumonitis (RP) after receiving curative radiotherapy. METHODS A retrospective manual chart review was completed for patients who received curative radiotherapy for stage II-III lung cancer from January 1, 2013 to December 31, 2015 at BC Cancer Kelowna. In the manual chart review, RP diagnosis and grading were recorded using Common Terminology Criteria for Adverse Events (CTCAE) v5.0. From the charts of 50 sample patients, a total of 1413 clinical documents were extracted for review from the Cancer Agency Information System (CAIS). The NLP program was built using the Natural Language Toolkit Python platform and run with Python version 3.7.2. The output of the NLP program is a list of the full sentences containing the key terms, together with the IDs and dates of the documents from which these sentences were extracted. The result of the manual review was used as the gold standard against which the result of the NLP program was compared. RESULTS Twenty-five out of the 50 sample patients developed RP grade 1 or greater; the NLP program was able to ascertain 23 out of these 25 patients (sensitivity = 0.92, 95% CI: 0.74-0.99; specificity = 0.36, 95% CI: 0.18-0.57). Furthermore, the NLP program was able to correctly identify all 9 patients with RP grade 2 or greater, that is, patients with clinically significant symptoms (sensitivity = 1.0, 95% CI: 0.66-1.0; specificity = 0.27, 95% CI: 0.14-0.43). The NLP program was useful in distinguishing patients with RP from those without RP. The NLP program in this study would avoid unnecessary manual review of 22% of the sample patients (n=11), as these patients were identified as RP grade 0 and will not require further manual review in subsequent studies. CONCLUSIONS This feasibility study showed that the NLP program was able to assist with the identification of patients who developed RP after curative radiotherapy. The NLP program streamlines the manual chart review further by identifying key sentences of interest. This work has the potential to improve future clinical research, as the NLP program shows promise in performing chart review in a more time-efficient manner than the traditional labour-intensive manual chart review.
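The reported accuracy figures can be re-derived from the counts in the abstract. The sketch below is illustrative only and is not the authors' code. It assumes the grade 1 or greater confusion matrix is TP = 23, FN = 2, TN = 9, FP = 16, where TN and FP are back-calculated from the reported specificity of 0.36 over the 25 patients without RP, and it assumes the intervals are exact (Clopper-Pearson) binomial intervals, which the abstract does not state. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson two-sided interval for k successes in n trials."""

    def solve(f, target):
        # f is monotonically decreasing in p on [0, 1]; bisect for f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# RP grade >= 1. TP and FN come from the abstract; TN and FP are an
# assumption back-calculated from specificity = 0.36 over 25 patients.
tp, fn = 23, 2
tn, fp = 9, 16

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
sens_ci = exact_ci(tp, tp + fn)
spec_ci = exact_ci(tn, tn + fp)

# Charts the NLP step flags as RP grade 0 need no manual review.
nlp_negative = tn + fn
share_skipped = nlp_negative / (tp + fn + tn + fp)

print(f"sensitivity = {sensitivity:.2f}, CI = {sens_ci[0]:.2f}-{sens_ci[1]:.2f}")
print(f"specificity = {specificity:.2f}, CI = {spec_ci[0]:.2f}-{spec_ci[1]:.2f}")
print(f"charts skipped = {nlp_negative} ({share_skipped:.0%})")
```

Under these assumptions the 11 NLP-negative charts (9 true negatives plus 2 missed cases) account for the 22% reduction in manual review reported above.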

Measuring Adoption of Patient Priorities-Aligned Care Using Natural Language Processing

Innovation in Aging ◽

10.1093/geroni/igaa057.592 ◽

2020 ◽

Vol 4 (Supplement_1) ◽

pp. 183-183

Author(s):

Javad Razjouyan ◽

Jennifer Freytag ◽

Edward Odom ◽

Lilian Dindo ◽

Aanand Naik

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Chart Review ◽

Group Analysis ◽

Intervention Group ◽

Multiple Chronic Conditions ◽

Free Text ◽

Term Care

Abstract Patient Priorities Care (PPC) is a model of care that aligns health care recommendations with priorities of older adults with multiple chronic conditions. Social workers (SW), after online training, document PPC in the patient’s electronic health record (EHR). Our goal is to identify free-text notes with PPC language using a natural language processing (NLP) model and to measure PPC adoption and effect on long-term services and support (LTSS) use. Free-text notes from the EHR produced by trained SWs passed through a hybrid NLP model that utilized rule-based and statistical machine learning. NLP accuracy was validated against chart review. Patients who received PPC were propensity matched with patients not receiving PPC (control) on age, gender, BMI, Charlson comorbidity index, facility and SW. The change in LTSS utilization over 6-month intervals was compared between groups with univariate analysis. Chart review indicated that 491 notes out of 689 had PPC language, and the NLP model reached a precision of 0.85, a recall of 0.90, an F1 of 0.87, and an accuracy of 0.91. Within-group analysis shows that the intervention group used LTSS 1.8 times more in the 6 months after the encounter compared to the 6 months prior. Between-group analysis shows that the intervention group had significantly higher LTSS utilization (p=0.012). An automated NLP model can be used to reliably measure the adoption of PPC by SWs. PPC seems to encourage use of LTSS that may delay time to long-term care placement.
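As a quick consistency check on the figures above, the F1 score is the harmonic mean of precision and recall. The sketch below only recomputes it from the two reported values; the function name `f1_score` is a hypothetical helper, not something taken from the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract.
precision, recall = 0.85, 0.90
f1 = f1_score(precision, recall)

# Share of chart-reviewed notes that contained PPC language.
ppc_share = 491 / 689

print(f"F1 = {f1:.2f}")
print(f"notes with PPC language = {ppc_share:.1%}")
```

The recomputed F1 rounds to the reported 0.87, so the three metrics are mutually consistent.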

Natural Language Processing Performance for the Identification of Venous Thromboembolism in an Integrated Healthcare System

Clinical and Applied Thrombosis/Hemostasis ◽

10.1177/10760296211013108 ◽

2021 ◽

Vol 27 ◽

pp. 107602962110131

Author(s):

Bela Woller ◽

Austin Daw ◽

Valerie Aston ◽

Jim Lloyd ◽

Greg Snow ◽

...

Keyword(s):

Natural Language Processing ◽

Venous Thromboembolism ◽

Natural Language ◽

Medical Record ◽

Electronic Medical Record ◽

Ct Angiography ◽

Language Processing ◽

Predictive Value ◽

Chart Review ◽

Imaging Studies

Real-time identification of venous thromboembolism (VTE), defined as deep vein thrombosis (DVT) and pulmonary embolism (PE), can inform a healthcare organization’s understanding of these events and be used to improve care. In a former publication, we reported the performance of an electronic medical record (EMR) interrogation tool that employs natural language processing (NLP) of imaging studies for the diagnosis of venous thromboembolism. Because we transitioned from the legacy electronic medical record to the Cerner product, iCentra, we now report the operating characteristics of the NLP EMR interrogation tool in the new EMR environment. Two hundred randomly selected patient encounters for which the imaging report, as assessed by NLP, revealed that VTE was present were reviewed. These included one hundred imaging studies for which PE was identified: computed tomography pulmonary angiography (CTPA), ventilation perfusion (V/Q) scan, and CT angiography of the chest/abdomen/pelvis. One hundred randomly selected comprehensive ultrasound (CUS) studies that identified DVT were also obtained. For comparison, one hundred patient encounters in which PE was suspected and imaging was negative for PE (CTPA or V/Q) and 100 cases of suspected DVT with negative CUS as reported by NLP were also selected. Manual chart review of the 400 charts was performed and we report the sensitivity, specificity, positive and negative predictive values of NLP compared with manual chart review. NLP and manual review agreed on the presence of PE in 99 of 100 cases, the presence of DVT in 96 of 100 cases, the absence of PE in 99 of 100 cases and the absence of DVT in all 100 cases. When compared with manual chart review, NLP interrogation of CUS, CTPA, CT angiography of the chest, and V/Q scan yielded a sensitivity = 93.3%, specificity = 99.6%, positive predictive value = 97.1%, and negative predictive value = 99%.

SAT-LB111 Improving Classification of Diabetes Etiology in Electronic Resources Using Phenotype Algorithms and Polygenic Risk Scores

Journal of the Endocrine Society ◽

10.1210/jendso/bvaa046.2239 ◽

2020 ◽

Vol 4 (Supplement_1) ◽

Author(s):

Lina Sulieman ◽

Jing He ◽

Robert Carroll ◽

Lisa Bastarache ◽

Andrea Ramirez

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Chart Review ◽

Risk Scores ◽

P Value ◽

Learning Approaches ◽

Data Types ◽

Electronic Health

Abstract Electronic Health Records (EHR) contain rich data to identify and study diabetes. Many phenotype algorithms have been developed to identify research subjects with type 2 diabetes (T2D), but very few accurately identify type 1 diabetes (T1D) cases or rarer forms of monogenic and atypical metabolic presentations. Polygenic risk scores (PRS) quantify disease risk well using common genomic variants for both T1D and T2D. In this study, we apply validated phenotyping algorithms to EHRs linked to a genomic biobank to understand the independent contribution of PRS to classification of diabetes etiology and generate additional novel markers to distinguish subtypes of diabetes in EHR data. Using a de-identified mirror of a medical center’s electronic health record, we applied published algorithms for T1D and T2D to identify cases, and used natural language processing and chart review strategies to identify cases of maturity onset diabetes of the young (MODY) and other rarer presentations. This novel approach included additional data types such as medication sequencing, ratio and temporality of insulin and non-insulin agents, clinical genetic testing, and ratios of diagnostic codes. Chart review was performed to validate etiology. To calculate PRS, we used genome-wide genotyping from our BioBank, the de-identified biobank linking EHR to genomic data, with coefficients of 65 published T1D SNPs and 76,996 T2D SNPs, using PLINK in Caucasian subjects. In the dataset, we identified 82,238 cases of T2D but only 130 cases of T1D using the most cited published algorithms. Adding novel structured elements and natural language processing identified an additional 138 cases of T1D and distinguished 354 cases as MODY. Among over 90,000 subjects with genotyping data available, we included 72,624 Caucasian subjects since PRS coefficients were generated in Caucasian cohorts. Among those subjects, 248, 6,488, and 21 subjects were identified as T1D, T2D, and MODY subjects respectively in our final PRS cohort. The T1D PRS discriminated significantly between cases and controls (Mann-Whitney p-value = 3.4e-17). The PRS for T2D did not significantly discriminate between cases and controls using published algorithms. The atypical case count was too low to calculate PRS discrimination. Calculation of the PRS score was limited by quality inclusion of variants available, and discrimination may improve in larger data sets. Additionally, blinded physician case review is ongoing to validate the novel classification scheme and provide a gold standard for machine learning approaches that can be applied in validation sets.

Identifying Bladder Cancer Stage And Use Of Chemotherapy In The Electronic Medical Record: How Reliable Is Natural Language Processing?

Proceedings of IMPRS ◽

10.18060/25780 ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Nirupama Devanathan ◽

David Haggstrom ◽

Clint Cary

Keyword(s):

Bladder Cancer ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Chart Review ◽

Clinical Stage ◽

Treatment Algorithm ◽

Clinical Staging ◽

Muscle Invasive ◽

Line Of Therapy

Background: Large automated electronic medical record (EMR) databases, together with natural language processing (NLP) algorithms, have the potential to be valuable tools in studying the patterns and effectiveness of treatment. Therefore, the current study sought to develop novel tools to identify bladder cancer cases, their clinical stage, and the chemotherapy they receive in electronic medical records. Methods: EMR data were obtained from Indiana University Health hospitals from 2008 to 2015. We developed 2 novel algorithms using natural language processing (NLP) on unstructured data to identify (a) bladder cancer cases and clinical stage, and (b) chemotherapy names and line of chemotherapy. The sensitivity, specificity, PPV, and NPV for the clinical staging and treatment algorithm were calculated against the gold standard of manual chart review. Results: A total of 2,559 unique bladder cancer patients were identified and stratified using the clinical staging algorithm, defined as metastatic, muscle invasive, or non-muscle invasive. We identified 657 metastatic cases, 567 muscle invasive cases, and 604 non-muscle invasive cases. Further, we calculated the PPV for metastatic cases as 69.9%, muscle invasive as 80.4%, and non-muscle invasive as 79.1%. Next, the treatment algorithm was applied to metastatic patients to identify the type of chemotherapy received and 1st or 2nd line of therapy. The PPVs for identifying the 1st and 2nd lines were 70.5% and 55.6%, respectively. The PPV for gemcitabine/carboplatin or cisplatin was 57.5%, but for methotrexate, vinblastine, doxorubicin, and cisplatin it was 37.5%. Conclusion and Potential Impact: The performance of the algorithm demonstrates the potential for NLP to identify cancer cases, stage, and presence of treatment. While providing meaningful information, the accuracy of the approach suggests that a hybrid method using both NLP algorithms and manual chart review remains the most robust approach. The low performance of the algorithm in identifying line of therapy highlights the need for further NLP development in this area and emphasizes the ongoing need for either human entry or review of structured data.

Automated chart review utilizing natural language processing algorithm for asthma predictive index

BMC Pulmonary Medicine ◽

10.1186/s12890-018-0593-9 ◽

2018 ◽

Vol 18 (1) ◽

Cited By ~ 12

Author(s):

Harsheen Kaur ◽

Sunghwan Sohn ◽

Chung-Il Wi ◽

Euijung Ryu ◽

Miguel A. Park ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Chart Review ◽

Processing Algorithm ◽

Predictive Index ◽

Natural Language Processing Algorithm

Identifying Cases of Shoulder Injury Related to Vaccine Administration (SIRVA) Using Natural Language Processing

10.1101/2021.05.05.21256555 ◽

2021 ◽

Author(s):

Chengyi Zheng ◽

Jonathan Duffy ◽

In-Lu Amy Liu ◽

Lina S. Sy ◽

Ronald A. Navarro ◽

...

Keyword(s):

Health Care ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Chart Review ◽

Population Based ◽

Care Organization ◽

Reference Standard ◽

Shoulder Injury ◽

Vaccine Administration

Background: Shoulder injury related to vaccine administration (SIRVA) accounts for more than half of all claims received by the National Vaccine Injury Compensation Program. However, there is a lack of population-based studies due to the challenge of identifying SIRVA cases in large health care databases. Objective: To develop a natural language processing (NLP) method to identify SIRVA cases from clinical notes. Methods: We conducted the study among members of a large integrated health care organization who were vaccinated between 04/01/2016 and 12/31/2017 and had subsequent diagnosis codes indicative of shoulder injury. Based on a training dataset with a chart review reference standard of 164 individuals, we developed an NLP algorithm to extract shoulder disorder information, including prior vaccination, anatomic location, temporality and causality. The algorithm identified three groups of positive SIRVA cases (definite, probable and possible) based on the strength of evidence. We compared NLP results to a chart review reference standard of 100 vaccinated individuals. We then applied the final automated NLP algorithm to a broader cohort of vaccinated individuals with a shoulder injury diagnosis code and performed manual chart confirmation on a random sample of NLP-identified definite cases and all NLP-identified probable and possible cases. Results: In the validation sample, the NLP algorithm had 100% accuracy for identifying 4 SIRVA cases and 96 individuals without SIRVA. In the broader cohort of 53,585 individuals, the NLP algorithm identified 291 definite, 124 probable, and 52 possible SIRVA cases. The chart-confirmation rates for these groups were 95.3%, 67.7% and 18.9%, respectively. Conclusions: The algorithm performed with high sensitivity and reasonable specificity in identifying positive SIRVA cases. The NLP algorithm can potentially be used in future population-based studies to identify this rare adverse event, avoiding labor-intensive chart review validation.

Application of a Natural Language Processing Algorithm to Asthma Ascertainment. An Automated Chart Review

American Journal of Respiratory and Critical Care Medicine ◽

10.1164/rccm.201610-2006oc ◽

2017 ◽

Vol 196 (4) ◽

pp. 430-437 ◽

Cited By ~ 25

Author(s):

Chung-Il Wi ◽

Sunghwan Sohn ◽

Mary C. Rolfes ◽

Alicia Seabright ◽

Euijung Ryu ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Chart Review ◽

Processing Algorithm ◽

Natural Language Processing Algorithm

Women screened for breast cancer are dying from lung cancer: An opportunity to improve lung cancer screening in a mammography population

Journal of Medical Screening ◽

10.1177/09691413211013058 ◽

2021 ◽

pp. 096914132110130

Author(s):

Kim L Sandler ◽

Diane N Haddad ◽

Alexis B Paulson ◽

Travis J Osterman ◽

Carolyn C Scott ◽

...

Keyword(s):

Breast Cancer ◽

Lung Cancer ◽

Natural Language Processing ◽

Cancer Screening ◽

Natural Language ◽

Language Processing ◽

Chart Review ◽

Lung Cancer Screening ◽

Smoking History ◽

Lung Screening

Objective Lung cancer is the leading cancer killer in women, resulting in more deaths than breast, cervical and ovarian cancer combined. Screening for lung cancer has been shown to significantly reduce mortality, with some evidence that women may have a greater benefit. This study demonstrates that a population of women being screened for breast cancer may greatly benefit from screening for lung cancer. Methods Data from 18,040 women who were screened for breast cancer in 2015 at two imaging facilities that also performed lung screening were reviewed. A natural language processing algorithm followed by a manual chart review identified women eligible for lung cancer screening by U.S. Preventive Services Task Force (USPSTF) criteria. A chart review of these eligible women was performed to determine subsequent enrollment in a lung screening program (2016–2019), current screening eligibility, cancer diagnoses and cancer-related outcomes. Results Natural language processing identified 685 women undergoing screening mammography who were also potentially eligible for lung screening based on age and smoking history. Manual chart review confirmed 251 were eligible under USPSTF criteria. By June 2019, 63 (25%) had enrolled in lung screening, of whom three were diagnosed with screening-detected lung cancer, resulting in zero deaths. Of the 188 not screened, seven were diagnosed with lung cancer, resulting in five deaths by study end. Four women received a diagnosis of breast cancer with no deaths. Conclusion Women screened for breast cancer are dying from lung cancer. We must capitalize on reducing barriers to improve screening for lung cancer among high-risk women.

117. Natural Language Processing: An Automated Alternative to Determining Inappropriate Group A Streptococcal Testing

Open Forum Infectious Diseases ◽

10.1093/ofid/ofaa439.162 ◽

2020 ◽

Vol 7 (Supplement_1) ◽

pp. S72-S72

Author(s):

Brian R Lee ◽

Alaina Linafelter ◽

Alaina Burns ◽

Allison Burris ◽

Heather Jones ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Sore Throat ◽

Language Processing ◽

Gold Standard ◽

Chart Review ◽

Clinical Symptoms ◽

Group A ◽

Sensitivity Specificity

Abstract Background Acute pharyngitis is one of the most common causes of pediatric health care visits, accounting for approximately 12 million ambulatory care visits each year. Rapid antigen detection tests (RADTs) for Group A Streptococcus (GAS) are one of the most commonly ordered tests in the ambulatory settings. Approximately 40–60% of RADTs are estimated to be inappropriate. Determination of inappropriate RADT frequently requires time-intensive chart reviews. The purpose of this study was to determine if natural language processing (NLP) can provide an accurate and automated alternative for assessing RADT inappropriateness. Methods Patients ≥ 3 years of age who received an RADT while evaluated in our EDs/UCCs between April 2018 and September 2018 were identified. A manual chart review was completed on a 10% random sample to determine the presence of sore throat or viral symptoms (i.e., conjunctivitis, rhinorrhea, cough, diarrhea, hoarse voice, and viral exanthema). Inappropriate RADT was defined as either absence of sore throat or reporting 2 or more viral symptoms. An NLP algorithm was developed independently to assign the presence/absence of symptoms and RADT inappropriateness. The NLP sensitivity/specificity was calculated using the manual chart review sample as the gold standard. Results Manual chart review was completed on 720 patients, of which 320 (44.4%) were considered to have an inappropriate RADT. When compared to the manual review, the NLP approach showed high sensitivity (se) and specificity (sp) when assigning inappropriateness (88.4% and 90.0%, respectively). Optimal sensitivity/specificity was also observed for select symptoms, including sore throat (se: 92.9%, sp: 92.5%), cough (se: 94.5%, sp: 96.5%), and rhinorrhea (se: 86.1%, sp: 95.3%). The prevalence of clinical symptoms was similar when running NLP on subsequent, independent validation sets. After validating the NLP algorithm, a long-term monthly trend report was developed. [Figure: Inappropriate GAS RADTs Determined by NLP, June 2018-May 2020] Conclusion An NLP algorithm can accurately identify inappropriate RADT when compared to a gold standard. Manual chart review requires dozens of hours to complete. In contrast, NLP requires only a couple of minutes and offers the potential to calculate valid metrics that are easily scaled up to help monitor comprehensive, long-term trends. Disclosures Brian R. Lee, MPH, PhD, Merck (Grant/Research Support)
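The inappropriateness definition in the Methods is a simple rule that can be written down directly. The sketch below is illustrative and is not the study's NLP algorithm: it assumes symptom presence has already been extracted from the note, and the names `VIRAL_SYMPTOMS` and `radt_inappropriate` are hypothetical.

```python
from typing import Iterable

# Viral symptoms listed in the abstract's Methods section.
VIRAL_SYMPTOMS = frozenset(
    {
        "conjunctivitis",
        "rhinorrhea",
        "cough",
        "diarrhea",
        "hoarse voice",
        "viral exanthema",
    }
)


def radt_inappropriate(sore_throat: bool, symptoms: Iterable[str]) -> bool:
    """Apply the stated rule: no sore throat, or 2 or more viral symptoms."""
    viral_count = len(VIRAL_SYMPTOMS & set(symptoms))
    return (not sore_throat) or viral_count >= 2


# Hypothetical encounters, for illustration only.
print(radt_inappropriate(True, ["cough"]))                # one viral symptom
print(radt_inappropriate(True, ["cough", "rhinorrhea"]))  # two viral symptoms
print(radt_inappropriate(False, []))                      # no sore throat

# Share of manually reviewed patients with an inappropriate RADT.
inappropriate_share = 320 / 720
print(f"{inappropriate_share:.1%}")
```

Keeping the rule separate from the symptom extraction step means the extraction accuracy (the per-symptom sensitivity and specificity above) and the classification rule can be checked independently.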
