Comparison of Natural Language Processing and Manual Coding for the Identification of Cross-Sectional Imaging Reports Suspicious for Lung Cancer

JCO Clinical Cancer Informatics ◽

10.1200/cci.17.00069 ◽

2018 ◽

pp. 1-7 ◽

Cited By ~ 3

Author(s):

Roxanne Wadia ◽

Kathleen Akgun ◽

Cynthia Brandt ◽

Brenda T. Fenton ◽

Woody Levin ◽

...

Keyword(s):

Lung Cancer ◽

Natural Language Processing ◽

Natural Language ◽

Negative Predictive Value ◽

Language Processing ◽

Predictive Value ◽

Cross Sectional ◽

Predictive Values ◽

Clinical Text

Purpose To compare the accuracy and reliability of a natural language processing (NLP) algorithm with manual coding by radiologists, and the combination of the two methods, for the identification of patients whose computed tomography (CT) reports raised the concern for lung cancer. Methods An NLP algorithm was developed using Clinical Text Analysis and Knowledge Extraction System (cTAKES) with the Yale cTAKES Extensions and trained to differentiate between language indicating benign lesions and lesions concerning for lung cancer. A random sample of 450 chest CT reports performed at Veterans Affairs Connecticut Healthcare System between January 2014 and July 2015 was selected. A reference standard was created by the manual review of reports to determine if the text stated that follow-up was needed for concern for cancer. The NLP algorithm was applied to all reports and compared with case identification using the manual coding by the radiologists. Results A total of 450 reports representing 428 patients were analyzed. NLP had higher sensitivity and lower specificity than manual coding (77.3% v 51.5% and 72.5% v 82.5%, respectively). NLP and manual coding had similar positive predictive values (88.4% v 88.9%), and NLP had a higher negative predictive value than manual coding (54% v 38.5%). When NLP and manual coding were combined, sensitivity increased to 92.3%, with a decrease in specificity to 62.85%. Combined NLP and manual coding had a positive predictive value of 87.0% and a negative predictive value of 75.2%. Conclusion Our NLP algorithm was more sensitive than manual coding of CT chest reports for the identification of patients who required follow-up for suspicion of lung cancer. The combination of NLP and manual coding is a sensitive way to identify patients who need further workup for lung cancer.
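The four measures reported above, and the effect of combining two case-finding methods, can be sketched in a few lines of Python. This is a minimal illustration, not the study's code. The abstract does not say how the two methods were combined, so the "flag a report if either method flags it" rule is an assumption, chosen because it is consistent with sensitivity rising and specificity falling. The function names and the ten-report toy data set are hypothetical.

```python
from typing import Dict, List, Sequence


def diagnostic_metrics(truth: Sequence[bool], flagged: Sequence[bool]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV of `flagged` against `truth`.

    Assumes each denominator is non-zero (at least one true positive case,
    one true negative case, one flagged report and one unflagged report).
    """
    pairs = list(zip(truth, flagged))
    tp = sum(1 for t, f in pairs if t and f)          # cancer concern, flagged
    fn = sum(1 for t, f in pairs if t and not f)      # cancer concern, missed
    fp = sum(1 for t, f in pairs if not t and f)      # benign, flagged
    tn = sum(1 for t, f in pairs if not t and not f)  # benign, not flagged
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def combine_either(a: Sequence[bool], b: Sequence[bool]) -> List[bool]:
    """Assumed combination rule: flag a report if either method flags it."""
    return [x or y for x, y in zip(a, b)]


# Hypothetical toy data: 6 reports with concern for cancer, 4 benign.
truth = [True, True, True, True, True, True, False, False, False, False]
nlp = [True, True, True, True, False, False, True, False, False, False]
manual = [True, True, False, False, True, False, False, True, False, False]

nlp_metrics = diagnostic_metrics(truth, nlp)
manual_metrics = diagnostic_metrics(truth, manual)
combined_metrics = diagnostic_metrics(truth, combine_either(nlp, manual))

for name, m in [("NLP", nlp_metrics), ("manual", manual_metrics), ("combined", combined_metrics)]:
    print(name, {k: round(v, 3) for k, v in m.items()})
```

On the toy data, the combined rule catches more of the true cases than either method alone while flagging more benign reports, which is the same direction of trade-off the abstract reports.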

Natural Language Processing Systems for Diagnosing and Determining Level of Lung Cancer: A Systematic Review

Frontiers in Health Informatics ◽

10.30699/fhi.v10i1.264 ◽

2021 ◽

Vol 10 (1) ◽

pp. 68

Author(s):

Mahdieh Montazeri ◽

Ali Afraz ◽

Raheleh Mahboob Farimani ◽

Fahimeh Ghasemian

Keyword(s):

Lung Cancer ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

English Language ◽

Lung Cancer Patients ◽

The Difference ◽

Combination Of Methods ◽

Manual Extraction

Introduction: Lung cancer is the second most common cancer in men and women. Using natural language processing (NLP) to automatically extract information from text reduces the labor of manual extraction from large volumes of text and saves time. The aim of this study is to systematically review studies that examined NLP methods for diagnosing and staging lung cancer. Material and Methods: PubMed, Scopus, Web of Science, and Embase were searched for English-language articles, published up to December 2019, that reported methods for diagnosing and staging lung cancer using NLP. Two reviewers independently assessed original papers to determine eligibility for inclusion in the review. Results: Of 119 studies, 7 were included. Three studies developed an NLP algorithm to scan radiology notes and determine the presence or absence of nodules, in order to identify patients with incident lung nodules for treatment or follow-up. Two studies used NLP to transform the report text, including identification of UMLS terms and detection of negated findings, in order to classify reports; one of them also used an SVM-based text classification system for staging lung cancer patients. All studies reported various performance measures based on differences between combinations of methods. Most studies reported the sensitivity and specificity of the NLP algorithm for identifying the presence of lung nodules. Conclusion: The evaluation of studies on diagnosing and staging lung cancer using NLP shows that there are a number of studies on diagnosis but only a few on staging. In some studies, a combination of methods was considered, and NLP alone was not sufficient to achieve satisfactory results. There is potential to improve studies by adding other data sources, further refinement, and subsequent validation.

A deep database of medical abbreviations and acronyms for natural language processing

Scientific Data ◽

10.1038/s41597-021-00929-4 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Lisa Grossman Liu ◽

Raymond H. Grossman ◽

Elliot G. Mitchell ◽

Chunhua Weng ◽

Karthik Natarajan ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

American English ◽

Substantial Improvement ◽

Future Application ◽

Multiple Sources ◽

High Coverage ◽

Clinical Text ◽

Automated Quality Control

Abstract: The recognition, disambiguation, and expansion of medical abbreviations and acronyms are of utmost importance to prevent medically dangerous misinterpretation in natural language processing. To support recognition, disambiguation, and expansion, we present the Medical Abbreviation and Acronym Meta-Inventory, a deep database of medical abbreviations. A systematic harmonization of eight source inventories across multiple healthcare specialties and settings identified 104,057 abbreviations with 170,426 corresponding senses. Automated cross-mapping of synonymous records using state-of-the-art machine learning reduced redundancy, which simplifies future application. Additional features include semi-automated quality control to remove errors. The Meta-Inventory demonstrated high completeness or coverage of abbreviations and senses in new clinical text, a substantial improvement over the next largest repository (6–14% increase in abbreviation coverage; 28–52% increase in sense coverage). To our knowledge, the Meta-Inventory is the most complete compilation of medical abbreviations and acronyms in American English to date. The multiple sources and high coverage support application in varied specialties and settings. This allows for cross-institutional natural language processing, which previous inventories did not support. The Meta-Inventory is available at https://bit.ly/github-clinical-abbreviations.

Natural Language Processing Performance for the Identification of Venous Thromboembolism in an Integrated Healthcare System

Clinical and Applied Thrombosis/Hemostasis ◽

10.1177/10760296211013108 ◽

2021 ◽

Vol 27 ◽

pp. 107602962110131

Author(s):

Bela Woller ◽

Austin Daw ◽

Valerie Aston ◽

Jim Lloyd ◽

Greg Snow ◽

...

Keyword(s):

Natural Language Processing ◽

Venous Thromboembolism ◽

Natural Language ◽

Medical Record ◽

Electronic Medical Record ◽

Ct Angiography ◽

Language Processing ◽

Predictive Value ◽

Chart Review ◽

Imaging Studies

Real-time identification of venous thromboembolism (VTE), defined as deep vein thrombosis (DVT) and pulmonary embolism (PE), can inform a healthcare organization’s understanding of these events and be used to improve care. In a former publication, we reported the performance of an electronic medical record (EMR) interrogation tool that employs natural language processing (NLP) of imaging studies for the diagnosis of venous thromboembolism. Because we transitioned from the legacy electronic medical record to the Cerner product, iCentra, we now report the operating characteristics of the NLP EMR interrogation tool in the new EMR environment. We reviewed two hundred randomly selected patient encounters for which NLP assessment of the imaging report indicated that VTE was present. These included one hundred imaging studies in which PE was identified: computed tomography pulmonary angiography (CTPA), ventilation perfusion (V/Q) scan, and CT angiography of the chest/abdomen/pelvis. One hundred randomly selected comprehensive ultrasound (CUS) studies that identified DVT were also obtained. For comparison, one hundred patient encounters in which PE was suspected and imaging (CTPA or V/Q) was negative for PE, and 100 cases of suspected DVT with negative CUS, as reported by NLP, were also selected. Manual chart review of the 400 charts was performed, and we report the sensitivity, specificity, and positive and negative predictive values of NLP compared with manual chart review. NLP and manual review agreed on the presence of PE in 99 of 100 cases, the presence of DVT in 96 of 100 cases, the absence of PE in 99 of 100 cases, and the absence of DVT in all 100 cases. When compared with manual chart review, NLP interrogation of CUS, CTPA, CT angiography of the chest, and V/Q scan yielded a sensitivity of 93.3%, a specificity of 99.6%, a positive predictive value of 97.1%, and a negative predictive value of 99%.

Systematic review of current natural language processing methods and applications in cardiology

Heart ◽

10.1136/heartjnl-2021-319769 ◽

2021 ◽

pp. heartjnl-2021-319769

Author(s):

Meghan Reading Turchioe ◽

Alexander Volodarskiy ◽

Jyotishman Pathak ◽

Drew N Wright ◽

James Enlou Tcheng ◽

...

Keyword(s):

Systematic Review ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Clinical Care ◽

Real World Data ◽

Clinical Text ◽

Clinical Notes ◽

Artery Disease ◽

Automated Methods

Natural language processing (NLP) is a set of automated methods to organise and evaluate the information contained in unstructured clinical notes, which are a rich source of real-world data from clinical care that may be used to improve outcomes and understanding of disease in cardiology. The purpose of this systematic review is to provide an understanding of NLP, review how it has been used to date within cardiology and illustrate the opportunities that this approach provides for both research and clinical care. We systematically searched six scholarly databases (ACM Digital Library, arXiv, Embase, IEEE Xplore, PubMed and Scopus) for studies published in 2015–2020 describing the development or application of NLP methods for clinical text focused on cardiac disease. Studies that were not published in English, lacked a description of NLP methods, were not focused on cardiac disease or were duplicates were excluded. Two independent reviewers extracted general study information, clinical details and NLP details and appraised quality using a checklist of quality indicators for NLP studies. We identified 37 studies developing and applying NLP in heart failure, imaging, coronary artery disease, electrophysiology, general cardiology and valvular heart disease. Most studies used NLP to identify patients with a specific diagnosis and extract disease severity using rule-based NLP methods. Some used NLP algorithms to predict clinical outcomes. A major limitation is the inability to aggregate findings across studies due to vastly different NLP methods, evaluation and reporting. This review reveals numerous opportunities for future NLP work in cardiology with more diverse patient samples, cardiac diseases, datasets, methods and applications.

Correlate: A PACS- and EHR-integrated Tool Leveraging Natural Language Processing to Provide Automated Clinical Follow-up

Radiographics ◽

10.1148/rg.2017160195 ◽

2017 ◽

Vol 37 (5) ◽

pp. 1451-1460 ◽

Cited By ~ 4

Author(s):

Mark D. Kovacs ◽

Joseph Mesterhazy ◽

David Avrin ◽

Thomas Urbania ◽

John Mongan

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing

P2.09-29 Automatic Lung Cancer Staging from Medical Reports Using Natural Language Processing

Journal of Thoracic Oncology ◽

10.1016/j.jtho.2018.08.1326 ◽

2018 ◽

Vol 13 (10) ◽

pp. S772

Author(s):

X. Sui ◽

T. Liu ◽

Q. Huang ◽

Y. Hou ◽

Y. Wang ◽

...

Keyword(s):

Lung Cancer ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Cancer Staging ◽

Medical Reports

Evaluating Specificity, Sensitivity, Positive and Negative Predictive Values of CA125 for Diagnosing Ovarian Cancer

Journal of Arak University of Medical Sciences ◽

10.32598/jams.24.2.6002.1 ◽

2021 ◽

Vol 24 (2) ◽

pp. 196-203

Author(s):

Elahe Fini ◽

Neda Nasirian ◽

Bahram Hosein Beigy ◽

...

Keyword(s):

Ovarian Cancer ◽

Positive Predictive Value ◽

Tumor Marker ◽

Negative Predictive Value ◽

Predictive Value ◽

Screening Test ◽

High Sensitivity ◽

Medical Community ◽

Cross Sectional ◽

Predictive Values

Background and Aim: Ovarian cancer is among the most common cancers in women worldwide. CA125 is the biomarker most frequently used in screening for ovarian cancer. CA125 is not regarded in the medical community as having high sensitivity and specificity as a screening test; however, because it is simple and noninvasive, it is almost always requested for evaluating and ruling out cancer. It plays an important role in the treatment and post-treatment process, the prediction of prognosis, and the detection of disease relapse. The present study aimed to determine the relationship between a high level of the CA125 tumor marker and ovarian cancer by determining its specificity, sensitivity, and positive and negative predictive values. Methods & Materials: In this cross-sectional study, all cases undergoing the CA125 test in Velayat Hospital in 2017-2018 were evaluated for ovarian cancer. In addition, the CA125 level was compared between healthy individuals and patients with ovarian cancer. Finally, the obtained data were analyzed using SPSS. Ethical Considerations: The present study was approved by the Qazvin University of Medical Sciences (Ethics Code: IR.QUMS.REC.1396.316). Results: In this study, 35.3% of the study participants received a definite diagnosis of ovarian cancer. CA125 values were negative in 41.8% and positive in 58.2% of the study subjects. The sensitivity of the test was 80.1%, the specificity 53.6%, the positive predictive value 48.4%, and the negative predictive value 83%. There was a significant relationship between the presence of ovarian cancer and both age and serum CA125 levels. Conclusion: The present study suggested that age and the serum level of CA125 were significantly related to ovarian cancer. As a tumor marker, CA125 provided moderate sensitivity and specificity, a low positive predictive value, and a high negative predictive value; it is valuable for ruling out a tumor but is not appropriate as a screening test.
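The reported predictive values can be cross-checked against the reported sensitivity, specificity, and diagnosis rate using Bayes' rule. The sketch below is an illustration only. It assumes that the 35.3% of participants with a definite diagnosis can be treated as the prevalence in the study sample, and the function name is hypothetical.

```python
from typing import Dict


def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> Dict[str, float]:
    """Derive PPV, NPV and the overall test-positive rate via Bayes' rule."""
    tp = sensitivity * prevalence              # diseased and test-positive
    fn = (1 - sensitivity) * prevalence        # diseased and test-negative
    tn = specificity * (1 - prevalence)        # healthy and test-negative
    fp = (1 - specificity) * (1 - prevalence)  # healthy and test-positive
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "test_positive_rate": tp + fp,
    }


# Values from the abstract; 35.3% is ASSUMED to be the sample prevalence.
ca125 = predictive_values(sensitivity=0.801, specificity=0.536, prevalence=0.353)
print({k: round(v, 3) for k, v in ca125.items()})
```

Under that assumption the calculation gives a PPV of about 48.5%, an NPV of about 83.2%, and a test-positive rate of about 58.3%, close to the reported 48.4%, 83%, and 58.2%. The small differences are what rounding of the published inputs would produce.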

A Strategy for Deploying Secure Cloud-Based Natural Language Processing Systems for Applied Research Involving Clinical Text

2011 44th Hawaii International Conference on System Sciences ◽

10.1109/hicss.2011.32 ◽

2011 ◽

Cited By ~ 4

Author(s):

D Carrell

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Applied Research ◽

Clinical Text

Making Natural Language Processing More Accessible for Analysis of Clinical Text

SciVee ◽

10.4016/30705.01 ◽

2011 ◽

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Clinical Text

Correction to: Qualitative Assessment of Adult Patients’ Perception of Atopic Dermatitis Using Natural Language Processing Analysis in a Cross-Sectional Study

Dermatology and Therapy ◽

10.1007/s13555-020-00362-2 ◽

2020 ◽

Vol 10 (2) ◽

pp. 307-310

Author(s):

Bruno Falissard ◽

Eric L. Simpson ◽

Emma Guttman-Yassky ◽

Kim A. Papp ◽

Sebastien Barbarot ◽

...

Keyword(s):

Atopic Dermatitis ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Cross Sectional Study ◽

Adult Patients ◽

Qualitative Assessment ◽

Sectional Study ◽

Cross Sectional
