Relation Classification for Bleeding Events from Electronic Health Records: Exploration of Deep Learning Systems (Preprint)

10.2196/preprints.27527 ◽

2021 ◽

Author(s):

Avijit Mitra ◽

Bhanu Pratap Singh Rawat ◽

David D McManus ◽

Hong Yu

Keyword(s):

Deep Learning ◽

Electronic Health Records ◽

Language Processing ◽

Bleeding Event ◽

Learning Systems ◽

Bleeding Events ◽

Convolutional Network ◽

Health Records ◽

Electronic Health ◽

Relation Classification

BACKGROUND: Accurate detection of bleeding events from electronic health records (EHRs) is crucial for identifying and characterizing a range of common and serious medical problems. To extract such information from EHRs, it is essential to identify the relations between bleeding events and related clinical entities (e.g., bleeding anatomic sites, lab tests). With the advent of natural language processing (NLP) and deep learning (DL) based techniques, many studies have examined their applicability to various clinical applications. However, no prior work has used deep learning to extract relations between bleeding events and relevant entities.

OBJECTIVE: In this study, we aim to evaluate multiple deep learning systems on a novel EHR dataset for bleeding event related relation classification.

METHODS: We first expert-annotated a new dataset of 1283 de-identified EHR notes for bleeding events and their attributes. On this dataset, we evaluated three state-of-the-art deep learning architectures on the bleeding event relation classification task: a convolutional neural network (CNN), a graph convolutional network with attention (AGGCN), and BERT-based models (BioBERT, Bio+Clinical BERT, and EhrBERT).

RESULTS: Our experiments show that the BERT-based models significantly outperformed CNN and AGGCN. Specifically, BioBERT achieved a macro F1 score of 0.842, outperforming both AGGCN (macro F1 score, 0.828) and CNN (macro F1 score, 0.763) by an absolute 1.4% (P<.001) and 7.9% (P<.001), respectively.

CONCLUSIONS: In this comprehensive study, we explored and compared different DL systems for classifying relations between bleeding events and other medical concepts. On our corpus, BERT-based models outperformed the other deep learning models at identifying the relations of bleeding related entities. The BERT-based models benefited from their pre-trained contextualized word representations and from using target entity representations rather than traditional sequence representations.
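The results above are reported as macro F1, the unweighted mean of per-class F1 scores. A minimal sketch of that metric follows, assuming one gold label and one predicted label per candidate entity pair. The relation label names (`site`, `lab`, `none`) are hypothetical and are not taken from the study's annotation scheme.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every class seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # gold class g was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical relation labels, for illustration only.
gold = ["site", "site", "lab", "none"]
pred = ["site", "lab", "lab", "none"]
print(round(macro_f1(gold, pred), 3))  # 0.778
```

Because every class counts equally, a rare relation type affects macro F1 as much as a frequent one, which is one common reason to prefer it on imbalanced relation datasets.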

Relation Classification for Bleeding Events from Electronic Health Records: Exploration of Deep Learning Systems (Preprint)

JMIR Medical Informatics ◽

10.2196/27527 ◽

2021 ◽

Author(s):

Avijit Mitra ◽

Bhanu Pratap Singh Rawat ◽

David D McManus ◽

Hong Yu

Keyword(s):

Deep Learning ◽

Electronic Health Records ◽

Learning Systems ◽

Bleeding Events ◽

Health Records ◽

Electronic Health ◽

Relation Classification

Deep learning detects and visualizes bleeding events in electronic health records

Research and Practice in Thrombosis and Haemostasis ◽

10.1002/rth2.12505 ◽

2021 ◽

Author(s):

Jannik S. Pedersen ◽

Martin S. Laursen ◽

Thiusius Rajeeth Savarimuthu ◽

Rasmus Søgaard Hansen ◽

Anne Bryde Alnor ◽

...

Keyword(s):

Deep Learning ◽

Electronic Health Records ◽

Bleeding Events ◽

Health Records ◽

Electronic Health

Advancing Clinical Research Through Natural Language Processing on Electronic Health Records: Traditional Machine Learning Meets Deep Learning

Health Informatics - Clinical Research Informatics ◽

10.1007/978-3-319-98779-8_17 ◽

2019 ◽

pp. 357-378 ◽

Cited By ~ 3

Author(s):

Feifan Liu ◽

Chunhua Weng ◽

Hong Yu

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Natural Language Processing ◽

Electronic Health Records ◽

Natural Language ◽

Clinical Research ◽

Language Processing ◽

Health Records ◽

Electronic Health

Analysis of Electronic Health Records Based on Deep Learning with Natural Language Processing

Arabian Journal for Science and Engineering ◽

10.1007/s13369-021-05596-6 ◽

2021 ◽

Author(s):

Yi-Cheng Shen ◽

Te-Chun Hsia ◽

Ching-Hsien Hsu

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Electronic Health Records ◽

Natural Language ◽

Language Processing ◽

Health Records ◽

Electronic Health

Classifying Social Determinants of Health from Unstructured Electronic Health Records Using Deep Learning-based Natural Language Processing

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2021.103984 ◽

2022 ◽

pp. 103984

Author(s):

Sifei Han ◽

Robert F. Zhang ◽

Lingyun Shi ◽

Russell Richie ◽

Haixia Liu ◽

...

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Electronic Health Records ◽

Natural Language ◽

Social Determinants Of Health ◽

Language Processing ◽

Social Determinants ◽

Determinants Of Health ◽

Health Records ◽

Electronic Health

A Multi-Center Study of COVID-19 Patient Prognosis Using Deep Learning-based CT Image Analysis and Electronic Health Records

European Journal of Radiology ◽

10.1016/j.ejrad.2021.109583 ◽

2021 ◽

pp. 109583

Author(s):

Kuang Gong ◽

Dufan Wu ◽

Chiara Daniela Arru ◽

Fatemeh Homayounieh ◽

Nir Neumark ◽

...

Keyword(s):

Image Analysis ◽

Deep Learning ◽

Electronic Health Records ◽

CT Image ◽

Health Records ◽

Electronic Health ◽

CT Image Analysis ◽

Patient Prognosis ◽

Multi Center Study ◽

Center Study

Development of algorithm for classification smoking status from unstructured bilingual electronic health records based on natural language processing (Preprint)

10.2196/preprints.26978 ◽

2021 ◽

Author(s):

Ye Seul Bae ◽

Kyung Hwan Kim ◽

Han Kyul Kim ◽

Sae Won Choi ◽

Taehoon Ko ◽

...

Keyword(s):

Natural Language Processing ◽

Electronic Health Records ◽

Natural Language ◽

Language Processing ◽

Smoking Status ◽

SVM Classifier ◽

Keyword Extraction ◽

Health Records ◽

Clinical Notes ◽

Electronic Health

BACKGROUND: Smoking is a major risk factor and an important variable for clinical research, but few studies have addressed automatically obtaining smoking classifications from unstructured bilingual electronic health records (EHRs).

OBJECTIVE: We aim to develop an algorithm that classifies smoking status from unstructured EHRs using natural language processing (NLP).

METHODS: With acronym replacement and the Python package Soynlp, we normalize 4,711 bilingual clinical notes. Each EHR note was classified into 4 categories: current smokers, past smokers, never smokers, and unknown. Subsequently, SPPMI (Shifted Positive Pointwise Mutual Information) is used to vectorize the words in the notes. By calculating the cosine similarity between these word vectors, keywords denoting the same smoking status are identified.

RESULTS: Compared to other keyword extraction methods (word co-occurrence-, PMI-, and NPMI-based methods), our proposed approach improves keyword extraction precision by as much as 20.0%. These extracted keywords are used in classifying the 4 smoking statuses from our bilingual clinical notes. Given an identical SVM classifier, the extracted keywords improve the F1 score by as much as 1.8% compared to unigram and bigram bag-of-words features.

CONCLUSIONS: Our study shows the potential of SPPMI in classifying smoking status from bilingual, unstructured EHRs. Our current findings show how smoking information can be easily acquired and used for clinical practice and research.
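The vectorization step described in the methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract gives neither the context window size nor the shift value, so `window` and `shift` below are assumed parameters, and the tokenized notes are toy data. The sketch uses the usual definition SPPMI(w, c) = max(PMI(w, c) - log(shift), 0).

```python
import math
from collections import Counter


def sppmi_vectors(docs, window=2, shift=1):
    """Build sparse SPPMI word vectors from tokenized documents.

    `window` and `shift` are assumed values; shift=1 reduces to plain PPMI.
    """
    pair_counts = Counter()
    for tokens in docs:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pair_counts[(word, tokens[j])] += 1
    total = sum(pair_counts.values())
    # The window is symmetric, so word and context marginals are identical.
    word_counts = Counter()
    for (word, _), n in pair_counts.items():
        word_counts[word] += n
    vectors = {}
    for (word, ctx), n in pair_counts.items():
        pmi = math.log(n * total / (word_counts[word] * word_counts[ctx]))
        value = pmi - math.log(shift)
        if value > 0:  # keep only positive entries (the "positive" in SPPMI)
            vectors.setdefault(word, {})[ctx] = value
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy tokenized notes, for illustration only.
notes = [["current", "smoker"], ["current", "smoker"], ["never", "smoked"]]
vectors = sppmi_vectors(notes, window=1)
print(cosine(vectors["current"], vectors["never"]))
```

Words whose vectors have high cosine similarity to a seed keyword would then be candidates for the same smoking status. How seeds and similarity thresholds were chosen is not stated in the abstract.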

Abstract PO-050: Identifying de novo stage IV breast cancer (DNIV) cases in Electronic Health Records (EHR) using natural language processing

10.1158/1557-3265.adi21-po-050 ◽

2021 ◽

Author(s):

Liwei Wang ◽

Karthik Giridhar ◽

Kimberly Corbin ◽

Brenda Ernst ◽

Sadia Choudhery ◽

...

Keyword(s):

Breast Cancer ◽

Natural Language Processing ◽

Electronic Health Records ◽

Natural Language ◽

Language Processing ◽

De Novo ◽

Stage IV ◽

Health Records ◽

Stage IV Breast Cancer ◽

Electronic Health

Deep Learning Based Temporal Information Extraction Framework on Chinese Electronic Health Records

Web Information Systems and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-030-02934-0_19 ◽

2018 ◽

pp. 203-214 ◽

Cited By ~ 2

Author(s):

Bing Tian ◽

Chunxiao Xing

Keyword(s):

Deep Learning ◽

Electronic Health Records ◽

Information Extraction ◽

Temporal Information ◽

Health Records ◽

Electronic Health ◽

Temporal Information Extraction

Desiderata for computable representations of electronic health records-driven phenotype algorithms

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocv112 ◽

2015 ◽

Vol 22 (6) ◽

pp. 1220-1230 ◽

Cited By ~ 28

Author(s):

Huan Mo ◽

William K Thompson ◽

Luke V Rasmussen ◽

Jennifer A Pacheco ◽

Guoqian Jiang ◽

...

Keyword(s):

Electronic Health Records ◽

Language Processing ◽

Clinical Decision Making ◽

Clinical Decision ◽

Relational Algebra ◽

Common Data Model ◽

Health Records ◽

Electronic Health ◽

Value Sets ◽

Text Searching

Background: Electronic health records (EHRs) are increasingly used for clinical and translational research through the creation of phenotype algorithms. Currently, phenotype algorithms are most commonly represented as noncomputable descriptive documents and knowledge artifacts that detail the protocols for querying diagnoses, symptoms, procedures, medications, and/or text-driven medical concepts, and are primarily meant for human comprehension. We present desiderata for developing a computable phenotype representation model (PheRM).

Methods: A team of clinicians and informaticians reviewed common features for multisite phenotype algorithms published in PheKB.org and existing phenotype representation platforms. We also evaluated well-known diagnostic criteria and clinical decision-making guidelines to encompass a broader category of algorithms.

Results: We propose 10 desired characteristics for a flexible, computable PheRM: (1) structure clinical data into queryable forms; (2) recommend use of a common data model, but also support customization for the variability and availability of EHR data among sites; (3) support both human-readable and computable representations of phenotype algorithms; (4) implement set operations and relational algebra for modeling phenotype algorithms; (5) represent phenotype criteria with structured rules; (6) support defining temporal relations between events; (7) use standardized terminologies and ontologies, and facilitate reuse of value sets; (8) define representations for text searching and natural language processing; (9) provide interfaces for external software algorithms; and (10) maintain backward compatibility.

Conclusion: A computable PheRM is needed for true phenotype portability and reliability across different EHR products and healthcare systems. These desiderata are a guide to inform the establishment and evolution of EHR phenotype algorithm authoring platforms and languages.
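Desiderata (4) and (7) above can be illustrated with a small sketch that expresses phenotype criteria as set operations over patient identifiers, using reusable value sets. Everything here is hypothetical: the record layout, the helper `patients_with`, the value sets, and the inclusion logic are illustrative assumptions and do not represent a validated phenotype or any PheKB algorithm.

```python
def patients_with(records, kind, codes):
    """Return the set of patient IDs with any record of `kind` whose code is in `codes`."""
    return {r["patient_id"] for r in records if r["kind"] == kind and r["code"] in codes}


# Toy records in an assumed flat layout (not a real common data model).
records = [
    {"patient_id": 1, "kind": "diagnosis", "code": "E11.9"},
    {"patient_id": 1, "kind": "medication", "code": "metformin"},
    {"patient_id": 2, "kind": "diagnosis", "code": "E11.9"},
    {"patient_id": 3, "kind": "diagnosis", "code": "E10.9"},
    {"patient_id": 3, "kind": "medication", "code": "insulin"},
]

# Hypothetical value sets, reusable across criteria.
T2DM_DX = {"E11.9"}
T1DM_DX = {"E10.9"}
T2DM_RX = {"metformin"}

has_t2dm_dx = patients_with(records, "diagnosis", T2DM_DX)
has_t2dm_rx = patients_with(records, "medication", T2DM_RX)
has_t1dm_dx = patients_with(records, "diagnosis", T1DM_DX)

# Intersection for "diagnosis AND medication", difference for the exclusion.
cases = (has_t2dm_dx & has_t2dm_rx) - has_t1dm_dx
print(cases)  # {1}
```

Writing the criteria as named sets keeps the human-readable description and the computable form close together, which is the point of desideratum (3).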
