A Hybrid Model for Family History Information Identification and Relation Extraction (Preprint)

A Hybrid Model for Entity Identification and Relation Extraction of Family History Information (Preprint)

10.2196/preprints.22797 ◽

2020 ◽

Author(s):

Youngjun Kim ◽

Paul M Heider ◽

Isabel R H Lally ◽

Stéphane M Meystre

Keyword(s):

Family History ◽

Information Extraction ◽

Relation Extraction ◽

Free Text ◽

Data Sets ◽

Family History Information ◽

Clinical Notes ◽

History Information ◽

End To End ◽

Entity Identification

BACKGROUND Family history information is important to assess the risk of inherited medical conditions. Natural language processing has the potential to extract this information from unstructured free-text notes to improve patient care and decision-making. We describe the end-to-end information extraction system the Medical University of South Carolina team developed when participating in the 2019 n2c2/OHNLP shared task.

OBJECTIVE This task involves identifying mentions of family members and observations in electronic health record text notes, and recognizing the relations between family members, observations, and living status. Our system aims to achieve a high level of performance by integrating heuristics and advanced information extraction methods. Our efforts also include improving the performance of two subtasks by exploiting additional labeled data and clinical text-based embedding models.

METHODS We present a hybrid method that combines machine learning and rule-based approaches. We implemented an end-to-end system with multiple information extraction and attribute classification components. For entity identification, we trained bidirectional long short-term memory deep learning models. These models incorporated static word embeddings and context-dependent embeddings. We created a voting ensemble that combined the predictions of all individual models. For relation extraction, we trained two relation extraction models. The first model determined the living status of each family member. The second model identified observations associated with each family member. We implemented online gradient descent models to extract related entity pairs. As part of post-challenge efforts, we used the BioCreative/OHNLP 2018 corpus and trained new models with the union of these two data sets. We also pre-trained language models using clinical notes from the MIMIC-III clinical database.

RESULTS The voting ensemble achieved better performance than individual classifiers. In the entity identification task, the best performing system reached a precision of 78.90% and a recall of 83.84%. Our NLP system for entity identification and relation extraction ranked 3rd and 4th respectively in the challenge. Our end-to-end pipeline system substantially benefited from the combination of the two data sets. Compared to our official submission, the revised system yielded significantly better performance (p < 0.05) with F1-scores of 86.02% and 72.48% for entity identification and relation extraction, respectively.

CONCLUSIONS We demonstrated that a hybrid model could be used to successfully extract family history information recorded in unstructured free-text notes. In this study, our approach of entity identification as a sequence labeling problem produced satisfactory results. Our post-challenge efforts significantly improved performance by leveraging additional labeled data and using word vector representations learned from large collections of clinical notes.
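For readers comparing the figures in this abstract: an F1-score is the harmonic mean of precision and recall. The short sketch below applies that standard formula to the reported precision (78.90%) and recall (83.84%). The function name `f1_score` is illustrative and does not come from the paper. The abstract does not say which system configuration these two figures belong to, so the result should not be expected to equal the 86.02% reported for the revised system.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale, e.g. percent)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract, in percent.
reported_precision = 78.90
reported_recall = 83.84

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 81.30%"
```

The harmonic mean always lies between the two inputs and sits closer to the smaller one, which is why the value lands nearer the precision than the recall.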

Family history information in biomedical research

Journal of Continuing Education in the Health Professions ◽

10.1002/chp.1340210405 ◽

2001 ◽

Vol 21 (4) ◽

pp. 215-223 ◽

Cited By ~ 6

Author(s):

Kenneth S. Kendler

Keyword(s):

Family History ◽

Biomedical Research ◽

Family History Information ◽

History Information

Use of Family History Information for Neural Tube Defect Prevention

American Journal of Health Education ◽

10.1080/19325037.2011.10599200 ◽

2011 ◽

Vol 42 (5) ◽

pp. 296-308

Author(s):

Ridgely Fisk Green ◽

Joan Ehrhardt ◽

Margaret F. Ruttenber ◽

Richard S. Olney

Keyword(s):

Family History ◽

Neural Tube ◽

Neural Tube Defect ◽

Family History Information ◽

History Information ◽

Defect Prevention

Assessment of Family History Information in Case-Control Cancer Studies

American Journal of Epidemiology ◽

10.1093/oxfordjournals.aje.a115954 ◽

1991 ◽

Vol 133 (8) ◽

pp. 757-765 ◽

Cited By ~ 15

Author(s):

Pamela H. Phillips ◽

Martha S. Linet ◽

Emily L. Harris

Keyword(s):

Family History ◽

Case Control ◽

Family History Information ◽

History Information ◽

Cancer Studies

Correction: Extracting Family History Information From Electronic Health Records: Natural Language Processing Analysis

JMIR Medical Informatics ◽

10.2196/30153 ◽

2021 ◽

Vol 9 (5) ◽

pp. e30153

Author(s):

Maciej Rybinski ◽

Xiang Dai ◽

Sonit Singh ◽

Sarvnaz Karimi ◽

Anthony Nguyen

Keyword(s):

Natural Language Processing ◽

Family History ◽

Electronic Health Records ◽

Natural Language ◽

Language Processing ◽

Family History Information ◽

Health Records ◽

History Information ◽

Electronic Health

From the NIH: Family history information enables physicians to recognize genetically 'at risk' patients

JAMA ◽

10.1001/jama.243.1.19 ◽

1980 ◽

Vol 243 (1) ◽

pp. 19-20 ◽

Cited By ~ 1

Keyword(s):

At Risk ◽

Family History ◽

Family History Information ◽

History Information ◽

Risk Patients

Identification and Referral of Families at High Risk for Cancer Susceptibility

Journal of Clinical Oncology ◽

10.1200/jco.2002.20.2.528 ◽

2002 ◽

Vol 20 (2) ◽

pp. 528-537 ◽

Cited By ~ 54

Author(s):

Kevin M. Sweet ◽

Terry L. Bradley ◽

Judith A. Westman

Keyword(s):

Risk Assessment ◽

Family History ◽

High Risk ◽

Computer Program ◽

Medical Record ◽

Comprehensive Cancer Center ◽

Family History Information ◽

Cancer History ◽

History Information ◽

Family Cancer History

PURPOSE: Obtainment of family history and accurate assessment is essential for the identification of families at risk for hereditary cancer. Our study compared the extent to which the family cancer history in the physician medical record reflected that entered by patients directly into a touch-screen family history computer program. PATIENTS AND METHODS: The study cohort consisted of 362 patients seen at a comprehensive cancer center ambulatory clinic over a 1-year period who voluntarily used the computer program and were a mixture of new and return patients. The computer entry was assessed by genetics staff and then compared with the medical record for corroboration of family history information and appropriate physician risk assessment. RESULTS: Family history information from the medical record was available for comparison to the computer entry in 69%. It was most often completed on new patients only and not routinely updated. Of the 362 computer entries, 101 were assigned to a high-risk category. Evidence in the records confirmed 69 high-risk individuals. Documentation of physician risk assessment (ie, notation of significant family cancer history or hereditary risk) was found in only 14 of the high-risk charts. Only seven high-risk individuals (6.9%) had evidence of referral for genetic consultation. CONCLUSION: This study demonstrates the need to collect family history information on all new and established patients in order to perform adequate cancer risk assessment. The lack of identification of patients at highest risk seems to be directly correlated with insufficient data collection, risk assessment, and documentation by medical staff.

Family history information extraction via deep joint learning

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-019-0995-5 ◽

2019 ◽

Vol 19 (S10) ◽

Cited By ~ 7

Author(s):

Xue Shi ◽

Dehuan Jiang ◽

Yuanhang Huang ◽

Xiaolong Wang ◽

Qingcai Chen ◽

...

Keyword(s):

Family History ◽

Language Processing ◽

Family Members ◽

Decision Making Process ◽

Family History Information ◽

Health Records ◽

Joint Learning ◽

Clinical Text ◽

History Information ◽

Entity Identification

Abstract Background Family history (FH) information, including family members, their side of the family (i.e., maternal or paternal), their living status, and their observations (diseases), is very important in the decision-making process of disorder diagnosis and treatment. However, FH information cannot be used directly by computers, as it is always embedded in unstructured text in electronic health records (EHRs). Natural language processing (NLP) is needed to extract FH information from clinical text. The BioCreative/OHNLP2018 challenge included a task on FH extraction (i.e., task1) with two subtasks: (1) entity identification, identifying family members and their observations (diseases) mentioned in clinical text; (2) family history extraction, extracting the side of the family, living status, and observations of family members. For this task, we propose a system based on deep joint learning methods to extract FH information. Our system achieves the highest F1-scores of 0.8901 on subtask1 and 0.6359 on subtask2, respectively.

Evaluation of family history information within clinical documents and adequacy of HL7 clinical statement and clinical genomics family history models for its representation: a case report

Journal of the American Medical Informatics Association ◽

10.1136/jamia.2009.002238 ◽

2010 ◽

Vol 17 (3) ◽

pp. 337-340 ◽

Cited By ~ 15

Author(s):

Genevieve B Melton ◽

Nandhini Raman ◽

Elizabeth S Chen ◽

Indra Neil Sarkar ◽

Serguei Pakhomov ◽

...

Keyword(s):

Case Report ◽

Family History ◽

Family History Information ◽

Clinical Genomics ◽

History Information

Interinformant reliability of family history information on psychiatric disorders in relatives

European Archives of Psychiatry and Clinical Neuroscience ◽

10.1007/s004060050025 ◽

1998 ◽

Vol 248 (2) ◽

pp. 104-109 ◽

Cited By ~ 17

Author(s):

R. Heun ◽

H. Müller

Keyword(s):

Family History ◽

Psychiatric Disorders ◽

Family History Information ◽

History Information
