Prediction of the 1-Year Risk of Incident Lung Cancer: Prospective Study Using Electronic Health Records from the State of Maine (Preprint)

Mapping Intimacies ◽

10.2196/preprints.13260 ◽

2018 ◽

Author(s):

Xiaofang Wang ◽

Yan Zhang ◽

Shiying Hao ◽

Le Zheng ◽

Jiayu Liao ◽

...

Keyword(s):

Lung Cancer ◽

At Risk ◽

High Risk ◽

Electronic Health Records ◽

Prediction Model ◽

Risk Prediction ◽

Prospective Cohort ◽

Retrospective Cohort ◽

Health Records ◽

History Of

BACKGROUND Lung cancer is the leading cause of cancer death worldwide. Early detection of individuals at risk of lung cancer is critical to reduce the mortality rate.

OBJECTIVE The aim of this study was to develop and validate a prospective risk prediction model to identify patients at risk of new incident lung cancer within the next 1 year in the general population.

METHODS Data from individual patient electronic health records (EHRs) were extracted from the Maine Health Information Exchange network. The study population consisted of patients with at least one EHR between April 1, 2016, and March 31, 2018, who had no history of lung cancer. A retrospective cohort (N=873,598) and a prospective cohort (N=836,659) were formed for model construction and validation. An Extreme Gradient Boosting (XGBoost) algorithm was adopted to build the model. It assigned a score to each individual to quantify the probability of a new incident lung cancer diagnosis from October 1, 2016, to September 30, 2017. The model was trained with the clinical profile in the retrospective cohort from the preceding 6 months and validated with the prospective cohort to predict the risk of incident lung cancer from April 1, 2017, to March 31, 2018.

RESULTS The model had an area under the curve (AUC) of 0.881 (95% CI 0.873-0.889) in the prospective cohort. Two thresholds of 0.0045 and 0.01 were applied to the predictive scores to stratify the population into low-, medium-, and high-risk categories. The incidence of lung cancer in the high-risk category (579/53,922, 1.07%) was 7.7 times higher than that in the overall cohort (1167/836,659, 0.14%). Age, a history of pulmonary diseases and other chronic diseases, medications for mental disorders, and social disparities were found to be associated with new incident lung cancer.

CONCLUSIONS We retrospectively developed and prospectively validated an accurate risk prediction model of new incident lung cancer occurring in the next 1 year. Through statistical learning from the statewide EHR data in the preceding 6 months, our model was able to identify statewide high-risk patients, which will benefit the population health through establishment of preventive interventions or more intensive surveillance.
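The stratification and the incidence comparison reported in the RESULTS can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: the two thresholds and the case counts come from the abstract, while the function names and the handling of scores that fall exactly on a threshold (assigned here to the higher category) are assumptions made for illustration.

```python
# Thresholds reported in the abstract for the predictive score.
LOW_MEDIUM_THRESHOLD = 0.0045
MEDIUM_HIGH_THRESHOLD = 0.01


def risk_category(score: float) -> str:
    """Map a predictive score to a risk category.

    Assumption: a score equal to a threshold falls in the higher
    category; the abstract does not state the boundary rule.
    """
    if score < LOW_MEDIUM_THRESHOLD:
        return "low"
    if score < MEDIUM_HIGH_THRESHOLD:
        return "medium"
    return "high"


def incidence(cases: int, population: int) -> float:
    """Proportion of the population with a new diagnosis."""
    return cases / population


# Counts reported in the abstract (prospective cohort).
high_risk_incidence = incidence(579, 53_922)
overall_incidence = incidence(1167, 836_659)
relative_incidence = high_risk_incidence / overall_incidence

print(f"high-risk incidence: {high_risk_incidence:.2%}")
print(f"overall incidence:   {overall_incidence:.2%}")
print(f"ratio:               {relative_incidence:.1f}")
```

Running the sketch recovers the reported 1.07%, 0.14%, and the ratio of 7.7 from the raw counts.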


Personalized Risk Assessment in Never, Light, and Heavy Smokers in a prospective cohort in Taiwan

Scientific Reports ◽

10.1038/srep36482 ◽

2016 ◽

Vol 6 (1) ◽

Cited By ~ 8

Author(s):

Xifeng Wu ◽

Chi Pang Wen ◽

Yuanqing Ye ◽

MinKwang Tsai ◽

Christopher Wen ◽

...

Keyword(s):

Lung Cancer ◽

Family History ◽

Lung Function ◽

Risk Prediction ◽

Prospective Cohort ◽

Lung Function Test ◽

Function Test ◽

History Of ◽

Never Smokers ◽

Heavy Smokers

Abstract: The objective of this study was to develop markedly improved risk prediction models for lung cancer using a prospective cohort of 395,875 participants in Taiwan. Discriminatory accuracy was measured by generating receiver operating characteristic curves and estimating the area under the curve (AUC). In multivariate Cox regression analysis, age, gender, smoking pack-years, family history of lung cancer, personal cancer history, BMI, lung function test, and serum biomarkers such as carcinoembryonic antigen (CEA), bilirubin, alpha fetoprotein (AFP), and C-reactive protein (CRP) were identified and included in an integrative risk prediction model. The AUC in the overall population was 0.851 (95% CI = 0.840–0.862), with never smokers at 0.806 (95% CI = 0.790–0.819), light smokers at 0.847 (95% CI = 0.824–0.871), and heavy smokers at 0.732 (95% CI = 0.708–0.752). By integrating risk factors such as family history of lung cancer, CEA and AFP for light smokers, and lung function test (Maximum Mid-Expiratory Flow, MMEF25–75%), AFP and CEA for never smokers, light and never smokers with cancer risks as high as those within heavy smokers could be identified. The risk model for heavy smokers can allow us to stratify heavy smokers into subgroups with distinct risks, which, if applied to low-dose computed tomography (LDCT) screening, may greatly reduce false positives.
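For readers unfamiliar with the AUC used above as the measure of discriminatory accuracy, the following minimal sketch computes it in its rank-based form: the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, with ties counted as one half. This illustrates the metric only; it is not the study's analysis, which is based on Cox regression and may use a time-dependent variant. The function name and the toy scores and labels are invented for the example.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, non-case) pairs in which the case
    has the higher score; tied pairs contribute one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Toy data (synthetic, for illustration only): 1 = case, 0 = non-case.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
print(toy_auc)
```

On the toy data, three of the four case/non-case pairs are ranked correctly, giving an AUC of 0.75. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.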


COVID-19-Related Neuropsychiatric Symptoms in Patients With Alcohol Abuse Conditions During the SARS-CoV-2 Pandemic: A Retrospective Cohort Study Using Real World Data From Electronic Health Records of a Tertiary Hospital

Frontiers in Neurology ◽

10.3389/fneur.2021.630566 ◽

2021 ◽

Vol 12 ◽

Author(s):

Carolina Varela Rodríguez ◽

Francisco Arias Horcajadas ◽

Cristina Martín-Arriscado Arroba ◽

Carolina Combarro Ripoll ◽

Alba Juanes Gonzalez ◽

...

Keyword(s):

At Risk ◽

Cohort Study ◽

Alcohol Abuse ◽

Retrospective Cohort ◽

Neuropsychiatric Symptoms ◽

Personal History ◽

Real World Data ◽

Health Records ◽

History Of ◽

Electronic Health

Patients with an alcohol abuse disorder exhibit several medical characteristics and social determinants that suggest a greater vulnerability to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and a worse course of coronavirus disease 2019 (COVID-19) once infected. During the first wave of COVID-19, most countries registered an increase in alcohol consumption. However, studies on the impact of alcohol addiction on the risk of COVID-19 infection are very scarce and inconclusive. This research offers a descriptive observational retrospective cohort study using real-world data obtained from electronic health records. We found that patients with a personal history of alcohol abuse were 8% more likely to extend their hospitalization length of stay by 1 day (95% CI = 1.04–1.12) and 15% more likely to extend their intensive care unit (ICU) length of stay (95% CI = 1.01–1.30). They were also 5.47 times more at risk of needing an ICU admission (95% CI = 1.61–18.57) and 3.54 times (95% CI = 1.51–8.30) more at risk of needing a respirator. Regarding COVID-19 symptoms, patients with a personal history of alcohol abuse were 91% more likely to exhibit dyspnea (95% CI = 1.03–3.55) and 3.15 times more at risk of showing at least one neuropsychiatric symptom (95% CI = 1.61–6.17). In addition, they showed statistically significant differences in the number of neuropsychiatric symptoms developed during the COVID-19 infection. Therefore, we strongly recommend warning of the negative consequences of alcohol abuse on COVID-19 complications. For this purpose, clinicians should systematically assess history of alcohol issues and drinking habits in all patients, especially those who seek medical advice regarding COVID-19 infection, in order to predict the severity of symptoms and potential complications. Moreover, this information should be included, in a structured field, in the electronic health record to facilitate the automatic, real-time extraction of data useful for evaluating the decision-making process in a dynamic context.
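The abstract mixes two ways of reporting effects: "X% more likely" and "N times more at risk", while all confidence intervals are given on the ratio scale. The sketch below shows how the two relate and how to check whether an interval excludes the null value of 1. It is an illustration under assumptions: the point estimates 1.08, 1.15, and 1.91 are inferred from the percentages in the text, the abstract does not state which ratio measure was used, and the function names are hypothetical.

```python
def percent_more_likely(ratio: float) -> float:
    """Convert a ratio estimate (e.g. 1.08) to a percent increase (8%)."""
    return (ratio - 1.0) * 100.0


def ci_excludes_null(lower: float, upper: float, null: float = 1.0) -> bool:
    """True when the confidence interval does not contain the null ratio."""
    return not (lower <= null <= upper)


# (outcome, ratio, CI lower, CI upper); ratios marked "inferred" are
# back-calculated from the percentages quoted in the abstract.
findings = [
    ("longer hospital stay", 1.08, 1.04, 1.12),        # inferred from 8%
    ("longer ICU stay", 1.15, 1.01, 1.30),             # inferred from 15%
    ("ICU admission", 5.47, 1.61, 18.57),
    ("respirator", 3.54, 1.51, 8.30),
    ("dyspnea", 1.91, 1.03, 3.55),                     # inferred from 91%
    ("any neuropsychiatric symptom", 3.15, 1.61, 6.17),
]

for outcome, ratio, lower, upper in findings:
    print(
        f"{outcome}: ratio {ratio:.2f} "
        f"(+{percent_more_likely(ratio):.0f}%), "
        f"CI excludes 1: {ci_excludes_null(lower, upper)}"
    )
```

Every interval quoted in the abstract lies above 1, although several (for example 1.61–18.57) are wide, which indicates considerable uncertainty in the size of the effect.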


Adverse Childhood Experiences (ACEs) in English Electronic Health Records of Linked Mothers and Children: Validation Study Using a Multistage Risk-Prediction Model

SSRN Electronic Journal ◽

10.2139/ssrn.3937569 ◽

2021 ◽

Author(s):

Shabeer Syed ◽

Arturo González-Izquierdo ◽

Janice Allister ◽

Gene Feder ◽

Leah Li ◽

...

Keyword(s):

Electronic Health Records ◽

Prediction Model ◽

Risk Prediction ◽

Validation Study ◽

Adverse Childhood Experiences ◽

Risk Prediction Model ◽

Health Records ◽

Childhood Experiences ◽

Adverse Childhood ◽

Mothers And Children


Prediction of Incident Hypertension Within the Next Year: Prospective Study Using Statewide Electronic Health Records and Machine Learning (Preprint)

10.2196/preprints.9268 ◽

2017 ◽

Author(s):

Chengyin Ye ◽

Tianyun Fu ◽

Shiying Hao ◽

Yan Zhang ◽

Oliver Wang ◽

...

Keyword(s):

Machine Learning ◽

Essential Hypertension ◽

Electronic Health Records ◽

Prediction Model ◽

Risk Prediction ◽

Risk Prediction Model ◽

Risk Category ◽

Health Records ◽

Incident Hypertension ◽

Category Score

BACKGROUND As a high-prevalence health condition, hypertension is clinically costly, difficult to manage, and often leads to severe and life-threatening diseases such as cardiovascular disease (CVD) and stroke.

OBJECTIVE The aim of this study was to develop and validate prospectively a risk prediction model of incident essential hypertension within the following year.

METHODS Data from individual patient electronic health records (EHRs) were extracted from the Maine Health Information Exchange network. Retrospective (N=823,627, calendar year 2013) and prospective (N=680,810, calendar year 2014) cohorts were formed. A machine learning algorithm, XGBoost, was adopted in the process of feature selection and model building. It generated an ensemble of classification trees and assigned a final predictive risk score to each individual.

RESULTS The 1-year incident hypertension risk model attained areas under the curve (AUCs) of 0.917 and 0.870 in the retrospective and prospective cohorts, respectively. Risk scores were calculated and stratified into five risk categories, with 4526 out of 381,544 patients (1.19%) in the lowest risk category (score 0-0.05) and 21,050 out of 41,329 patients (50.93%) in the highest risk category (score 0.4-1) receiving a diagnosis of incident hypertension in the following 1 year. Type 2 diabetes, lipid disorders, CVDs, mental illness, clinical utilization indicators, and socioeconomic determinants were recognized as driving or associated features of incident essential hypertension. The very high risk population mainly comprised elderly (age>50 years) individuals with multiple chronic conditions, especially those receiving medications for mental disorders. Disparities were also found in social determinants, including some community-level factors associated with higher risk and others that were protective against hypertension.

CONCLUSIONS With statewide EHR datasets, our study prospectively validated an accurate 1-year risk prediction model for incident essential hypertension. Our real-time predictive analytic model has been deployed in the state of Maine, providing implications in interventions for hypertension and related diseases and hopefully enhancing hypertension care.


A risk prediction model for selecting high-risk population for computed tomography lung cancer screening in China

Lung Cancer ◽

10.1016/j.lungcan.2021.11.015 ◽

2021 ◽

Author(s):

Lan-Wei Guo ◽

Zhang-Yan Lyu ◽

Qing-Cheng Meng ◽

Li-Yang Zheng ◽

Qiong Chen ◽

...

Keyword(s):

Computed Tomography ◽

Lung Cancer ◽

Cancer Screening ◽

High Risk ◽

Prediction Model ◽

Risk Prediction ◽

Lung Cancer Screening ◽

Risk Prediction Model ◽

High Risk Population ◽

Risk Population


Third external replication of an individualised transdiagnostic prediction model for the automatic detection of individuals at risk of psychosis using electronic health records

Schizophrenia Research ◽

10.1016/j.schres.2021.01.005 ◽

2021 ◽

Vol 228 ◽

pp. 403-409

Author(s):

Stephen Puntis ◽

Dominic Oliver ◽

Paolo Fusar-Poli

Keyword(s):

At Risk ◽

Electronic Health Records ◽

Prediction Model ◽

Automatic Detection ◽

Health Records ◽

Electronic Health


Prognostication Stereotype of Patients Morbidity and Mortality by Extraction of E-Health Records

International Journal of Emerging Research in Management and Technology ◽

10.23956/ijermt.v6i6.271 ◽

2018 ◽

Vol 6 (6) ◽

pp. 215

Author(s):

Sunitha .T ◽

Shyamala .J ◽

Annie Jesus Suganthi Rani.A

Keyword(s):

At Risk ◽

Risk Prediction ◽

Health Risks ◽

Preventive Intervention ◽

Classification Problem ◽

Health Examination ◽

Disease Area ◽

Health Records ◽

Mortality And Morbidity ◽

Main Motive

Data mining offers an innovative approach to the prognostication stereotype of patients' health risks. The large volume of Electronic Health Records (EHRs) collected over the years provides a rich base for risk analysis and prediction. An EHR contains digitally stored healthcare information about an individual, such as observations, laboratory tests, diagnostic reports, medications, procedures, patient identifying information, and allergies. A special type of EHR is the Health Examination Record (HER) from annual general health check-ups. Identifying participants at risk based on their current and past HERs is important for early warning and preventive intervention. By "risk", we mean unwanted outcomes such as mortality and morbidity. This approach is limited by the classification problem and consequently is not informative about the specific disease area in which a person is at risk. A limited amount of data extracted from the health record is not sufficient for accurate risk prediction. The main motive of this project is risk prediction that classifies a progressively developing situation when the majority of the data are unlabeled.


Incidence, risk factors, and health service burden of sequelae of campylobacter and non-typhoidal salmonella infections in England, 2000–2015: A retrospective cohort study using linked electronic health records

Journal of Infection ◽

10.1016/j.jinf.2020.05.027 ◽

2020 ◽

Vol 81 (2) ◽

pp. 221-230 ◽

Cited By ~ 1

Author(s):

Oluwaseun B. Esan ◽

Rafael Perera ◽

Noel McCarthy ◽

Mara Violato ◽

Thomas R. Fanshawe

Keyword(s):

Risk Factors ◽

Cohort Study ◽

Health Service ◽

Electronic Health Records ◽

Retrospective Cohort Study ◽

Retrospective Cohort ◽

Health Records ◽

Salmonella Infections ◽

Incidence Risk ◽

Electronic Health


HiTANet: Hierarchical Time-Aware Attention Networks for Risk Prediction on Electronic Health Records

Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining ◽

10.1145/3394486.3403107 ◽

2020 ◽

Author(s):

Junyu Luo ◽

Muchao Ye ◽

Cao Xiao ◽

Fenglong Ma

Keyword(s):

Electronic Health Records ◽

Risk Prediction ◽

Health Records ◽

Attention Networks ◽

Electronic Health ◽

Time Aware


Accuracy of the Vancouver Lung Cancer Risk Prediction Model Compared With That of Radiologists

CHEST Journal ◽

10.1016/j.chest.2019.04.002 ◽

2019 ◽

Vol 156 (1) ◽

pp. 112-119 ◽

Cited By ~ 2

Author(s):

Heber MacMahon ◽

Feng Li ◽

Yulei Jiang ◽

Samuel G. Armato

Keyword(s):

Lung Cancer ◽

Cancer Risk ◽

Prediction Model ◽

Risk Prediction ◽

Lung Cancer Risk ◽

Risk Prediction Model
