Multi-Task Learning with Recurrent Neural Networks for ARDS Prediction using only EHR Data: Model Development and Validation Study (Preprint)

2022
Author(s):  
Carson Lam ◽  
Rahul Thapa ◽  
Jenish Maharjan ◽  
Keyvan Rahmani ◽  
Chak Foon Tso ◽  
...  

BACKGROUND Acute respiratory distress syndrome (ARDS) is a condition that is often considered to have broad and subjective diagnostic criteria and is associated with significant mortality and morbidity. Early and accurate prediction of ARDS and related conditions such as hypoxemia and sepsis could allow timely administration of therapies, leading to improved patient outcomes. OBJECTIVE To explore how multi-label classification in the clinical setting can exploit the underlying dependencies between ARDS and related conditions to improve early prediction of ARDS. METHODS The electronic health record dataset included 40,073 patient encounters from 7 hospitals from 4/20/2018 to 3/17/2021. A recurrent neural network (RNN) was trained using data from 5 hospitals, and external validation was conducted on data from the other 2 hospitals. In addition to ARDS, 12 target labels for related conditions such as sepsis, hypoxemia, and COVID-19 were used to train the model to classify a total of 13 outputs. As a comparator, XGBoost models were developed for each of the 13 target labels. Model performance was assessed using the area under the receiver operating characteristic curve (AUROC). Heatmaps of attention scores were generated to provide interpretability for the neural networks. Finally, cluster analysis was performed to identify potential phenotypic subgroups of ARDS patients. RESULTS The single RNN model trained to classify 13 outputs outperformed the XGBoost model for ARDS prediction, achieving an AUROC of 0.842 on the external test sets. Performance improved as models were trained on an increasing number of tasks. Earlier diagnosis of ARDS nearly doubled the rate of in-hospital survival. Cluster analysis revealed distinct ARDS subgroups, some of which had similar mortality rates but different clinical presentations.
CONCLUSIONS The RNN model presented in this paper can be used as an early warning system to stratify patients at risk of developing any of the multiple target outcomes, providing practitioners with the means to take early action.
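
The shared-encoder, multi-label setup the abstract describes can be illustrated with a minimal forward pass. This is a tiny Elman RNN in NumPy, not the authors' architecture; the feature count, hidden size, and weights are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyMultiTaskRNN:
    """A shared recurrent encoder feeding 13 independent sigmoid heads
    (ARDS plus 12 related conditions). Sizes and weights are invented."""

    def __init__(self, n_features=10, hidden=16, n_tasks=13):
        s = 1.0 / np.sqrt(hidden)
        self.Wx = rng.normal(0.0, s, (n_features, hidden))  # input -> hidden
        self.Wh = rng.normal(0.0, s, (hidden, hidden))      # hidden -> hidden
        self.Wo = rng.normal(0.0, s, (hidden, n_tasks))     # hidden -> 13 task logits

    def forward(self, x):
        # x: (time, features), one encounter's sequence of EHR observations
        h = np.zeros(self.Wh.shape[0])
        for t in range(x.shape[0]):
            h = np.tanh(x[t] @ self.Wx + h @ self.Wh)  # shared representation
        return sigmoid(h @ self.Wo)  # per-task probabilities, not a softmax

model = TinyMultiTaskRNN()
probs = model.forward(rng.normal(size=(24, 10)))  # e.g. 24 hourly time steps
print(probs.shape)
```

Because the 13 outputs are independent sigmoids rather than one softmax, each condition gets its own probability and labels can co-occur, which is what lets the shared encoder exploit dependencies between ARDS and the related conditions.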

2019
Vol 98 (10)
pp. 1088-1095
Author(s):  
J. Krois ◽  
C. Graetz ◽  
B. Holtfreter ◽  
P. Brinkmann ◽  
T. Kocher ◽  
...  

Prediction models learn patterns from available data (training) and are then validated on new data (testing). Prediction modeling is increasingly common in dental research. We aimed to evaluate how different model development and validation steps affect the predictive performance of tooth loss prediction models in patients with periodontitis. Two independent cohorts (627 patients, 11,651 teeth) were followed over a mean ± SD of 18.2 ± 5.6 y (Kiel cohort) and 6.6 ± 2.9 y (Greifswald cohort). Tooth loss and 10 patient- and tooth-level predictors were recorded. The impact of different model development and validation steps was evaluated: 1) model complexity (logistic regression, recursive partitioning, random forest, extreme gradient boosting), 2) sample size (full data set or 10%, 25%, or 75% of cases dropped at random), 3) prediction periods (maximum 10, 15, or 20 y or uncensored), and 4) validation schemes (internal or external by center/time). Tooth loss was generally a rare event (880 teeth were lost). All models showed limited sensitivity but high specificity. Patients' age, tooth loss at baseline, and probing pocket depths showed high variable importance. More complex models (random forest, extreme gradient boosting) had no consistent advantages over simpler ones (logistic regression, recursive partitioning). Internal (in-sample) validation overestimated predictive power (area under the curve up to 0.90), while external (out-of-sample) validation found lower areas under the curve (range 0.62 to 0.82). Reducing the sample size decreased predictive power, particularly for the more complex models. Censoring the prediction period had only limited impact. When the model was trained in one period and tested in another, model outcomes were similar to the base case, indicating that temporal validation is a valid option. No model showed higher accuracy than the no-information rate.
In conclusion, none of the developed models would be useful in a clinical setting, despite their high accuracy. Rigorous development and external validation should be applied during modeling and reported accordingly.
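
The in-sample versus out-of-sample gap the study reports (AUC up to 0.90 internally vs. 0.62–0.82 externally) can be demonstrated on synthetic data. The real cohorts are not public, so `make_cohort` and its rare-event outcome are stand-ins for one center's patient/tooth-level data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

def make_cohort(n):
    """Synthetic stand-in for one center's predictors and a rare outcome."""
    X = rng.normal(size=(n, 10))
    p = 1.0 / (1.0 + np.exp(-(X[:, 0] + 0.5 * X[:, 1] - 2.5)))  # rare event
    return X, (rng.random(n) < p).astype(int)

X_dev, y_dev = make_cohort(2000)  # development center
X_ext, y_ext = make_cohort(2000)  # external center

model = LogisticRegression(max_iter=1000).fit(X_dev, y_dev)

# In-sample AUC is optimistic; the honest estimate comes from unseen data.
auc_internal = roc_auc_score(y_dev, model.predict_proba(X_dev)[:, 1])
auc_external = roc_auc_score(y_ext, model.predict_proba(X_ext)[:, 1])
print(round(auc_internal, 2), round(auc_external, 2))
```

The same pattern, evaluating once on the training sample and once on data from a different center or period, is the "internal vs. external by center/time" scheme the study varies.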


2019
Vol 20 (8)
pp. 1897
Author(s):  
Shuaibing He ◽  
Tianyuan Ye ◽  
Ruiying Wang ◽  
Chenyang Zhang ◽  
Xuelian Zhang ◽  
...  

As one of the leading causes of drug failure in clinical trials, drug-induced liver injury (DILI) seriously impedes the development of new drugs. Assessing the DILI risk of drug candidates in advance is considered an effective strategy for decreasing the rate of attrition in drug discovery. There have been continuous attempts to predict DILI, yet doing so successfully remains a substantial challenge, and a quantitative structure–activity relationship (QSAR) model that predicts DILI with satisfactory performance is urgently needed. In this work, we report a high-quality QSAR model for predicting the DILI risk of xenobiotics that combines eight effective classifiers with molecular descriptors provided by Marvin. For model development, a large-scale and diverse dataset of 1254 DILI-annotated compounds was built through a comprehensive literature retrieval. The optimal model was obtained by an ensemble method that averages the probabilities from the eight classifiers, with accuracy (ACC) of 0.783, sensitivity (SE) of 0.818, specificity (SP) of 0.748, and area under the receiver operating characteristic curve (AUC) of 0.859. For further validation, three external test sets and a large negative dataset were used. Both internal and external validation indicated that our model significantly outperformed prior studies. The data provided by the current study will also be a valuable source for future modeling and data mining.
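
The ensemble step, averaging predicted probabilities across classifiers, can be sketched with scikit-learn. Four classifiers on synthetic "descriptors" stand in for the paper's eight classifiers and Marvin descriptors, purely to keep the sketch short:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for 1254 compounds with molecular descriptors.
X, y = make_classification(n_samples=1254, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

classifiers = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=100, random_state=0),
    GradientBoostingClassifier(random_state=0),
    GaussianNB(),
]

# Soft-voting ensemble: average the predicted DILI probabilities.
probs = np.mean(
    [clf.fit(X_tr, y_tr).predict_proba(X_te)[:, 1] for clf in classifiers], axis=0
)
auc = roc_auc_score(y_te, probs)
print("ensemble AUC:", round(auc, 3))
```

Averaging probabilities (rather than majority-voting hard labels) lets confident classifiers outweigh uncertain ones, which is the design choice behind the ensemble described in the abstract.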


Flooding is a major problem globally, and especially in Surat Thani province, Thailand, where the population density along the lower Tapee River is high. Implementing an early warning system could benefit people living along the riverbanks. In this study, our aim was to build a flood prediction model using an artificial neural network (ANN) that utilizes rainfall and stream levels along the lower Tapee River to predict floods. The model was trained on a dataset of rainfall and stream levels measured at local stations. The developed flood prediction model used four input variables, namely, the rainfall amounts and stream levels at stations located in the Phrasaeng district (X.37A), the Khian Sa district (X.217), and the Phunphin district (X.5C). Model performance was evaluated using input data spanning a period of eight years (2011–2018) and compared with a support vector machine (SVM): the ANN achieved an accuracy of 97.91% versus 97.54% for the SVM. The ANN's recall (42.78%) and f-measure (52.24%) were also better, although its precision was lower. The designed flood prediction model can therefore estimate the likelihood of floods in the lower Tapee River region.
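
The combination of very high accuracy with much lower recall and f-measure is typical of rare-event classification such as flood days. The confusion counts below are invented to illustrate the arithmetic; they are not taken from the study:

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and f-measure from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_measure

# 2,920 daily records with only 36 flood days (invented counts): the huge
# true-negative block dominates accuracy while recall stays modest.
acc, prec, rec, f1 = classification_metrics(tp=15, fp=10, fn=21, tn=2874)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f-measure={f1:.4f}")
```

With counts like these, accuracy is near 99% even though fewer than half of the actual flood days are detected, which is why the abstract's recall and f-measure are the more informative numbers.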


2019
Vol 4 (6)
pp. e001801
Author(s):  
Sarah Hanieh ◽  
Sabine Braat ◽  
Julie A Simpson ◽  
Tran Thi Thu Ha ◽  
Thach D Tran ◽  
...  

Introduction Globally, an estimated 151 million children under 5 years of age still suffer from the adverse effects of stunting. We sought to develop and externally validate an early life predictive model that could be applied in infancy to accurately predict risk of stunting in preschool children. Methods We conducted two separate prospective cohort studies in Vietnam that intensively monitored children from early pregnancy until 3 years of age. They included 1168 and 475 live-born infants for model development and validation, respectively. Logistic regression on child stunting at 3 years of age was performed for model development, and the predicted probabilities for stunting were used to evaluate the performance of this model in the validation data set. Results Stunting prevalence was 16.9% (172 of 1015) in the development data set and 16.4% (70 of 426) in the validation data set. Key predictors included in the final model were paternal and maternal height, maternal weekly weight gain during pregnancy, infant sex, gestational age at birth, and infant weight and length at 6 months of age. The area under the receiver operating characteristic curve in the validation data set was 0.85 (95% confidence interval, 0.80–0.90). Conclusion This tool applied to infants at 6 months of age provided valid prediction of risk of stunting at 3 years of age using a readily available set of parental and infant measures. Further research is required to examine the impact of preventive measures introduced at 6 months of age on those identified as being at risk of growth faltering at 3 years of age.
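
Applying such a fitted logistic model to a new infant reduces to the standard risk equation p = 1 / (1 + exp(-(b0 + Σ bi·xi))). The intercept and coefficients below are placeholders, not the published model:

```python
import math

def predicted_risk(intercept, coefs, values):
    """Risk from a fitted logistic model: 1 / (1 + exp(-(b0 + sum(bi * xi)))).
    The coefficients used below are placeholders, not the published model."""
    lp = intercept + sum(b * x for b, x in zip(coefs, values))
    return 1.0 / (1.0 + math.exp(-lp))

# Hypothetical infant at 6 months; values are centered/scaled predictors
# (e.g. parental height, pregnancy weight gain, weight/length at 6 months).
risk = predicted_risk(intercept=-1.6, coefs=[-0.4, 0.7, 0.5], values=[1.0, -0.5, 0.2])
print(round(risk, 3))
```

A prediction tool of this kind needs only the coefficient table from the publication plus the infant's measurements, which is what makes the model deployable at 6 months of age.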


2021
Vol 7
Author(s):  
Kai Zhang ◽  
Shufang Zhang ◽  
Wei Cui ◽  
Yucai Hong ◽  
Gensheng Zhang ◽  
...  

Background: Many severity scores are widely used to predict clinical outcomes for critically ill patients in the intensive care unit (ICU). However, none of these scores was developed for patients identified by the sepsis-3 criteria. This study aimed to develop and validate a risk stratification score for mortality prediction in sepsis-3 patients. Methods: In this retrospective cohort study, we employed the Medical Information Mart for Intensive Care III (MIMIC-III) database for model development and the eICU database for external validation. We identified septic patients by the sepsis-3 criteria on day 1 of ICU entry. The Least Absolute Shrinkage and Selection Operator (LASSO) technique was used to select predictive variables, and we developed a sepsis mortality prediction model and an associated risk stratification score. We then compared model discrimination and calibration with other traditional severity scores. Results: For model development, we enrolled a total of 5,443 patients fulfilling the sepsis-3 criteria; 30-day mortality was 16.7%. Among 5,658 septic patients in the validation set, there were 1,135 deaths (mortality 20.1%). The score had good discrimination in the development and validation sets (area under the curve: 0.789 and 0.765). In the validation set, the calibration slope was 0.862 and the Brier score was 0.140. In the development dataset, the score divided patients into groups with low (3.2%), moderate (12.4%), high (30.7%), and very high (68.1%) mortality risk. The corresponding mortality rates in the validation dataset were 2.8%, 10.5%, 21.1%, and 51.2%. As shown by decision curve analysis, the score always had a positive net benefit. Conclusion: The score, termed the Sepsis Mortality Risk Score (SMRS), showed moderate discrimination and calibration, allowing stratification of patients according to mortality risk; further modification and external validation are still required.
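
The LASSO selection step can be sketched with scikit-learn's L1-penalized logistic regression, which zeroes out uninformative coefficients. The data and penalty strength here are illustrative, not drawn from MIMIC-III:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Synthetic stand-in for day-1 ICU features: only the first three columns
# actually drive 30-day mortality here; the other 17 are noise.
X = rng.normal(size=(1000, 20))
logit = X[:, 0] + 0.8 * X[:, 1] - 0.6 * X[:, 2] - 1.8
y = (rng.random(1000) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# An L1 (LASSO) penalty shrinks uninformative coefficients to exactly zero,
# leaving a short list of candidate predictors for the score.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
selected = np.flatnonzero(lasso.coef_[0])
print("selected predictors:", selected)
```

The surviving nonzero coefficients are the variables that would then be carried into the final mortality model and rounded into an integer risk score.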


2021
Author(s):  
Edward Korot ◽  
Nikolas Pontikos ◽  
Xiaoxuan Liu ◽  
Siegfried K Wagner ◽  
Livia Faes ◽  
...  

Abstract Deep learning may transform health care, but model development has largely depended on the availability of advanced technical expertise. Herein we present the development, by clinicians without coding, of a deep learning model that predicts reported sex from retinal fundus photographs. The model was trained on 84,743 retinal fundus photos from the UK Biobank dataset, and external validation was performed on 252 fundus photos from a tertiary ophthalmic referral center. For internal validation, the area under the receiver operating characteristic curve (AUROC) of the code-free deep learning (CFDL) model was 0.93. Sensitivity, specificity, positive predictive value (PPV), and accuracy (ACC) were 88.8%, 83.6%, 87.3%, and 86.5% for internal validation, and 83.9%, 72.2%, 78.2%, and 78.6% for external validation, respectively. Clinicians are currently unaware of distinct retinal feature variations between males and females, highlighting the importance of model explainability for this task. The model performed significantly worse when foveal pathology was present in the external validation dataset (ACC 69.4% vs. 85.4% in healthy eyes; OR 0.36, 95% CI 0.19–0.70, p = 0.0022), suggesting the fovea is a salient region for model performance. Automated machine learning (AutoML) may enable clinician-driven automated discovery of novel insights and disease biomarkers.
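
The subgroup comparison reported as OR (95% CI) follows the standard 2×2-table Wald calculation. The counts below are chosen only to reproduce accuracies similar to those quoted; they are not the paper's actual table:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table:
    a/b = correct/incorrect in one subgroup, c/d in the other."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Invented counts giving ~69.4% accuracy with foveal pathology (25/36)
# and ~85.4% in healthy eyes (187/219); not the paper's data.
or_, lo, hi = odds_ratio_ci(a=25, b=11, c=187, d=32)
print(f"OR {or_:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```

An OR below 1 with a CI excluding 1 is what supports the claim that foveal pathology significantly reduces the odds of a correct prediction.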


2021
Author(s):  
Steven J. Staffa ◽  
David Zurakowski

Summary Clinical prediction models in anesthesia and surgery research have many clinical applications, including preoperative risk stratification, with implications for decision-making, resource utilization, and costs. It is imperative that predictive algorithms and multivariable models be validated in a suitable and comprehensive way in order to establish the robustness of the model in terms of accuracy, predictive ability, reliability, and generalizability. The purpose of this article is to educate anesthesia researchers, at an introductory level, on the important statistical concepts involved in developing and validating multivariable prediction models for a binary outcome. The methods covered include assessments of discrimination and calibration through internal and external validation. An anesthesia research publication is examined to illustrate the process and presentation of multivariable prediction model development and validation for a binary outcome. Properly assessing the statistical and clinical validity of a multivariable prediction model is essential for ensuring the generalizability and reproducibility of the published tool.
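
The discrimination and calibration assessments discussed here are commonly computed as the AUROC, the calibration slope, and the Brier score. This is a sketch on synthetic data, not the publication examined in the article:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score

rng = np.random.default_rng(3)

# Synthetic binary outcome with a known risk model, standing in for a cohort.
X = rng.normal(size=(3000, 5))
p_true = 1.0 / (1.0 + np.exp(-(0.9 * X[:, 0] - 0.4 * X[:, 1])))
y = (rng.random(3000) < p_true).astype(int)

X_dev, X_val = X[:2000], X[2000:]
y_dev, y_val = y[:2000], y[2000:]

model = LogisticRegression(max_iter=1000).fit(X_dev, y_dev)
p_val = model.predict_proba(X_val)[:, 1]

# Discrimination: does the model rank events above non-events?
auc = roc_auc_score(y_val, p_val)
# Calibration slope: refit the outcome on the linear predictor; ~1.0 is ideal.
lp = np.log(p_val / (1 - p_val)).reshape(-1, 1)
slope = LogisticRegression(max_iter=1000).fit(lp, y_val).coef_[0, 0]
# Brier score: mean squared error of the probabilities themselves.
brier = brier_score_loss(y_val, p_val)
print(round(auc, 2), round(slope, 2), round(brier, 3))
```

Reporting all three on held-out data gives the combined picture of discrimination and calibration that the article recommends for a binary-outcome model.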


Author(s):  
Khalid Bouhedjar ◽  
Abdelmalek Khorief Nacereddine ◽  
Hamida Ghorab ◽  
Abdelhafid Djerourou

The simplified molecular input line entry system (SMILES) is particularly suitable for high-speed machine processing. Quantitative structure–property relationship (QSPR) models of critical temperatures were established for a dataset of 165 diverse organic compounds using hybrid optimal descriptors defined by graph and SMILES notation, built with the Monte Carlo method implemented in the CORAL software. External validation is one of the most important parts of evaluating model performance; however, previous models on the same dataset either had poor predictive power on the external test set or were never checked in this way. In the present work, the predictive ability of the model was tested using external validation. The statistical quality of the three splits is similar and good. The r2 values for the best model are r2 = 0.98 for the training set, r2 = 0.95 for the calibration set, and r2 = 0.94 for the validation set.
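
The per-split statistics reduce to r2 between observed and predicted critical temperatures; the coefficient-of-determination form below is one common definition (CORAL also reports correlation-based variants), and the temperature values are invented for illustration:

```python
def r_squared(y_obs, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot, computed per split
    (training, calibration, validation) to judge QSPR model quality."""
    mean = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean) ** 2 for o in y_obs)
    return 1 - ss_res / ss_tot

# Invented critical temperatures (K) for five compounds in one split.
obs = [540.2, 647.1, 591.8, 513.9, 617.7]
pred = [545.0, 640.0, 595.0, 510.0, 620.0]
r2 = r_squared(obs, pred)
print(round(r2, 3))
```

Computing this separately for the training, calibration, and validation splits, as the abstract does, shows whether quality holds up outside the data used for fitting.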


2021
Author(s):  
Cynthia Yang ◽  
Jan A. Kors ◽  
Solomon Ioannou ◽  
Luis H. John ◽  
Aniek F. Markus ◽  
...  

Objectives This systematic review aims to provide further insights into the conduct and reporting of clinical prediction model development and validation over time. We focus on assessing the reporting of the information necessary to enable external validation by other investigators. Materials and Methods We searched Embase, Medline, Web of Science, the Cochrane Library, and Google Scholar to identify studies that developed one or more multivariable prognostic prediction models using electronic health record (EHR) data, published in the period 2009–2019. Results We identified 422 studies that developed a total of 579 clinical prediction models using EHR data. We observed a steep increase over the years in the number of developed models. The percentage of models externally validated in the same paper remained at around 10%. Throughout 2009–2019, code lists for both the target population and the outcome definitions were provided for less than 20% of the models. For about half of the models developed using regression analysis, the final model was not completely presented. Discussion Overall, we observed limited improvement over time in the conduct and reporting of clinical prediction model development and validation. In particular, the prediction problem definition was often not clearly reported, and the final model was often not completely presented. Conclusion Improvement in the reporting of the information necessary to enable external validation by other investigators is still urgently needed to increase the clinical adoption of developed models.


Author(s):  
Isabelle Kaiser ◽  
Annette B. Pfahlberg ◽  
Wolfgang Uter ◽  
Markus V. Heppt ◽  
Marit B. Veierød ◽  
...  

The rising incidence of cutaneous melanoma over the past few decades has prompted substantial efforts to develop risk prediction models that identify people at high risk of developing melanoma, so as to facilitate targeted screening programs. We review these models with regard to study characteristics, differences in risk factor selection and assessment, and evaluation and validation methods. Our systematic literature search revealed 40 studies comprising 46 different risk prediction models eligible for the review. Altogether, 35 different risk factors were part of the models, with nevi being the most common (n = 35, 78%); little consistency in the other risk factors was observed. Results of an internal validation were reported for less than half of the studies (n = 18, 45%), and only 6 performed external validation. In terms of model performance, 29 studies assessed the discriminative ability of their models; other performance measures, e.g., calibration or clinical usefulness, were rarely reported. Because of the substantial heterogeneity in risk factor selection and assessment, as well as in methodologic aspects of model development, direct comparisons between models are hardly possible. Uniform methodologic standards for the development and validation of melanoma risk prediction models, together with reporting standards for the accompanying publications, are therefore necessary and should be made obligatory.

