Can we trust the prediction model? Demonstrating the importance of external validation by investigating the COVID-19 Vulnerability (C-19) Index across an international network of observational healthcare datasets (Preprint)

2020 ◽  
Author(s):  
Jenna M Reps ◽  
Chungsoo Kim ◽  
Ross D Williams ◽  
Aniek F. Markus ◽  
Cynthia Yang ◽  
...  

BACKGROUND SARS-CoV-2 is straining healthcare systems globally. The burden on hospitals during the pandemic could be reduced by implementing prediction models that can discriminate between patients requiring hospitalization and those who do not. The COVID-19 vulnerability (C-19) index, a model that predicts which patients will be admitted to hospital for treatment of pneumonia or pneumonia proxies, has been developed and proposed as a valuable tool for decision making during the pandemic. However, the model is at high risk of bias according to the Prediction model Risk Of Bias ASsessment Tool and has not been externally validated. OBJECTIVE To externally validate the C-19 index across a range of healthcare settings to determine how well it broadly predicts hospitalization due to pneumonia in COVID-19 cases. METHODS We followed the OHDSI framework for external validation to assess the reliability of the C-19 model. We evaluated the model on two different target populations: i) 41,381 patients who had SARS-CoV-2 at an outpatient or emergency room visit and ii) 9,429,285 patients who had influenza or related symptoms during an outpatient or emergency room visit, to predict their risk of hospitalization with pneumonia during the following 0 to 30 days. In total, we validated the model across a network of 14 databases spanning the US, Europe, Australia and Asia. RESULTS The internal validation performance of the C-19 index was a c-statistic of 0.73, and calibration was not reported by the authors. When we externally validated it by transporting it to SARS-CoV-2 data, the model obtained c-statistics of 0.36, 0.53 (0.473-0.584), and 0.56 (0.488-0.636) on Spanish, US, and South Korean datasets, respectively. The calibration was poor, with the model under-estimating risk. When validated on 12 datasets containing influenza patients across the OHDSI network, the c-statistics ranged between 0.40 and 0.68.
CONCLUSIONS The results show that the discriminative performance of the C-19 model is low for influenza cohorts, and even worse amongst COVID-19 patients in the US, Spain and South Korea. These results suggest that C-19 should not be used to aid decision making during the COVID-19 pandemic. Our findings highlight the importance of performing external validation across a range of settings, especially when a prediction model is being extrapolated to a different population. In the field of prediction, extensive validation is required to create appropriate trust in a model.
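The c-statistic reported throughout these validations is the probability that a randomly chosen patient who experienced the outcome was assigned a higher predicted risk than a randomly chosen patient who did not; 0.5 is chance-level discrimination. A minimal pure-Python sketch (the study itself used OHDSI tooling; the toy cohort below is hypothetical):

```python
from itertools import product

def c_statistic(risks, outcomes):
    """Concordance (c-)statistic: fraction of case/control pairs in which
    the case (outcome = 1) received the higher predicted risk; ties count
    as half-concordant."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_risk, control_risk in product(cases, controls):
        if case_risk > control_risk:
            concordant += 1.0
        elif case_risk == control_risk:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))

# hypothetical data: predicted risks and observed hospitalization (1 = hospitalized)
risks = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
outcomes = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
print(round(c_statistic(risks, outcomes), 3))  # ≈ 0.833 for this toy cohort
```

A value of 0.36, as observed on the Spanish dataset, is worse than chance: the model ranked non-hospitalized patients above hospitalized ones more often than not.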


10.2196/21547 ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. e21547
Author(s):  
Jenna M Reps ◽  
Chungsoo Kim ◽  
Ross D Williams ◽  
Aniek F Markus ◽  
Cynthia Yang ◽  
...  

Background SARS-CoV-2 is straining health care systems globally. The burden on hospitals during the pandemic could be reduced by implementing prediction models that can discriminate patients who require hospitalization from those who do not. The COVID-19 vulnerability (C-19) index, a model that predicts which patients will be admitted to hospital for treatment of pneumonia or pneumonia proxies, has been developed and proposed as a valuable tool for decision-making during the pandemic. However, the model is at high risk of bias according to the “prediction model risk of bias assessment” criteria, and it has not been externally validated. Objective The aim of this study was to externally validate the C-19 index across a range of health care settings to determine how well it broadly predicts hospitalization due to pneumonia in COVID-19 cases. Methods We followed the Observational Health Data Sciences and Informatics (OHDSI) framework for external validation to assess the reliability of the C-19 index. We evaluated the model on two different target populations, 41,381 patients who presented with SARS-CoV-2 at an outpatient or emergency department visit and 9,429,285 patients who presented with influenza or related symptoms during an outpatient or emergency department visit, to predict their risk of hospitalization with pneumonia during the following 0-30 days. In total, we validated the model across a network of 14 databases spanning the United States, Europe, Australia, and Asia. Results The internal validation performance of the C-19 index had a C statistic of 0.73, and the calibration was not reported by the authors. When we externally validated it by transporting it to SARS-CoV-2 data, the model obtained C statistics of 0.36, 0.53 (0.473-0.584) and 0.56 (0.488-0.636) on Spanish, US, and South Korean data sets, respectively. The calibration was poor, with the model underestimating risk. 
When validated on 12 data sets containing influenza patients across the OHDSI network, the C statistics ranged between 0.40 and 0.68. Conclusions Our results show that the discriminative performance of the C-19 index model is low for influenza cohorts and even worse among patients with COVID-19 in the United States, Spain, and South Korea. These results suggest that C-19 should not be used to aid decision-making during the COVID-19 pandemic. Our findings highlight the importance of performing external validation across a range of settings, especially when a prediction model is being extrapolated to a different population. In the field of prediction, extensive validation is required to create appropriate trust in a model.
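The under-estimation of risk reported above shows up in the simplest calibration summary: the ratio of the observed event rate to the mean predicted risk (calibration-in-the-large expressed as an O/E ratio). A minimal sketch with hypothetical numbers, not the study's own calibration analysis:

```python
def oe_ratio(risks, outcomes):
    """Observed/expected ratio: observed event rate divided by mean
    predicted risk. A value near 1 means the model is calibrated on
    average; a value above 1 means it under-estimates risk."""
    observed = sum(outcomes) / len(outcomes)
    expected = sum(risks) / len(risks)
    return observed / expected

# hypothetical validation cohort: the model predicts ~5% risk on average,
# but 15 of 100 patients were actually hospitalized (O/E ≈ 3, risk under-estimated)
risks = [0.05] * 100
outcomes = [1] * 15 + [0] * 85
print(round(oe_ratio(risks, outcomes), 2))
```

This single ratio is only a coarse check; full calibration assessment also examines agreement across the range of predicted risks, for example with a calibration plot.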


2021 ◽  
Author(s):  
Jamie L. Miller ◽  
Masafumi Tada ◽  
Michihiko Goto ◽  
Nicholas Mohr ◽  
Sangil Lee

ABSTRACTBackgroundThroughout 2020, the coronavirus disease 2019 (COVID-19) has become a threat to public health at the national and global level. There has been an immediate need for research to understand the clinical signs and symptoms of COVID-19 that can help predict deterioration, including mechanical ventilation, organ support, and death. Studies thus far have addressed the epidemiology of the disease, common presentations, and susceptibility to acquisition and transmission of the virus; however, an accurate prognostic model for severe manifestations of COVID-19 is still needed because of the limited healthcare resources available.ObjectiveThis systematic review aims to evaluate published reports of prediction models for severe illness caused by COVID-19.MethodsSearches were developed by the primary author and a medical librarian using an iterative process of gathering and evaluating terms. Comprehensive strategies, including both index and keyword methods, were devised for PubMed and EMBASE. Data on confirmed COVID-19 patients from randomized controlled studies, cohort studies, and case-control studies published between January 2020 and July 2020 were retrieved. Studies were independently assessed for risk of bias and applicability using the Prediction Model Risk Of Bias Assessment Tool (PROBAST). We collected study type, setting, sample size, type of validation, and outcomes, including intubation, ventilation, any other type of organ support, or death. The prediction models, scoring systems, their performance, and geographic locations were summarized.ResultsA primary review found 292 articles relevant based on title and abstract. After further review, 246 were excluded based on the defined inclusion and exclusion criteria. Forty-six articles were included in the qualitative analysis. Interobserver agreement on inclusion was 0.86 (95% confidence interval: 0.79 - 0.93). 
When the PROBAST tool was applied, 44 of the 46 articles were identified as having high or unclear risk of bias, or high or unclear concern for applicability. Two studies reported prediction models, the 4C Mortality Score derived from hospital data and QCOVID derived from UK general population data, that were rated as low risk of bias with low concerns for applicability.ConclusionSeveral prognostic models are reported in the literature, but many of them had concerning risk of bias and concerns regarding applicability. For most of the studies, caution is needed before use, as many will require external validation before dissemination. However, the two models found to have low risk of bias and low concerns for applicability may be useful tools.


2021 ◽  
pp. 1-9
Author(s):  
Jeff Ehresman ◽  
Daniel Lubelski ◽  
Zach Pennington ◽  
Bethany Hung ◽  
A. Karim Ahmed ◽  
...  

OBJECTIVEThe objective of this study was to evaluate the characteristics and performance of current prediction models in the fields of spine metastasis and degenerative spine disease to create a scoring system that allows direct comparison of the prediction models.METHODSA systematic search of PubMed and Embase was performed to identify relevant studies that included either the proposal of a prediction model or an external validation of a previously proposed prediction model with 1-year outcomes. Characteristics of the original study and discriminative performance of external validations were then assigned points based on thresholds from the overall cohort.RESULTSNine prediction models were included in the spine metastasis category, while six prediction models were included in the degenerative spine category. After assigning the proposed utility score to the spine metastasis prediction models, only one reached the grade of excellent, while two were graded as good, three as fair, and three as poor. Of the six included degenerative spine models, one reached the excellent grade, while three were graded as good, one as fair, and one as poor.CONCLUSIONSAs interest in utilizing predictive analytics in spine surgery increases, there is a concomitant increase in the number of published prediction models that differ in methodology and performance. These models must be evaluated before being applied to patient care. To begin addressing this issue, the authors proposed a grading system that compares these models based on various metrics related to their original design as well as internal and external validation. Ultimately, this may aid clinicians in determining the relative validity and usability of a given model.


BMJ Open ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. e044687
Author(s):  
Lauren S. Peetluk ◽  
Felipe M. Ridolfi ◽  
Peter F. Rebeiro ◽  
Dandan Liu ◽  
Valeria C Rolla ◽  
...  

ObjectiveTo systematically review and critically evaluate prediction models developed to predict tuberculosis (TB) treatment outcomes among adults with pulmonary TB.DesignSystematic review.Data sourcesPubMed, Embase, Web of Science and Google Scholar were searched for studies published from 1 January 1995 to 9 January 2020.Study selection and data extractionStudies that developed a model to predict pulmonary TB treatment outcomes were included. Study screening, data extraction and quality assessment were conducted independently by two reviewers. Study quality was evaluated using the Prediction model Risk Of Bias Assessment Tool. Data were synthesised with narrative review and in tables and figures.Results14 739 articles were identified, 536 underwent full-text review and 33 studies presenting 37 prediction models were included. Model outcomes included death (n=16, 43%), treatment failure (n=6, 16%), default (n=6, 16%) or a composite outcome (n=9, 25%). Most models (n=30, 81%) measured discrimination (median c-statistic=0.75; IQR: 0.68–0.84), and 17 (46%) reported calibration, often the Hosmer-Lemeshow test (n=13). Nineteen (51%) models were internally validated, and six (16%) were externally validated. Eighteen (54%) studies mentioned missing data, and of those, half (n=9) used complete case analysis. The most common predictors included age, sex, extrapulmonary TB, body mass index, chest X-ray results, previous TB and HIV. Risk of bias varied across studies, but all studies had high risk of bias in their analysis.ConclusionsTB outcome prediction models are heterogeneous with disparate outcome definitions, predictors and methodology. We do not recommend applying any in clinical settings without external validation, and encourage future researchers to adhere to guidelines for developing and reporting of prediction models.Trial registrationThe study was registered on the international prospective register of systematic reviews PROSPERO (CRD42020155782).
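The Hosmer-Lemeshow test mentioned above assesses calibration by sorting patients into groups by predicted risk and comparing observed with expected event counts per group; the resulting statistic is referred to a chi-square distribution with (groups − 2) degrees of freedom. A minimal sketch, not the reviewed studies' code, using a hypothetical two-group toy example:

```python
def hosmer_lemeshow_stat(risks, outcomes, groups=10):
    """Hosmer-Lemeshow statistic: sort patients by predicted risk, split
    them into roughly equal groups, and sum (observed - expected)^2 over
    the binomial variance within each group. The statistic is compared
    to a chi-square distribution with (groups - 2) degrees of freedom."""
    pairs = sorted(zip(risks, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        observed = sum(y for _, y in chunk)
        expected = sum(r for r, _ in chunk)
        variance = expected * (1 - expected / len(chunk))
        if variance > 0:  # skip degenerate groups with zero variance
            stat += (observed - expected) ** 2 / variance
    return stat

# hypothetical data: two risk strata, slightly miscalibrated in each
risks = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9]
outcomes = [0, 0, 0, 1, 1, 1, 1, 0]
print(round(hosmer_lemeshow_stat(risks, outcomes, groups=2), 2))  # ≈ 2.0
```

The Hosmer-Lemeshow p-value is sensitive to sample size and grouping choices, which is one reason reviews such as this one also look for calibration plots rather than the test alone.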


2022 ◽  
pp. 1-11
Author(s):  
Andrew S. Moriarty ◽  
Nicholas Meader ◽  
Kym I. E. Snell ◽  
Richard D. Riley ◽  
Lewis W. Paton ◽  
...  

Background Relapse and recurrence of depression are common, contributing to the overall burden of depression globally. Accurate prediction of relapse or recurrence while patients are well would allow the identification of high-risk individuals and may effectively guide the allocation of interventions to prevent relapse and recurrence. Aims To review prognostic models developed to predict the risk of relapse, recurrence, sustained remission, or recovery in adults with remitted major depressive disorder. Method We searched the Cochrane Library (current issue); Ovid MEDLINE (1946 onwards); Ovid Embase (1980 onwards); Ovid PsycINFO (1806 onwards); and Web of Science (1900 onwards) up to May 2021. We included development and external validation studies of multivariable prognostic models. We assessed risk of bias of included studies using the Prediction model risk of bias assessment tool (PROBAST). Results We identified 12 eligible prognostic model studies (11 unique prognostic models): 8 model development-only studies, 3 model development and external validation studies and 1 external validation-only study. Multiple estimates of performance measures were not available and meta-analysis was therefore not necessary. Eleven of the 12 included studies were assessed as being at high overall risk of bias and none examined clinical utility. Conclusions Due to the high risk of bias of the included studies, poor predictive performance and limited external validation of the models identified, presently available clinical prediction models for relapse and recurrence of depression are not yet sufficiently developed for deployment in clinical settings. There is a need for improved prognosis research in this clinical area, and future studies should conform to best practice methodological and reporting guidelines.


2021 ◽  
Author(s):  
Esmee Venema ◽  
Benjamin S Wessler ◽  
Jessica K Paulus ◽  
Rehab Salah ◽  
Gowri Raman ◽  
...  

AbstractObjectiveTo assess whether the Prediction model Risk Of Bias ASsessment Tool (PROBAST) and a shorter version of this tool can identify clinical prediction models (CPMs) that perform poorly at external validation.Study Design and SettingWe evaluated risk of bias (ROB) on 102 CPMs from the Tufts CPM Registry, comparing PROBAST to a short form consisting of six PROBAST items anticipated to best identify high ROB. We then applied the short form to all CPMs in the Registry with at least one validation and assessed the change in discrimination (dAUC) between the derivation and the validation cohorts (n=1,147).ResultsPROBAST classified 98/102 CPMs as high ROB. The short form identified 96 of these 98 as high ROB (98% sensitivity), with perfect specificity. In the full CPM Registry, 529/556 CPMs (95%) were classified as high ROB, 20 (4%) low ROB, and 7 (1%) unclear ROB. Median change in discrimination was significantly smaller in low ROB models (dAUC −0.9%, IQR −6.2%–4.2%) compared to high ROB models (dAUC −11.7%, IQR −33.3%–2.6%; p<0.001).ConclusionHigh ROB is pervasive among published CPMs. It is associated with poor performance at validation, supporting the application of PROBAST or a shorter version in CPM reviews.What is new: High risk of bias is pervasive among published clinical prediction models. High risk of bias identified with PROBAST is associated with poorer model performance at validation. A subset of questions can distinguish between models with high and low risk of bias.
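The change-in-discrimination measure (dAUC) compares a model's c-statistic at derivation with its c-statistic at external validation. The exact formula used in the study is not reproduced here; one common convention, assumed below, expresses the change as a percentage of the discrimination above chance (AUC − 0.5), so that a model dropping to chance level loses 100% of its discrimination:

```python
def dauc_percent(auc_derivation, auc_validation):
    """Percent change in discrimination when a model is transported to an
    external validation cohort, relative to discrimination above chance
    (AUC - 0.5). Negative values indicate a loss of discrimination."""
    if auc_derivation <= 0.5:
        raise ValueError("derivation AUC must exceed chance (0.5)")
    return 100.0 * (auc_validation - auc_derivation) / (auc_derivation - 0.5)

# hypothetical CPM: c-statistic 0.78 at derivation, 0.66 at external validation,
# i.e. a loss of roughly 43% of its above-chance discrimination
print(round(dauc_percent(0.78, 0.66), 1))
```

Under this convention the reported medians (−0.9% for low-ROB models versus −11.7% for high-ROB models) correspond to small versus substantial losses of above-chance discrimination.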


2021 ◽  
Vol In Press (In Press) ◽  
Author(s):  
Samaneh Asgari ◽  
Davood Khalili ◽  
Farhad Hosseinpanah ◽  
Farzad Hadaegh

Objectives: This study aimed to provide an overview of prediction models of undiagnosed type 2 diabetes mellitus (U-T2DM) or incident T2DM (I-T2DM) using the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) checklist and the prediction model risk of bias assessment tool (PROBAST). Data Sources: Both the PubMed and EMBASE databases were searched to guarantee adequate and efficient coverage. Study Selection: Articles published between December 2011 and October 2019 were considered. Data Extraction: For each article, information on model development requirements, discrimination measures, calibration, overall performance, clinical usefulness, overfitting, and risk of bias (ROB) was reported. Results: The median (interquartile range; IQR) number of participants across the 46 study populations for model development was 5711 (1971 - 27426) for I-T2DM and 2457 (2060 - 6995) for U-T2DM. The most commonly reported predictors were age and body mass index, and only the Qrisk-2017 study included social factors (e.g., Townsend score). Univariable analysis was reported in 46% of the studies, and the variable selection procedure was not clear in 17.4% of them. Moreover, internal and external validation was reported in 43% of the studies, while over 63% of them reported calibration. The median (IQR) AUC for I-T2DM models was 0.78 (0.74 - 0.82); the corresponding value for studies derived before October 2011 was 0.80 (0.77 - 0.83). The highest discrimination index was reported for Qrisk-2017, with C-statistics of 0.89 for women and 0.87 for men. Low ROB was assessed at 18% for I-T2DM and 41% for U-T2DM. Conclusions: The prediction models were of intermediate to poor quality in several aspects of model development and validation, even when a comprehensive protocol was available. 
Generally, despite new risk factors or new methodological aspects, the newly developed models did not increase our capability to screen for or predict T2DM, mainly because of weaknesses in the analysis, in particular the lack of external validation of the prediction models.


2020 ◽  
Author(s):  
Fernanda Gonçalves Silva ◽  
Leonardo Oliveira Pena Costa ◽  
Mark J Hancock ◽  
Gabriele Alves Palomo ◽  
Luciola da Cunha Menezes Costa ◽  
...  

Abstract Background: The prognosis of acute low back pain is generally favourable in terms of pain and disability; however, outcomes vary substantially between individual patients. Clinical prediction models help in estimating the likelihood of an outcome at a certain time point. There are existing clinical prediction models focused on prognosis for patients with low back pain. To date, there is only one previous systematic review summarising the discrimination of validated clinical prediction models for the prognosis of patients with low back pain of less than 3 months duration. The aim of this systematic review is to identify existing developed and/or validated clinical prediction models for the prognosis of patients with low back pain of less than 3 months duration, and to summarise their performance in terms of discrimination and calibration. Methods: MEDLINE, Embase and CINAHL databases will be searched, from the inception of these databases until January 2020. Eligibility criteria will be: (1) prognostic model development studies with or without external validation, or prognostic external validation studies with or without model updating; (2) with adults aged 18 or over, with ‘recent onset’ low back pain (i.e. less than 3 months duration), with or without leg pain; (3) outcomes of pain, disability, sick leave or days absent from work or return-to-work status, and self-reported recovery; and (4) studies with a follow-up of at least 12 weeks duration. The risk of bias of the included studies will be assessed with the Prediction model Risk Of Bias ASsessment Tool, and the overall quality of evidence will be rated using the Hierarchy of Evidence for Clinical Prediction Rules. Discussion: This systematic review will identify, appraise, and summarize evidence on the performance of existing prediction models for the prognosis of low back pain, and may help clinicians choose the most suitable prediction model to better inform patients about their likely prognosis. 
Systematic review registration: PROSPERO reference number CRD42020160988


2021 ◽  
Vol 10 (1) ◽  
pp. 93
Author(s):  
Mahdieh Montazeri ◽  
Ali Afraz ◽  
Mitra Montazeri ◽  
Sadegh Nejatzadeh ◽  
Fatemeh Rahimi ◽  
...  

Introduction: Our aim in this study was to summarize information on the use of intelligent models for predicting and diagnosing the Coronavirus disease 2019 (COVID-19) to help early and timely diagnosis of the disease.Material and Methods: A systematic literature search included articles published until 20 April 2020 in PubMed, Web of Science, IEEE, ProQuest, Scopus, bioRxiv, and medRxiv databases. The search strategy consisted of two groups of keywords: A) Novel coronavirus, B) Machine learning. Two reviewers independently assessed original papers to determine eligibility for inclusion in this review. Studies were critically reviewed for risk of bias using the prediction model risk of bias assessment tool.Results: We gathered 1650 articles through database searches. After full-text assessment, 31 articles were included. Neural networks and deep neural network variants were the most popular machine learning types. Of the five models that the authors claimed were externally validated, we considered external validation for only four of them. The area under the curve (AUC) in internal validation of prognostic models varied from 0.94 to 0.97. The AUC in diagnostic models varied from 0.84 to 0.99, and the AUC in external validation of diagnostic models varied from 0.73 to 0.94. Our analysis found that all but two studies had a high risk of bias for reasons such as a low number of participants and lack of external validation.Conclusion: Diagnostic and prognostic models for COVID-19 show good to excellent discriminative performance. However, these models are at high risk of bias, largely because of a low number of participants and lack of external validation. Future studies should address these concerns. Sharing data and experiences for the development, validation, and updating of COVID-19 related prediction models is needed. 

