Development and validation of Prediction models for Risks of complications in Early-onset Pre-eclampsia (PREP): a prospective cohort study

2017 ◽  
Vol 21 (18) ◽  
pp. 1-100 ◽  
Author(s):  
Shakila Thangaratinam ◽  
John Allotey ◽  
Nadine Marlin ◽  
Ben W Mol ◽  
Peter Von Dadelszen ◽  
...  

Background: The prognosis of early-onset pre-eclampsia (before 34 weeks’ gestation) is variable. Accurate prediction of complications is required to plan appropriate management in high-risk women. Objective: To develop and validate prediction models for outcomes in early-onset pre-eclampsia. Design: Prospective cohort for model development, with validation in two external data sets. Setting: Model development: 53 obstetric units in the UK. Model transportability: PIERS (Pre-eclampsia Integrated Estimate of RiSk for mothers) and PETRA (Pre-Eclampsia TRial Amsterdam) studies. Participants: Pregnant women with early-onset pre-eclampsia. Sample size: Nine hundred and forty-six women in the model development data set and 850 women (634 in PIERS, 216 in PETRA) in the transportability (external validation) data sets. Predictors: The predictors were identified from systematic reviews of tests to predict complications in pre-eclampsia and were prioritised by Delphi survey. Main outcome measures: The primary outcome was the composite of adverse maternal outcomes established using Delphi surveys. The secondary outcome was the composite of fetal and neonatal complications. Analysis: We developed two prediction models: a logistic regression model (PREP-L) to assess the overall risk of any maternal outcome until postnatal discharge and a survival analysis model (PREP-S) to obtain individual risk estimates at daily intervals from diagnosis until 34 weeks. Shrinkage was used to adjust for overoptimism of predictor effects. 
For internal validation (of the full models in the development data) and external validation (of the reduced models in the transportability data), we computed the ability of the models to discriminate between those with and without poor outcomes (c-statistic) and the agreement between predicted and observed risk (calibration slope). Results: The PREP-L model included maternal age, gestational age at diagnosis, medical history, systolic blood pressure, urine protein-to-creatinine ratio, platelet count, serum urea concentration, oxygen saturation, baseline treatment with antihypertensive drugs and administration of magnesium sulphate. The PREP-S model additionally included exaggerated tendon reflexes and serum alanine aminotransferase and creatinine concentrations. Both models showed good discrimination for maternal complications, with an optimism-adjusted c-statistic of 0.82 [95% confidence interval (CI) 0.80 to 0.84] for PREP-L and 0.75 (95% CI 0.73 to 0.78) for PREP-S in the internal validation. External validation of the reduced PREP-L model showed good performance, with a c-statistic of 0.81 (95% CI 0.77 to 0.85) in the PIERS cohort and 0.75 (95% CI 0.64 to 0.86) in the PETRA cohort for maternal complications, and the model calibrated well, with slopes of 0.93 (95% CI 0.72 to 1.10) and 0.90 (95% CI 0.48 to 1.32), respectively. In the PIERS data set, the reduced PREP-S model had a c-statistic of 0.71 (95% CI 0.67 to 0.75) and a calibration slope of 0.67 (95% CI 0.56 to 0.79). Low gestational age at diagnosis, high urine protein-to-creatinine ratio, increased serum urea concentration, treatment with antihypertensive drugs, magnesium sulphate administration, abnormal uterine artery Doppler scan findings and estimated fetal weight below the 10th centile were associated with fetal complications. Conclusions: The PREP-L model provided individualised risk estimates in early-onset pre-eclampsia to plan management of high- or low-risk individuals. 
The PREP-S model has the potential to be used as a triage tool for risk assessment. The impact of model use on outcomes needs further evaluation. Trial registration: Current Controlled Trials ISRCTN40384046. Funding: The National Institute for Health Research Health Technology Assessment programme.
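The c-statistic quoted throughout this abstract is the probability that a randomly chosen woman who experienced the adverse outcome received a higher predicted risk than a randomly chosen woman who did not. A minimal pure-Python sketch of that pairwise definition, on toy data (not the PREP cohort):

```python
def c_statistic(outcomes, risks):
    """Concordance index: fraction of (event, non-event) pairs in which
    the event case received the higher predicted risk; ties count 0.5."""
    event_risks = [r for y, r in zip(outcomes, risks) if y == 1]
    nonevent_risks = [r for y, r in zip(outcomes, risks) if y == 0]
    concordant = 0.0
    for re_ in event_risks:
        for rn in nonevent_risks:
            if re_ > rn:
                concordant += 1.0
            elif re_ == rn:
                concordant += 0.5
    return concordant / (len(event_risks) * len(nonevent_risks))

# Six hypothetical women: 1 = adverse maternal outcome, with model risks
outcomes = [0, 0, 0, 1, 1, 0]
risks = [0.10, 0.20, 0.15, 0.80, 0.60, 0.65]
print(c_statistic(outcomes, risks))  # 0.875: 7 of 8 pairs are concordant
```

A value of 0.5 is chance-level discrimination; the 0.82 reported for PREP-L means 82% of such pairs are ordered correctly.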

2019 ◽  
Vol 4 (6) ◽  
pp. e001801
Author(s):  
Sarah Hanieh ◽  
Sabine Braat ◽  
Julie A Simpson ◽  
Tran Thi Thu Ha ◽  
Thach D Tran ◽  
...  

Introduction: Globally, an estimated 151 million children under 5 years of age still suffer from the adverse effects of stunting. We sought to develop and externally validate an early life predictive model that could be applied in infancy to accurately predict risk of stunting in preschool children. Methods: We conducted two separate prospective cohort studies in Vietnam that intensively monitored children from early pregnancy until 3 years of age. They included 1168 and 475 live-born infants for model development and validation, respectively. Logistic regression on child stunting at 3 years of age was performed for model development, and the predicted probabilities for stunting were used to evaluate the performance of this model in the validation data set. Results: Stunting prevalence was 16.9% (172 of 1015) in the development data set and 16.4% (70 of 426) in the validation data set. Key predictors included in the final model were paternal and maternal height, maternal weekly weight gain during pregnancy, infant sex, gestational age at birth, and infant weight and length at 6 months of age. The area under the receiver operating characteristic curve in the validation data set was 0.85 (95% confidence interval 0.80–0.90). Conclusion: This tool applied to infants at 6 months of age provided valid prediction of risk of stunting at 3 years of age using a readily available set of parental and infant measures. Further research is required to examine the impact of preventive measures introduced at 6 months of age on those identified as being at risk of growth faltering at 3 years of age.
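Because the final model is a logistic regression, applying it to a new infant reduces to a linear predictor pushed through the logistic function. The coefficients below are purely illustrative placeholders (the abstract does not report the fitted coefficients); the intercept is chosen so that a child with no risk indicators gets roughly the ~17% baseline prevalence:

```python
import math

# Hypothetical, illustrative coefficients -- NOT the published model.
COEF = {
    "intercept": -1.6,             # baseline log-odds (~17% risk)
    "short_maternal_height": 0.9,  # 1 if maternal height < 150 cm (assumed cut-off)
    "preterm_birth": 0.8,          # 1 if gestational age < 37 weeks
    "low_weight_6m": 1.2,          # 1 if 6-month weight < 10th centile
}

def stunting_risk(child):
    """p = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))) over binary indicators."""
    lp = COEF["intercept"] + sum(COEF[k] * child.get(k, 0)
                                 for k in COEF if k != "intercept")
    return 1.0 / (1.0 + math.exp(-lp))

print(round(stunting_risk({}), 3))                        # 0.168
print(round(stunting_risk({"short_maternal_height": 1,
                           "preterm_birth": 1,
                           "low_weight_6m": 1}), 3))      # 0.786
```

The same mechanics apply with the real continuous predictors (parental height, weight gain, 6-month anthropometry); only the coefficient table changes.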


BMJ Open ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. e040778
Author(s):  
Vineet Kumar Kamal ◽  
Ravindra Mohan Pandey ◽  
Deepak Agrawal

Objective: To develop and validate a simple risk score chart to estimate the probability of poor outcomes in patients with severe head injury (HI). Design: Retrospective. Setting: Level-1, government-funded trauma centre, India. Participants: Patients with severe HI admitted to the neurosurgery intensive care unit during 19 May 2010–31 December 2011 (n=946) for model development and, further, data from the same centre with the same inclusion criteria from 1 January 2012 to 31 July 2012 (n=284) for external validation of the model. Outcome(s): In-hospital mortality and unfavourable outcome at 6 months. Results: A total of 39.5% and 70.7% had in-hospital mortality and unfavourable outcome, respectively, in the development data set. Multivariable logistic regression analysis of routinely collected admission characteristics revealed the independent predictors (those whose 95% confidence interval (CI) of the odds ratio (OR) did not contain 1): for in-hospital mortality, age (51–60, >60 years), motor score (1, 2, 4), pupillary reactivity (none), presence of hypotension, effaced basal cisterns and traumatic subarachnoid haemorrhage/intraventricular haematoma; for unfavourable outcome, age (41–50, 51–60, >60 years), motor score (1–4), pupillary reactivity (none, one), unequal limb movement and presence of hypotension. The discriminative ability (area under the receiver operating characteristic curve (95% CI)) of the score chart for in-hospital mortality and the 6-month outcome was excellent in the development data set (0.890 (0.867 to 0.912) and 0.894 (0.869 to 0.918), respectively), in the internal validation data set using the bootstrap resampling method (0.889 (0.867 to 0.909) and 0.893 (0.867 to 0.915), respectively) and in the external validation data set (0.871 (0.825 to 0.916) and 0.887 (0.842 to 0.932), respectively). 
Calibration showed good agreement between observed outcome rates and predicted risks in the development and external validation data sets (p>0.05). Conclusion: For clinical decision making, these score charts can be used to predict outcomes in new patients with severe HI in India and similar settings.
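The internal validation above relied on bootstrap resampling. The sketch below shows one common variant, a percentile bootstrap confidence interval for the AUC of fixed predictions; it is a simplification under assumed toy data, not the authors' exact procedure, which would also refit the score in each resample:

```python
import random

def auc(y, p):
    """Pairwise AUC/c-statistic; ties between risks count 0.5."""
    ev = [pi for yi, pi in zip(y, p) if yi == 1]
    ne = [pi for yi, pi in zip(y, p) if yi == 0]
    s = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in ev for b in ne)
    return s / (len(ev) * len(ne))

def bootstrap_auc_ci(y, p, B=1000, seed=7):
    """95% percentile interval from B bootstrap resamples of (y, p)."""
    rng = random.Random(seed)
    n, stats = len(y), []
    while len(stats) < B:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        if 0 < sum(yb) < n:  # resample must contain both classes
            stats.append(auc(yb, [p[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * B)], stats[int(0.975 * B)]

# Toy outcomes and predicted risks (illustrative only)
y = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
p = [0.1, 0.2, 0.3, 0.35, 0.4, 0.45, 0.5, 0.6, 0.65, 0.7, 0.75, 0.9]
lo, hi = bootstrap_auc_ci(y, p)
print(lo, hi)
```

With real cohort sizes (n=946 here) the interval narrows considerably compared with this 12-patient toy example.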


BMJ Open ◽  
2017 ◽  
Vol 7 (8) ◽  
pp. e014607 ◽  
Author(s):  
Marion Fahey ◽  
Anthony Rudd ◽  
Yannick Béjot ◽  
Charles Wolfe ◽  
Abdel Douiri

Introduction: Stroke is a leading cause of adult disability and death worldwide. The neurological impairments associated with stroke prevent patients from performing basic daily activities and have enormous impact on families and caregivers. Practical and accurate tools to assist in predicting outcome after stroke at patient level can provide significant aid for patient management. Furthermore, prediction models of this kind can be useful for clinical research, health economics, policymaking and clinical decision support. Methods: 2869 patients with first-ever stroke from the South London Stroke Register (SLSR) (1995–2004) will be included in the development cohort. We will use information captured after baseline to construct multilevel models and a Cox proportional hazards model to predict cognitive impairment, functional outcome and mortality up to 5 years after stroke. Repeated random subsampling validation (Monte Carlo cross-validation) will be evaluated in model development. Data from participants recruited to the stroke register (2005–2014) will be used for temporal validation of the models. Data from participants recruited to the Dijon Stroke Register (1985–2015) will be used for external validation. Discrimination, calibration and clinical utility of the models will be presented. Ethics: Patients, or their relatives for patients who cannot consent, gave written informed consent to participate in stroke-related studies within the SLSR. The SLSR design was approved by the ethics committees of Guy’s and St Thomas’ NHS Foundation Trust, Kings College Hospital, Queens Square and Westminster Hospitals (London). The Dijon Stroke Registry was approved by the Comité National des Registres and the InVS and has authorisation of the Commission Nationale de l’Informatique et des Libertés.


2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
A Youssef

Abstract Study question: Which models that predict pregnancy outcome in couples with unexplained RPL exist, and what is the performance of the most used model? Summary answer: We identified seven prediction models; none followed the recommended prediction model development steps. Moreover, the most used model showed poor predictive performance. What is known already: RPL remains unexplained in 50–75% of couples. For these couples, there is no effective treatment option and clinical management rests on supportive care. An essential part of supportive care consists of counselling on the prognosis of subsequent pregnancies. Multiple prediction models exist; however, the quality and validity of these models varies. In addition, the prediction model developed by Brigham et al is the most widely used model but has never been externally validated. Study design, size, duration: We performed a systematic review to identify prediction models for pregnancy outcome after unexplained RPL. In addition, we performed an external validation of the Brigham model in a retrospective cohort consisting of 668 couples with unexplained RPL who visited our RPL clinic between 2004 and 2019. Participants/materials, setting, methods: A systematic search was performed in December 2020 in PubMed, Embase, Web of Science and the Cochrane Library to identify relevant studies. Eligible studies were selected and assessed according to the TRIPOD guidelines, covering topics on model performance and validation. The performance of the Brigham model in predicting live birth was evaluated through calibration and discrimination, in which the observed pregnancy rates were compared with the predicted pregnancy rates. Main results and the role of chance: Seven models were compared and assessed according to the TRIPOD statement. This resulted in two studies of low, three of moderate and two of above-average reporting quality. 
These studies did not follow the recommended steps for model development and did not calculate a sample size. Furthermore, the predictive performance of none of these models was internally or externally validated. We performed an external validation of the Brigham model. Calibration showed that the model overestimated live birth rates and made too-extreme predictions, with a negative calibration intercept of –0.52 (95% CI –0.68 to –0.36) and a calibration slope of 0.39 (95% CI 0.07 to 0.71). The discriminative ability of the model was very low, with a concordance statistic of 0.55 (95% CI 0.50 to 0.59). Limitations, reasons for caution: None of the studies specifically named their models prediction models; therefore, models may have been missed in the selection process. The external validation cohort used a retrospective design, in which only the first pregnancy after intake was registered. Follow-up time was not limited, which is important in counselling couples with unexplained RPL. Wider implications of the findings: Currently, there are no suitable models that predict pregnancy outcome after RPL. Moreover, a model is needed that includes several variables, so that the prognosis is individualised, and that incorporates factors from both the female and the male partner, to enable a couple-specific prognosis. Trial registration number: Not applicable
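The calibration intercept and slope reported for the Brigham model come from a logistic recalibration: regress the observed outcome on the logit of the predicted probability. A slope below 1 (here 0.39) means the predictions are too extreme; a negative intercept (here –0.52) means risks are overestimated on average. A compact Newton-Raphson sketch of that fit, on toy data:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def calibration_intercept_slope(y, p, iters=25):
    """Fit y ~ sigmoid(a + b * logit(p)) by Newton-Raphson.
    Perfect calibration gives a = 0, b = 1."""
    x = [math.log(pi / (1.0 - pi)) for pi in p]  # logit of predictions
    a, b = 0.0, 1.0
    for _ in range(iters):
        mu = [sigmoid(a + b * xi) for xi in x]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))               # score, intercept
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x)) # score, slope
        w = [mi * (1.0 - mi) for mi in mu]                       # IRLS weights
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break
        a += (h11 * g0 - h01 * g1) / det  # Newton step: H^-1 * gradient
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Predictions of 0.2 where events occur 1/3 of the time, and 0.8 where
# they occur 2/3: predictions are twice too extreme on the logit scale,
# so the fitted slope shrinks to 0.5.
y = [1, 0, 0, 1, 1, 0]
p = [0.2, 0.2, 0.2, 0.8, 0.8, 0.8]
a, b = calibration_intercept_slope(y, p)
print(round(a, 3), round(b, 3))  # 0.0 0.5
```

In practice this two-parameter refit is also the standard first step of model updating when external calibration is poor.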


2019 ◽  
Vol 98 (10) ◽  
pp. 1088-1095 ◽  
Author(s):  
J. Krois ◽  
C. Graetz ◽  
B. Holtfreter ◽  
P. Brinkmann ◽  
T. Kocher ◽  
...  

Prediction models learn patterns from available data (training) and are then validated on new data (testing). Prediction modeling is increasingly common in dental research. We aimed to evaluate how different model development and validation steps affect the predictive performance of tooth loss prediction models of patients with periodontitis. Two independent cohorts (627 patients, 11,651 teeth) were followed over a mean ± SD 18.2 ± 5.6 y (Kiel cohort) and 6.6 ± 2.9 y (Greifswald cohort). Tooth loss and 10 patient- and tooth-level predictors were recorded. The impact of different model development and validation steps was evaluated: 1) model complexity (logistic regression, recursive partitioning, random forest, extreme gradient boosting), 2) sample size (full data set or 10%, 25%, or 75% of cases dropped at random), 3) prediction periods (maximum 10, 15, or 20 y or uncensored), and 4) validation schemes (internal or external by centers/time). Tooth loss was generally a rare event (880 teeth were lost). All models showed limited sensitivity but high specificity. Patients’ age and tooth loss at baseline as well as probing pocket depths showed high variable importance. More complex models (random forest, extreme gradient boosting) had no consistent advantages over simpler ones (logistic regression, recursive partitioning). Internal validation (in sample) overestimated the predictive power (area under the curve up to 0.90), while external validation (out of sample) found lower areas under the curve (range 0.62 to 0.82). Reducing the sample size decreased the predictive power, particularly for more complex models. Censoring the prediction period had only limited impact. When the model was trained in one period and tested in another, model outcomes were similar to the base case, indicating temporal validation as a valid option. No model showed higher accuracy than the no-information rate. 
In conclusion, none of the developed models would be useful in a clinical setting, despite high accuracy. During modeling, rigorous development and external validation should be applied and reported accordingly.
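The closing comparison against the no-information rate is worth making explicit: with rare events, always predicting the majority class already yields high accuracy. A sketch using the tooth-level counts quoted above:

```python
def no_information_rate(outcomes):
    """Accuracy of always predicting the majority class."""
    prevalence = sum(outcomes) / len(outcomes)
    return max(prevalence, 1.0 - prevalence)

# 880 of 11,651 teeth were lost (~7.6% prevalence), so a model that
# never predicts tooth loss is already right about 92% of the time --
# which is why high accuracy alone does not imply clinical usefulness.
teeth = [1] * 880 + [0] * (11651 - 880)
print(round(no_information_rate(teeth), 3))  # 0.924
```

Any candidate model must beat this baseline (and show adequate sensitivity) before it can claim clinical value.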


Author(s):  
Jutaek Oh ◽  
Craig Lyon ◽  
Simon Washington ◽  
Bhagwant Persaud ◽  
Joe Bared

A national-level safety analysis tool is needed to complement existing analytical tools for assessment of the safety impacts of roadway design alternatives. FHWA has sponsored the development of the Interactive Highway Safety Design Model (IHSDM), which is roadway design and redesign software that estimates the safety effects of alternative designs. Considering the importance of IHSDM in shaping the future of safety-related transportation investment decisions, FHWA justifiably sponsored research with the sole intent of independently validating some of the statistical models and algorithms in IHSDM. Statistical model validation aims to accomplish many important tasks, including (a) assessment of the logical defensibility of proposed models, (b) assessment of the transferability of models over future time periods and across different geographic locations, and (c) identification of areas in which future model improvements should be made. These three activities are reported for five proposed types of rural intersection crash prediction models. The internal validation of the model revealed that the crash models potentially suffer from omitted variables that affect safety, site selection and countermeasure selection bias, poorly measured and surrogate variables, and misspecification of model functional forms. The external validation indicated the inability of models to perform on par with model estimation performance. Recommendations for improving the state of the practice from this research include the systematic conduct of carefully designed before-and-after studies, improvements in data standardization and collection practices, and the development of analytical methods to combine the results of before-and-after studies with cross-sectional studies in a meaningful and useful way.


2012 ◽  
Vol 8 (3) ◽  
pp. 251-262 ◽  
Author(s):  
Nehalennia van Hanegem ◽  
Maria C Breijer ◽  
Brent C Opmeer ◽  
Ben WJ Mol ◽  
Anne Timmermans

Postmenopausal bleeding is associated with an elevated risk of having endometrial cancer. The aim of this review is to give an overview of existing prediction models for endometrial cancer in women with postmenopausal bleeding. In a systematic search of the literature, we identified nine prognostic studies, for which we assessed the quality, the different phases of development and the performance. From these data, we identified the most important predictor variables. None of the identified models underwent external validation or impact analysis. Models including power Doppler showed the best performance in internal validation, but power Doppler is not easily accessible in general gynaecological practice. We conclude that there are indications that the first step in the approach to women with postmenopausal bleeding should be to distinguish between women at low risk and women at high risk of endometrial carcinoma, and that the next step should be to refer high-risk patients for further (invasive) testing.


PLoS ONE ◽  
2021 ◽  
Vol 16 (11) ◽  
pp. e0258338
Author(s):  
Aljoscha Benjamin Hwang ◽  
Guido Schuepfer ◽  
Mario Pietrini ◽  
Stefan Boes

Introduction: Readmissions after an acute care hospitalization are relatively common, costly to the health care system, and associated with significant burden for patients. As one way to reduce costs and simultaneously improve quality of care, hospital readmissions receive increasing interest from policy makers. It is only relatively recently that strategies were developed with the specific aim of reducing unplanned readmissions, using prediction models to identify patients at risk. EPIC's Risk of Unplanned Readmission model promises superior performance, but it has only been validated for the US setting. Therefore, the main objective of this study is to externally validate EPIC's Risk of Unplanned Readmission model and to compare it with the internationally widely used LACE+ index and the SQLape® tool, a Swiss national quality of care indicator. Methods: A monocentric, retrospective, diagnostic cohort study was conducted. The study included inpatients who were discharged between 1 January 2018 and 31 December 2019 from the Lucerne Cantonal Hospital, a tertiary-care provider in Central Switzerland. The study endpoint was an unplanned 30-day readmission. Models were replicated using the original intercept and beta coefficients as reported; otherwise, score generators provided by the developers were used. For external validation, discrimination of the scores under investigation was assessed by calculating the area under the receiver operating characteristic curve (AUC). Calibration was assessed with the Hosmer-Lemeshow χ² goodness-of-fit test. This report adheres to the TRIPOD statement for reporting of prediction models. Results: At least 23,116 records were included. For discrimination, EPIC's prediction model, the LACE+ index and the SQLape® tool had AUCs of 0.692 (95% CI 0.676–0.708), 0.703 (95% CI 0.687–0.719) and 0.705 (95% CI 0.690–0.720), respectively. The Hosmer-Lemeshow χ² tests had values of p<0.001. 
Conclusion: In summary, EPIC's model showed less favorable performance than its comparators. It may be assumed, with caution, that the model's complexity has hampered its wide generalizability; model updating is warranted.


2022 ◽  
Vol 8 ◽  
Author(s):  
Jinzhang Li ◽  
Ming Gong ◽  
Yashutosh Joshi ◽  
Lizhong Sun ◽  
Lianjun Huang ◽  
...  

Background: Acute renal failure (ARF) is the most common major complication following cardiac surgery for acute aortic syndrome (AAS) and worsens the postoperative prognosis. Our aim was to establish a machine learning prediction model for ARF occurrence in AAS patients. Methods: We included AAS patient data from nine medical centers (n = 1,637) and analyzed the incidence of ARF and the risk factors for postoperative ARF. We used data from six medical centers to compare the performance of four machine learning models and performed internal validation to identify AAS patients who developed postoperative ARF. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve was used to compare the performance of the predictive models. We compared the performance of the optimal machine learning prediction model with that of traditional prediction models. Data from three medical centers were used for external validation. Results: The eXtreme Gradient Boosting (XGBoost) algorithm performed best in the internal validation process (AUC = 0.82), better than both the logistic regression (LR) prediction model (AUC = 0.77, p < 0.001) and the traditional scoring systems. Upon external validation, the XGBoost prediction model (AUC = 0.81) also performed better than both the LR prediction model (AUC = 0.75, p = 0.03) and the traditional scoring systems. We created an online application based on the XGBoost prediction model. Conclusions: We have developed a machine learning model with better predictive performance than traditional LR prediction models and other existing risk scoring systems for postoperative ARF. This model can provide early warnings when high-risk patients are identified, enabling clinicians to take prompt measures.


2021 ◽  
Vol 8 ◽  
Author(s):  
Ming-Hui Hung ◽  
Ling-Chieh Shih ◽  
Yu-Ching Wang ◽  
Hsin-Bang Leu ◽  
Po-Hsun Huang ◽  
...  

Objective: This study aimed to develop machine learning-based prediction models to predict masked hypertension and masked uncontrolled hypertension using the clinical characteristics of patients at a single outpatient visit. Methods: Data were derived from two cohorts in Taiwan. The first cohort included 970 hypertensive patients recruited from six medical centers between 2004 and 2005, which were split into a training set (n = 679), a validation set (n = 146), and a test set (n = 145) for model development and internal validation. The second cohort included 416 hypertensive patients recruited from a single medical center between 2012 and 2020, which was used for external validation. We used 33 clinical characteristics as candidate variables to develop models based on logistic regression (LR), random forest (RF), eXtreme Gradient Boosting (XGBoost), and artificial neural network (ANN). Results: The four models featured high sensitivity and high negative predictive value (NPV) in internal validation (sensitivity = 0.914–1.000; NPV = 0.853–1.000) and external validation (sensitivity = 0.950–1.000; NPV = 0.875–1.000). The RF, XGBoost, and ANN models showed much higher area under the receiver operating characteristic curve (AUC) (0.799–0.851 in internal validation, 0.672–0.837 in external validation) than the LR model. Among the models, the RF model, composed of 6 predictor variables, had the best overall performance in both internal and external validation (AUC = 0.851 and 0.837; sensitivity = 1.000 and 1.000; specificity = 0.609 and 0.580; NPV = 1.000 and 1.000; accuracy = 0.766 and 0.721, respectively). Conclusion: An effective machine learning-based predictive model that requires data from only a single clinic visit may help to identify masked hypertension and masked uncontrolled hypertension.
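The metric pattern reported here (high sensitivity and NPV with moderate specificity, the profile of a good rule-out screen) falls directly out of the confusion matrix. A small sketch with made-up labels chosen to mirror that pattern:

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, NPV and accuracy from binary labels.
    Assumes both classes and some negatives are present (no zero
    denominators)."""
    tp = sum(1 for t, q in zip(y_true, y_pred) if t == 1 and q == 1)
    tn = sum(1 for t, q in zip(y_true, y_pred) if t == 0 and q == 0)
    fp = sum(1 for t, q in zip(y_true, y_pred) if t == 0 and q == 1)
    fn = sum(1 for t, q in zip(y_true, y_pred) if t == 1 and q == 0)
    return {
        "sensitivity": tp / (tp + fn),  # masked-hypertension cases caught
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),          # confidence in a negative screen
        "accuracy": (tp + tn) / len(y_true),
    }

# Every true case is flagged (sensitivity and NPV = 1.0) at the cost of
# one false positive (specificity 2/3), as with the RF model above.
m = classification_metrics([1, 1, 0, 0, 0], [1, 1, 1, 0, 0])
print(m)  # sensitivity 1.0, specificity ~0.667, npv 1.0, accuracy 0.8
```

For a screening model like this one, a perfect NPV is the operative property: a negative result safely rules out masked hypertension, and false positives are resolved by follow-up ambulatory monitoring.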

