Comparison of Multivariable Logistic Regression and Other Machine Learning Algorithms for Prognostic Prediction Studies in Pregnancy Care: Systematic Review and Meta-Analysis

10.2196/16503 ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. e16503
Author(s):  
Herdiantri Sufriyana ◽  
Atina Husnayain ◽  
Ya-Lin Chen ◽  
Chao-Yang Kuo ◽  
Onkar Singh ◽  
...  

Background Predictions in pregnancy care are complex because of interactions among multiple factors. Hence, pregnancy outcomes are not easily predicted by a single predictor using only one algorithm or modeling method. Objective This study aims to review and compare the predictive performances between logistic regression (LR) and other machine learning algorithms for developing or validating a multivariable prognostic prediction model for pregnancy care to inform clinicians’ decision making. Methods Research articles from MEDLINE, Scopus, Web of Science, and Google Scholar were reviewed following several guidelines for a prognostic prediction study, including a risk of bias (ROB) assessment. We report the results based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Studies were primarily framed as PICOTS (population, index, comparator, outcomes, timing, and setting): Population: men or women in procreative management, pregnant women, and fetuses or newborns; Index: multivariable prognostic prediction models using non-LR algorithms for risk classification to inform clinicians’ decision making; Comparator: the models applying an LR; Outcomes: pregnancy-related outcomes of procreation or pregnancy outcomes for pregnant women and fetuses or newborns; Timing: pre-, inter-, and peripregnancy periods (predictors), at the pregnancy, delivery, and either puerperal or neonatal period (outcome), and either short- or long-term prognoses (time interval); and Setting: primary care or hospital. The results were synthesized by reporting study characteristics and ROBs and by random effects modeling of the difference of the logit area under the receiver operating characteristic curve (AUROC) of each non-LR model compared with the LR model for the same pregnancy outcomes. We also reported between-study heterogeneity by using τ² and I².
Results Of the 2093 records, we included 142 studies for the systematic review and 62 studies for a meta-analysis. Most prediction models used LR (92/142, 64.8%); among non-LR algorithms, artificial neural networks were the most common (20/142, 14.1%). Only 16.9% (24/142) of studies had a low ROB. A total of 2 non-LR algorithms from low ROB studies significantly outperformed LR. The first algorithm was a random forest for preterm delivery (logit AUROC 2.51, 95% CI 1.49-3.53; I²=86%; τ²=0.77) and pre-eclampsia (logit AUROC 1.2, 95% CI 0.72-1.67; I²=75%; τ²=0.09). The second algorithm was gradient boosting for cesarean section (logit AUROC 2.26, 95% CI 1.39-3.13; I²=75%; τ²=0.43) and gestational diabetes (logit AUROC 1.03, 95% CI 0.69-1.37; I²=83%; τ²=0.07). Conclusions Prediction models with the best performances across studies were not necessarily those that used LR; random forest and gradient boosting models also performed well. We recommend a reanalysis of existing LR models for several pregnancy outcomes by comparing them with those algorithms that apply standard guidelines. Trial Registration PROSPERO (International Prospective Register of Systematic Reviews) CRD42019136106; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=136106
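The synthesis described in the Methods (random effects pooling of the difference in logit AUROC, with τ² and I² reported as heterogeneity measures) can be illustrated with a minimal Python sketch. The abstract does not name the estimator, so the DerSimonian-Laird method used here is an assumption, and the AUROC pairs and variances are hypothetical values chosen only for illustration.

```python
import math


def logit(p):
    """Log-odds transform, mapping (0, 1) onto the real line."""
    return math.log(p / (1.0 - p))


def dersimonian_laird(effects, variances):
    """Pool study effects with a DerSimonian-Laird random-effects model.

    This estimator is an assumption; the abstract only says "random effects".
    Returns (pooled effect, CI low, CI high, tau2, I2 in percent).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)                    # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]     # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, tau2, i2


# Hypothetical (non-LR AUROC, LR AUROC) pairs, one per study.
auroc_pairs = [(0.80, 0.75), (0.85, 0.78), (0.72, 0.70)]
# Hypothetical variances of each logit-AUROC difference.
diff_variances = [0.04, 0.05, 0.03]

diffs = [logit(a) - logit(b) for a, b in auroc_pairs]
pooled, low, high, tau2, i2 = dersimonian_laird(diffs, diff_variances)
print(f"pooled={pooled:.3f} CI=({low:.3f}, {high:.3f}) tau2={tau2:.3f} I2={i2:.1f}%")
```

A pooled difference whose 95% CI excludes 0 would indicate that the non-LR algorithm outperformed LR on the logit scale for that outcome.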

2019 ◽  
Author(s):  
Herdiantri Sufriyana ◽  
Atina Husnayain ◽  
Ya-Lin Chen ◽  
Chao-Yang Kuo ◽  
Onkar Singh ◽  
...  

BACKGROUND Predictions in pregnancy care are complex because of interactions among multiple factors. Hence, pregnancy outcomes are not easily predicted by a single predictor using only one algorithm or modeling method. OBJECTIVE This study aims to review and compare the predictive performances between logistic regression (LR) and other machine learning algorithms for developing or validating a multivariable prognostic prediction model for pregnancy care to inform clinicians’ decision making. METHODS Research articles from MEDLINE, Scopus, Web of Science, and Google Scholar were reviewed following several guidelines for a prognostic prediction study, including a risk of bias (ROB) assessment. We report the results based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Studies were primarily framed as PICOTS (population, index, comparator, outcomes, timing, and setting): Population: men or women in procreative management, pregnant women, and fetuses or newborns; Index: multivariable prognostic prediction models using non-LR algorithms for risk classification to inform clinicians’ decision making; Comparator: the models applying an LR; Outcomes: pregnancy-related outcomes of procreation or pregnancy outcomes for pregnant women and fetuses or newborns; Timing: pre-, inter-, and peripregnancy periods (predictors), at the pregnancy, delivery, and either puerperal or neonatal period (outcome), and either short- or long-term prognoses (time interval); and Setting: primary care or hospital. The results were synthesized by reporting study characteristics and ROBs and by random effects modeling of the difference of the logit area under the receiver operating characteristic curve (AUROC) of each non-LR model compared with the LR model for the same pregnancy outcomes. We also reported between-study heterogeneity by using τ² and I².
RESULTS Of the 2093 records, we included 142 studies for the systematic review and 62 studies for a meta-analysis. Most prediction models used LR (92/142, 64.8%); among non-LR algorithms, artificial neural networks were the most common (20/142, 14.1%). Only 16.9% (24/142) of studies had a low ROB. A total of 2 non-LR algorithms from low ROB studies significantly outperformed LR. The first algorithm was a random forest for preterm delivery (logit AUROC 2.51, 95% CI 1.49-3.53; I²=86%; τ²=0.77) and pre-eclampsia (logit AUROC 1.2, 95% CI 0.72-1.67; I²=75%; τ²=0.09). The second algorithm was gradient boosting for cesarean section (logit AUROC 2.26, 95% CI 1.39-3.13; I²=75%; τ²=0.43) and gestational diabetes (logit AUROC 1.03, 95% CI 0.69-1.37; I²=83%; τ²=0.07). CONCLUSIONS Prediction models with the best performances across studies were not necessarily those that used LR; random forest and gradient boosting models also performed well. We recommend a reanalysis of existing LR models for several pregnancy outcomes by comparing them with those algorithms that apply standard guidelines. CLINICAL TRIAL PROSPERO (International Prospective Register of Systematic Reviews) CRD42019136106; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=136106


2019 ◽  
Author(s):  
Sun Jae Moon ◽  
Jin Seub Hwang ◽  
Rajesh Kana ◽  
John Torous ◽  
Jung Won Kim

BACKGROUND In recent years, machine learning algorithms have been increasingly applied in biomedical fields. In particular, their application has been drawing more attention in psychiatry, for instance, as diagnostic tests/tools for autism spectrum disorder (ASD). However, given their complexity and potential clinical implications, there is an ongoing need for further research on their accuracy. OBJECTIVE The current study aims to summarize, through systematic review and meta-analysis, the evidence for the accuracy of machine learning algorithms in diagnosing ASD. METHODS The MEDLINE, Embase, CINAHL Complete (with OpenDissertations), PsycINFO, and IEEE Xplore Digital Library databases were searched on November 28, 2018. Studies that used a machine learning algorithm partially or fully to classify ASD from controls and that provided accuracy measures were included in our analysis. A bivariate random effects model was applied to the pooled data in the meta-analysis. Subgroup analysis was used to investigate and resolve the source of heterogeneity between studies. True-positive, false-positive, false-negative, and true-negative values from individual studies were used to calculate the pooled sensitivity and specificity values, draw summary receiver operating characteristic (SROC) curves, and obtain the area under the curve (AUC) and partial AUC (pAUC). RESULTS A total of 43 studies were included in the final analysis, of which 40 studies (53 samples with 12,128 participants) entered the meta-analysis. A structural MRI subgroup meta-analysis (12 samples with 1,776 participants) showed a sensitivity of 0.83 (95% CI 0.76 to 0.89), a specificity of 0.84 (95% CI 0.74 to 0.91), and an AUC/pAUC of 0.90/0.83. An fMRI/deep neural network (DNN) subgroup meta-analysis (five samples with 1,345 participants) showed a sensitivity of 0.69 (95% CI 0.62 to 0.75), a specificity of 0.66 (95% CI 0.61 to 0.70), and an AUC/pAUC of 0.71/0.67.
CONCLUSIONS Machine learning algorithms that used structural MRI features in the diagnosis of ASD were shown to have accuracy similar to that of currently used diagnostic tools.


2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Matthijs Blankers ◽  
Louk F. M. van der Post ◽  
Jack J. M. Dekker

Abstract Background Accurate prediction models for whether patients on the verge of a psychiatric crisis need hospitalization are lacking, and machine learning methods may help improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate the accuracy of ten machine learning algorithms, including the generalized linear model (GLM/logistic regression), to predict psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact. We also evaluate an ensemble model to optimize the accuracy and we explore individual predictors of hospitalization. Methods Data from 2084 patients included in the longitudinal Amsterdam Study of Acute Psychiatry with at least one reported psychiatric crisis care contact were included. The target variable for the prediction models was whether the patient was hospitalized in the 12 months following inclusion. The predictive power of 39 variables related to patients’ socio-demographics, clinical characteristics and previous mental health care contacts was evaluated. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared and we also estimated the relative importance of each predictor variable. The best and worst performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis and the five best performing algorithms were combined in an ensemble model using stacking. Results All models performed above chance level. We found Gradient Boosting to be the best performing algorithm (AUC = 0.774) and K-Nearest Neighbors to be the worst performing (AUC = 0.702). The performance of GLM/logistic regression (AUC = 0.76) was slightly above average among the tested algorithms. In a Net Reclassification Improvement analysis, Gradient Boosting outperformed GLM/logistic regression by 2.9% and K-Nearest Neighbors by 11.3%. GLM/logistic regression outperformed K-Nearest Neighbors by 8.7%.
Nine of the top-10 most important predictor variables were related to previous mental health care use. Conclusions Gradient Boosting led to the highest predictive accuracy and AUC, while the performance of GLM/logistic regression was about average among the tested algorithms. Although statistically significant, the magnitude of the differences between the machine learning algorithms was in most cases modest. The results show that a predictive accuracy similar to that of the best performing model can be achieved when combining multiple algorithms in an ensemble model.
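The net reclassification improvement (NRI) comparison mentioned above can be sketched in Python. The abstract does not state which NRI variant was used, so this is an illustrative two-category NRI at a single risk threshold; the function name, the 0.5 threshold, and the toy data are all assumptions.

```python
def net_reclassification_improvement(y_true, risk_old, risk_new, threshold=0.5):
    """Two-category NRI of a new model over an old one at one risk threshold.

    NRI = [P(up | event) - P(down | event)]
        + [P(down | nonevent) - P(up | nonevent)]
    where "up" means moving from below to at/above the threshold.
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_events = n_nonevents = 0
    for y, old, new in zip(y_true, risk_old, risk_new):
        old_high = old >= threshold
        new_high = new >= threshold
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if y == 1:
            n_events += 1
            up_event += moved_up
            down_event += moved_down
        else:
            n_nonevents += 1
            up_nonevent += moved_up
            down_nonevent += moved_down
    nri_events = (up_event - down_event) / n_events
    nri_nonevents = (down_nonevent - up_nonevent) / n_nonevents
    return nri_events + nri_nonevents


# Toy data (hypothetical): 1 = hospitalized, 0 = not hospitalized.
outcomes = [1, 1, 0, 0]
old_model_risk = [0.4, 0.6, 0.6, 0.4]
new_model_risk = [0.6, 0.6, 0.4, 0.4]
print(net_reclassification_improvement(outcomes, old_model_risk, new_model_risk))
```

A positive value means the new model moves more patients into the correct risk category than the old one does.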


2020 ◽  
Vol 25 ◽  
pp. 100446 ◽  
Author(s):  
Asma Khalil ◽  
Erkan Kalafat ◽  
Can Benlioglu ◽  
Pat O'Brien ◽  
Edward Morris ◽  
...  

2019 ◽  
Vol 8 (1) ◽  
Author(s):  
Shamil D. Cooray ◽  
Jacqueline A. Boyle ◽  
Georgia Soldatos ◽  
Lihini A. Wijeyaratne ◽  
Helena J. Teede

Abstract Background Gestational diabetes (GDM) is increasingly common and has significant implications during pregnancy and for the long-term health of the mother and offspring. However, it is a heterogeneous condition with inter-related factors including ethnicity, body mass index and gestational weight gain significantly modifying the absolute risk of complications at an individual level. Predicting the risk of pregnancy complications for an individual woman with GDM presents a useful adjunct to therapeutic decision-making and patient education. Diagnostic prediction models for GDM are prevalent. In contrast, prediction models for risk of complications in those with GDM are relatively novel. This study will systematically review published prognostic prediction models for pregnancy complications in women with GDM, describe their characteristics, compare performance and assess methodological quality and applicability. Methods Studies will be identified by searching MEDLINE and Embase electronic databases. Title and abstract screening, full-text review and data extraction will be completed independently by two reviewers. The included studies will be systematically assessed for risk of bias and applicability using appropriate tools designed for prediction modelling studies. Extracted data will be tabulated to facilitate qualitative comparison of published prediction models. Quantitative data on predictive performance of these models will be synthesised with meta-analyses if appropriate. Discussion This review will identify and summarise all published prognostic prediction models for pregnancy complications in women with GDM. We will compare model performance across different settings and populations with meta-analysis if appropriate. This work will guide subsequent phases in the prognosis research framework: further model development, external validation and model updating, and impact assessment. 
The ultimate model will estimate the absolute risk of pregnancy complications for women with GDM and will be implemented into routine care as an evidence-based GDM complication risk prediction model. It is anticipated to offer value to women and their clinicians with individualised risk assessment and may assist decision-making. Ultimately, this systematic review is an important step towards a personalised risk-stratified model-of-care for GDM to allow preventative and therapeutic interventions for the maximal benefit to women and their offspring, whilst sparing expense and harm for those at low risk. Systematic review registration PROSPERO registration number CRD42019115223


10.2196/14108 ◽  
2019 ◽  
Vol 6 (12) ◽  
pp. e14108 ◽  
Author(s):  
Sun Jae Moon ◽  
Jinseub Hwang ◽  
Rajesh Kana ◽  
John Torous ◽  
Jung Won Kim

Background In recent years, machine learning algorithms have been more widely and increasingly applied in biomedical fields. In particular, their application has been drawing more attention in the field of psychiatry, for instance, as diagnostic tests/tools for autism spectrum disorder (ASD). However, given their complexity and potential clinical implications, there is an ongoing need for further research on their accuracy. Objective This study aimed to perform a systematic review and meta-analysis to summarize the available evidence for the accuracy of machine learning algorithms in diagnosing ASD. Methods The following databases were searched on November 28, 2018: MEDLINE, EMBASE, CINAHL Complete (with Open Dissertations), PsycINFO, and Institute of Electrical and Electronics Engineers Xplore Digital Library. Studies that used a machine learning algorithm partially or fully for distinguishing individuals with ASD from control subjects and provided accuracy measures were included in our analysis. The bivariate random effects model was applied to the pooled data in a meta-analysis. A subgroup analysis was used to investigate and resolve the source of heterogeneity between studies. True-positive, false-positive, false-negative, and true-negative values from individual studies were used to calculate the pooled sensitivity and specificity values, draw Summary Receiver Operating Characteristics curves, and obtain the area under the curve (AUC) and partial AUC (pAUC). Results A total of 43 studies were included for the final analysis, of which a meta-analysis was performed on 40 studies (53 samples with 12,128 participants). A structural magnetic resonance imaging (sMRI) subgroup meta-analysis (12 samples with 1776 participants) showed a sensitivity of 0.83 (95% CI 0.76-0.89), a specificity of 0.84 (95% CI 0.74-0.91), and AUC/pAUC of 0.90/0.83.
A functional magnetic resonance imaging/deep neural network subgroup meta-analysis (5 samples with 1345 participants) showed a sensitivity of 0.69 (95% CI 0.62-0.75), specificity of 0.66 (95% CI 0.61-0.70), and AUC/pAUC of 0.71/0.67. Conclusions The accuracy of machine learning algorithms for diagnosis of ASD was considered acceptable by few accuracy measures only in cases of sMRI use; however, given the many limitations indicated in our study, further well-designed studies are warranted to extend the potential use of machine learning algorithms to clinical settings. Trial Registration PROSPERO CRD42018117779; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=117779
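To make concrete how true-positive, false-positive, false-negative, and true-negative counts feed into sensitivity and specificity, here is a minimal Python sketch. It computes per-study values and a naive pool obtained by summing counts; it is not the bivariate random effects model the authors applied, and the study counts are hypothetical.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from one study's 2x2 counts."""
    sensitivity = tp / (tp + fn)   # share of true cases correctly flagged
    specificity = tn / (tn + fp)   # share of controls correctly cleared
    return sensitivity, specificity


def pooled_by_summing(studies):
    """Naive pooling: sum the 2x2 counts over studies, then compute the rates.

    This ignores between-study heterogeneity and the correlation between
    sensitivity and specificity, which a bivariate model accounts for.
    """
    tp = sum(s[0] for s in studies)
    fp = sum(s[1] for s in studies)
    fn = sum(s[2] for s in studies)
    tn = sum(s[3] for s in studies)
    return sensitivity_specificity(tp, fp, fn, tn)


# Hypothetical studies as (TP, FP, FN, TN) tuples.
studies = [(80, 10, 20, 90), (40, 30, 10, 70)]
for counts in studies:
    print(sensitivity_specificity(*counts))
print(pooled_by_summing(studies))
```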


2021 ◽  
Author(s):  
Yaltafit Abror Jeem ◽  
Refa Nabila ◽  
Dwi Ditha Emelia ◽  
Lutfan Lazuardi ◽  
Hari Kusnanto Josef

Abstract Background One strategy to address the increasing prevalence of type 2 diabetes mellitus (T2DM) is to identify prediabetes patients and administer interventions to them. Risk assessment tools help detect disease by allowing screening of the high-risk group. Machine learning is also used to help diagnose and identify prediabetes. This review aims to determine the diagnostic test accuracy of various machine learning algorithms for calculating prediabetes risk. Methods This protocol was written in compliance with the Preferred Reporting Items for Systematic Review and Meta-Analysis for Protocols (PRISMA-P) statement. The databases that will be used include PubMed, ProQuest and EBSCO, restricted to articles published between January 1999 and May 2019 in the English language only. Identification of articles will be done independently by two reviewers through the titles, the abstracts, and then the full-text articles. Any disagreement will be resolved by consensus. The Newcastle-Ottawa Quality Assessment Scale will be used to measure quality and potential for bias. Data extraction and content analysis will be performed systematically. Quantitative data will be visualized using a forest plot with 95% confidence intervals. The diagnostic test outcome will be described by the summary receiver operating characteristic curve. Data will be analyzed using the Review Manager 5.3 (RevMan 5.3) software package. Discussion We will obtain the diagnostic accuracy of various machine learning algorithms for prediabetes risk estimation using this proposed systematic review and meta-analysis. Systematic review registration: This protocol has been registered in the Prospective Registry of Systematic Review (PROSPERO) database. The registration number is CRD42021251242.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Susan Idicula-Thomas ◽  
Ulka Gawde ◽  
Prabhat Jha

Abstract Background Machine learning (ML) algorithms have been successfully employed for prediction of outcomes in clinical research. In this study, we have explored the application of ML-based algorithms to predict cause of death (CoD) from verbal autopsy records available through the Million Death Study (MDS). Methods From MDS, 18826 unique childhood deaths at ages 1–59 months during the time period 2004–13 were selected for generating the prediction models, of which over 70% were caused by six infectious diseases (pneumonia, diarrhoeal diseases, malaria, fever of unknown origin, meningitis/encephalitis, and measles). Six popular ML-based algorithms (support vector machine [SVM], gradient boosting modeling, C5.0, artificial neural network, k-nearest neighbor, and classification and regression tree) were used for building the CoD prediction models. Results The SVM algorithm was the best performer, with a prediction accuracy of over 0.8. The highest accuracy was found for diarrhoeal diseases (accuracy = 0.97) and the lowest was for meningitis/encephalitis (accuracy = 0.80). The top signs/symptoms for classification of these CoDs were also extracted for each of the diseases. A combination of signs/symptoms presented by the deceased individual can effectively lead to the CoD diagnosis. Conclusions Overall, this study affirms that verbal autopsy tools are efficient in CoD diagnosis and that automated classification parameters captured through ML could be added to verbal autopsies to improve classification of causes of death.


PLoS Medicine ◽  
2021 ◽  
Vol 18 (11) ◽  
pp. e1003856
Author(s):  
Sophie Relph ◽  
Trusha Patel ◽  
Louisa Delaney ◽  
Soha Sobhy ◽  
Shakila Thangaratinam

Background The rise in the global prevalence of diabetes, particularly among younger people, has led to an increase in the number of pregnant women with preexisting diabetes, many of whom have diabetes-related microvascular complications. We aimed to estimate the magnitude of the risks of adverse pregnancy outcomes or disease progression in this population. Methods and findings We undertook a systematic review and meta-analysis on maternal and perinatal complications in women with type 1 or 2 diabetic microvascular disease and the risk factors for worsening of microvascular disease in pregnancy using a prospective protocol (PROSPERO CRD42017076647). We searched major databases (January 1990 to July 2021) for relevant cohort studies. Study quality was assessed using the Newcastle–Ottawa Scale. We summarized the findings as odds ratios (ORs) with 95% confidence intervals (CIs) using random effects meta-analysis. We included 56 cohort studies involving 12,819 pregnant women with diabetes; 40 from Europe and 9 from North America. Pregnant women with diabetic nephropathy were at greater risk of preeclampsia (OR 10.76, CI 6.43 to 17.99, p < 0.001), early (<34 weeks) (OR 6.90, 95% CI 3.38 to 14.06, p < 0.001) and any preterm birth (OR 4.48, CI 3.40 to 5.92, p < 0.001), and cesarean section (OR 3.04, CI 1.24 to 7.47, p = 0.015); their babies were at increased risk of perinatal death (OR 2.26, CI 1.07 to 4.75, p = 0.032), congenital abnormality (OR 2.71, CI 1.58 to 4.66, p < 0.001), small for gestational age (OR 16.89, CI 7.07 to 40.37, p < 0.001), and admission to neonatal unit (OR 2.59, CI 1.72 to 3.90, p < 0.001) than those without nephropathy. Diabetic retinopathy was associated with any preterm birth (OR 1.67, CI 1.27 to 2.20, p < 0.001) and preeclampsia (OR 2.20, CI 1.57 to 3.10, p < 0.001) but not other complications. 
The risks of onset or worsening of retinopathy were increased in women who were nulliparous (OR 1.75, 95% CI 1.28 to 2.40, p < 0.001), smokers (OR 2.31, 95% CI 1.25 to 4.27, p = 0.008), with existing proliferative disease (OR 2.12, 95% CI 1.11 to 4.04, p = 0.022), and longer duration of diabetes (weighted mean difference: 4.51 years, 95% CI 2.26 to 6.76, p < 0.001) than those without the risk factors. The main limitations of this analysis are the heterogeneity of definition of retinopathy and nephropathy and the inclusion of women both with type 1 and type 2 diabetes. Conclusions In pregnant women with diabetes, presence of nephropathy and/or retinopathy appears to further increase the risks of maternal complications.
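The findings above are reported as odds ratios (ORs) with 95% confidence intervals. As a minimal illustration of how one such estimate arises from a single study's 2x2 table, the Python sketch below uses the standard log-OR (Woolf) interval. The counts are hypothetical, and the sketch does not reproduce the random effects pooling across studies that the authors performed.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All four cells must be non-zero for this formula to apply.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


# Hypothetical counts: outcome among women with vs without the risk factor.
or_, low, high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR={or_:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level.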

