Performance Analysis of Conventional Machine Learning Algorithms for Identification of Chronic Kidney Disease in Type 1 Diabetes Mellitus Patients

Diagnostics ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. 2267
Author(s):  
Nakib Hayat Chowdhury ◽  
Mamun Bin Ibne Reaz ◽  
Fahmida Haque ◽  
Shamim Ahmad ◽  
Sawal Hamid Md Ali ◽  
...  

Chronic kidney disease (CKD) is one of the severe complications of type 1 diabetes mellitus (T1DM). However, the detection and diagnosis of CKD are often delayed because of its asymptomatic nature. In addition, patients often tend to bypass the traditional urine protein (urinary albumin)-based CKD detection test. Even though disease detection using machine learning (ML) is a well-established field of study, it is rarely used to diagnose CKD in T1DM patients. This research aimed to employ and evaluate several ML algorithms to develop models that quickly predict CKD in patients with T1DM using easily available routine checkup data. This study analyzed 16 years of data from 1375 T1DM patients, obtained from the Epidemiology of Diabetes Interventions and Complications (EDIC) clinical trials directed by the National Institute of Diabetes, Digestive, and Kidney Diseases, USA. Three data imputation techniques (RF, KNN, and MICE) and the SMOTETomek resampling technique were used to preprocess the primary dataset. Ten ML algorithms, including logistic regression (LR), k-nearest neighbor (KNN), Gaussian naïve Bayes (GNB), support vector machine (SVM), stochastic gradient descent (SGD), decision tree (DT), gradient boosting (GB), random forest (RF), extreme gradient boosting (XGB), and light gradient-boosted machine (LightGBM), were applied to develop prediction models. Each model included 19 demographic, medical history, behavioral, and biochemical features, and every feature’s effect was ranked using three feature ranking techniques (XGB, RF, and Extra Tree). Lastly, each model’s ROC, sensitivity (recall), specificity, accuracy, precision, and F1 score were estimated to find the best-performing model. The RF classifier model exhibited the best performance with 0.96 (±0.01) accuracy, 0.98 (±0.01) sensitivity, and 0.93 (±0.02) specificity. LightGBM performed second best and was quite close to RF with 0.95 (±0.06) accuracy. In addition to these two models, the KNN, SVM, DT, GB, and XGB models also achieved more than 90% accuracy.
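
The evaluation metrics named in this abstract all follow from the four cells of a binary confusion matrix. The sketch below is illustrative only: the function name `classification_metrics` and the counts are hypothetical and are not taken from the study or its dataset.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fn: CKD cases predicted correctly/incorrectly.
    tn/fp: non-CKD cases predicted correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)              # recall: share of true cases found
    specificity = tn / (tn + fp)              # share of non-cases correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)                # share of positive calls that are right
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters for screening tasks such as this one, because accuracy alone can hide a model that misses most true cases when the classes are imbalanced.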

2021 ◽  
Vol 11 (4) ◽  
pp. 1742
Author(s):  
Ignacio Rodríguez-Rodríguez ◽  
José-Víctor Rodríguez ◽  
Wai Lok Woo ◽  
Bo Wei ◽  
Domingo-Javier Pardo-Quiles

Type 1 diabetes mellitus (DM1) is a metabolic disease caused by a fall in pancreatic insulin production, resulting in chronic hyperglycemia. DM1 subjects usually have to assess their blood glucose levels several times a day, using capillary glucometers to monitor blood glucose dynamics. In recent years, advances in technology have allowed for the creation of revolutionary biosensors and continuous glucose monitoring (CGM) techniques, which enable a subject’s blood glucose level to be monitored in real time. However, few attempts have been made to apply machine learning techniques to predicting glycaemia levels, and dealing with a database containing such a large number of variables is problematic. To the best of the authors’ knowledge, proper feature selection (FS), the stage before applying predictive algorithms, has not been discussed and compared in depth in past research on forecasting glycaemia. Therefore, to assess how a proper FS stage could improve the accuracy of the forecasted glycaemia, this work developed six FS techniques alongside four predictive algorithms and applied them to a full dataset of biomedical features related to glycaemia. These features were collected through a wide-ranging passive monitoring process involving 25 patients with DM1 in practical real-life scenarios. The results indicate that Random Forest (RF), used as both predictive algorithm and FS strategy, offers the best average performance (root mean square error, RMSE = 18.54 mg/dL) across the 12 predictive horizons considered (up to 60 min in steps of 5 min), while Support Vector Machines (SVM) have the best accuracy as a forecasting algorithm when averaged over the six FS techniques applied (RMSE = 20.58 mg/dL).
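
As a minimal sketch of the error measure quoted above, the following computes the RMSE between measured and forecast glucose values and lists the 12 prediction horizons (5 to 60 min in 5-min steps). The helper name `rmse` and the glucose readings are hypothetical illustrations, not data from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences (same units)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# The 12 prediction horizons described in the abstract, in minutes.
horizons_min = list(range(5, 65, 5))

# Hypothetical glucose values in mg/dL for a single horizon.
actual = [100.0, 110.0, 120.0, 130.0]
predicted = [103.0, 106.0, 120.0, 135.0]
error = rmse(actual, predicted)
print(f"RMSE: {error:.2f} mg/dL over {len(horizons_min)} horizons defined")
```

An average figure such as the one reported in the abstract would be obtained by computing this value once per horizon and taking the mean of the results.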


Diabetologia ◽  
2017 ◽  
Vol 60 (6) ◽  
pp. 1102-1113 ◽  
Author(s):  
Giuseppe Penno ◽  
Eleonora Russo ◽  
Monia Garofolo ◽  
Giuseppe Daniele ◽  
Daniela Lucchesi ◽  
...  

2015 ◽  
Vol 87 (10) ◽  
pp. 54 ◽  
Author(s):  
M. S. Arutyunova ◽  
A. M. Glazunova ◽  
O. V. Mikhaleva ◽  
Z. T. Zuraeva ◽  
S. A. Martynov ◽  
...  

2022 ◽  
pp. ASN.2021040538
Author(s):  
Arthur M. Lee ◽  
Jian Hu ◽  
Yunwen Xu ◽  
Alison G. Abraham ◽  
Rui Xiao ◽  
...  

Background: Untargeted plasma metabolomic profiling combined with machine learning (ML) may lead to discovery of metabolic profiles that inform our understanding of pediatric CKD causes. We sought to identify metabolomic signatures in pediatric CKD based on diagnosis: FSGS, obstructive uropathy (OU), aplasia/dysplasia/hypoplasia (A/D/H), and reflux nephropathy (RN). Methods: Untargeted metabolomic quantification (GC-MS/LC-MS, Metabolon) was performed on plasma from 702 Chronic Kidney Disease in Children study participants (n: FSGS=63, OU=122, A/D/H=109, and RN=86). Lasso regression was used for feature selection, adjusting for clinical covariates. Four methods were then applied to stratify significance: logistic regression, support vector machine, random forest, and extreme gradient boosting. ML training was performed on 80% total cohort subsets and validated on 20% holdout subsets. Important features were selected based on being significant in at least two of the four modeling approaches. We additionally performed pathway enrichment analysis to identify metabolic subpathways associated with CKD cause. Results: ML models were evaluated on holdout subsets with receiver-operator and precision-recall area-under-the-curve, F1 score, and Matthews correlation coefficient. ML models outperformed no-skill prediction. Metabolomic profiles were identified based on cause. FSGS was associated with the sphingomyelin-ceramide axis. FSGS was also associated with individual plasmalogen metabolites and the subpathway. OU was associated with gut microbiome–derived histidine metabolites. Conclusion: ML models identified metabolomic signatures based on CKD cause. Using ML techniques in conjunction with traditional biostatistics, we demonstrated that sphingomyelin-ceramide and plasmalogen dysmetabolism are associated with FSGS and that gut microbiome–derived histidine metabolites are associated with OU.
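
The rule of keeping features flagged by at least two of the four modeling approaches can be sketched as a simple vote count. This is an assumption about how such a rule could be coded, not the authors' implementation; the function `consensus_features`, the model keys, and the metabolite names are all hypothetical placeholders.

```python
from collections import Counter


def consensus_features(selected_by_model, min_models=2):
    """Return features flagged as significant by at least `min_models` models."""
    # set() ensures a feature counts once per model even if listed twice.
    votes = Counter(
        feature
        for features in selected_by_model.values()
        for feature in set(features)
    )
    return sorted(f for f, count in votes.items() if count >= min_models)


# Hypothetical per-model selections (placeholder metabolite names).
selected_by_model = {
    "logistic_regression": ["metabolite_a", "metabolite_b"],
    "svm": ["metabolite_b", "metabolite_c"],
    "random_forest": ["metabolite_a", "metabolite_b", "metabolite_d"],
    "xgboost": ["metabolite_e"],
}
important = consensus_features(selected_by_model)
print(important)
```

Requiring agreement between models with different inductive biases is one way to reduce the chance that a feature is retained because of a quirk of a single algorithm.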


2021 ◽  
Author(s):  
Jian Lin ◽  
Yuanhua Lu ◽  
Bizhou Wang ◽  
Ping Jiao ◽  
Jie Ma

Background: Type 1 diabetes mellitus (T1DM) is a chronic autoimmune disease caused by severe loss of pancreatic β cells. Immune cells are key mediators of β cell destruction. This study investigated the role of immune cells and immune-related genes in the occurrence and development of T1DM. Methods: Raw gene expression profiles of samples from 12 T1DM patients and 10 normal controls were obtained from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were identified with the limma package in R. The least absolute shrinkage and selection operator (LASSO) and support vector machines (SVM) were used to screen the hub genes. The CIBERSORT algorithm was used to identify immune cells whose distribution differed between T1DM and normal samples. Correlations between the hub genes and immune cells were analyzed with the Spearman method, and gene-GO-BP and gene-pathway interaction networks were constructed with the Cytoscape plug-in ClueGO. Receiver operating characteristic (ROC) curves were used to assess the diagnostic value of genes in T1DM. Results: Fifty immune-related DEGs were identified between the T1DM and normal samples and were further screened to obtain 5 hub genes. CIBERSORT analysis revealed that the distribution of plasma cells, resting mast cells, resting NK cells, and neutrophils differed significantly between T1DM and normal samples. Natural cytotoxicity triggering receptor 3 (NCR3) was significantly related to activated NK cells, M0 macrophages, monocytes, resting NK cells, and resting memory CD4+ T cells. Moreover, tumor necrosis factor (TNF) was significantly associated with naive B cells and naive CD4+ T cells. NCR3 [area under the curve (AUC) = 0.918] had higher diagnostic accuracy than TNF (AUC = 0.763) for T1DM. Conclusions: The immune-related genes (NCR3 and TNF) and immune cells (NK cells) may play a vital regulatory role in the occurrence and development of T1DM and may provide new ideas and potential targets for the immunotherapy of diabetes mellitus (DM).
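
The AUC values used above to compare single-gene classifiers can be read as the probability that a randomly chosen case scores higher than a randomly chosen control. A minimal sketch of that pairwise definition follows; the function `roc_auc` and the scores are hypothetical, and the sketch assumes that higher scores indicate disease.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties count as half a correct ranking. Assumes higher score = more likely case.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    correct = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                correct += 1.0
            elif case == control:
                correct += 0.5
    return correct / (len(case_scores) * len(control_scores))


# Hypothetical scores (e.g., normalized expression of one marker gene).
case_scores = [0.9, 0.8, 0.4]
control_scores = [0.7, 0.3, 0.2]
auc = roc_auc(case_scores, control_scores)
print(f"AUC: {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why a marker with AUC = 0.918 is described as more accurate than one with AUC = 0.763.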

