Risk Prediction of Dyslipidemia for Chinese Han Adults Using Random Forest Survival Model

2019 ◽  
Vol 11 ◽  
pp. 1047-1055 ◽  
Author(s):  
Xiaoshuai Zhang ◽  
Fang Tang ◽  
Jiadong Ji ◽  
Wenting Han ◽  
Peng Lu
2018 ◽  
Vol 33 (4) ◽  
pp. 487-491 ◽  
Author(s):  
Bing Zhao ◽  
Miaomiao Zhang ◽  
Feng Lin ◽  
Jing Xie ◽  
Yan Liang ◽  
...  

Objective: The aim of this study was to establish the reference interval for serum pro-gastrin-releasing peptide (proGRP), determined by electrochemiluminescence immunoassay (ECLIA), in healthy Chinese Han adults. Methods: After screening, 9932 healthy Chinese Han adults (age range 18–95 years) were enrolled, including 6220 men and 3712 women. Serum proGRP levels were measured by ECLIA. The reference interval was defined as the non-parametric 95th percentile interval. Results: Serum proGRP levels followed a non-Gaussian distribution. The reference interval for healthy Chinese Han adults calculated by the non-parametric method was 0–73.90 ng/mL. Because serum proGRP levels were significantly correlated with age (r=0.226, P<0.001), the participants were divided into six age groups: 18–39, 40–49, 50–59, 60–69, 70–79, and ⩾80 years. No significant difference in serum proGRP levels was found between the sexes in any of the six age groups. The upper reference limits increased gradually with age (65.35 ng/mL, 68.65 ng/mL, 74.10 ng/mL, 77.65 ng/mL, 84.57 ng/mL, and 98.03 ng/mL in the 18–39, 40–49, 50–59, 60–69, 70–79, and ⩾80 year groups, respectively). Conclusions: We established the reference interval for serum proGRP, determined by ECLIA, in the healthy Chinese Han population. Furthermore, our study suggests that age-specific reference intervals for serum proGRP are necessary.
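As a rough illustration of the non-parametric approach described in this abstract, the sketch below computes an upper reference limit as the 95th percentile of each age group. It is not the authors' procedure: the abstract does not state the percentile convention, so linear interpolation between order statistics is an assumption, and the function names `percentile` and `upper_reference_limits` are hypothetical.

```python
from typing import Dict, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Return the q-th percentile (0 <= q <= 100) of values.

    Assumption: linear interpolation between the two nearest order
    statistics; the abstract does not say which convention was used.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= q <= 100:
        raise ValueError("q must be in [0, 100]")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def upper_reference_limits(
    levels_by_group: Dict[str, Sequence[float]], q: float = 95.0
) -> Dict[str, float]:
    """Compute one upper reference limit per group (e.g. per age band)."""
    return {group: percentile(levels, q) for group, levels in levels_by_group.items()}
```

Because the method uses only the ranks of the observations, it makes no assumption about the shape of the distribution, which is why it suits non-Gaussian data such as the proGRP levels reported here.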


PLoS ONE ◽  
2012 ◽  
Vol 7 (7) ◽  
pp. e39726 ◽  
Author(s):  
Jiapeng Lu ◽  
Yuqing Huang ◽  
Youxin Wang ◽  
Yan Li ◽  
Yujun Zhang ◽  
...  

2021 ◽  
Author(s):  
Ilkin Bayramli ◽  
Victor Castro ◽  
Yuval Barak-Corren ◽  
Emily Madsen ◽  
Matthew Nock ◽  
...  

Background. Suicide is one of the leading causes of death worldwide, yet clinicians find it difficult to reliably identify individuals at high risk for suicide. Algorithmic approaches for suicide risk detection have been developed in recent years, mostly based on data from electronic health records (EHRs). These models typically do not optimally exploit the valuable temporal information inherent in these longitudinal data. Methods. We propose a temporally enhanced variant of the Random Forest model - Omni-Temporal Balanced Random Forests (OT-BRFs) - that incorporates temporal information in every tree within the forest. We develop and validate this model using longitudinal EHRs and clinician notes from the Mass General Brigham Health System recorded between 1998 and 2018, and compare its performance to a baseline Naive Bayes classifier and two standard versions of Balanced Random Forests. Results. Temporal variables were found to be associated with suicide risk. RF models were more accurate than Naive Bayes classifiers at predicting suicide risk in advance (AUC=0.824 vs. 0.754, respectively). The OT-BRF model performed best among all RF approaches (0.339 sensitivity at 95% specificity), compared to 0.290 and 0.286 for the other two RF models. Temporal variables were assigned high importance by the models that incorporated them. Discussion. We demonstrate that temporal variables have an important role to play in suicide risk detection, and that requiring their inclusion in all random forest trees leads to increased predictive performance. Integrating temporal information into risk prediction models helps the models interpret patient data in temporal context, improving predictive performance.
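The headline metric in this abstract, sensitivity at 95% specificity, can be sketched as follows. This is a minimal illustration of the metric only, not the authors' evaluation code: the function name `sensitivity_at_specificity`, the label coding (1 for cases, 0 for non-cases), and the "score strictly above threshold means positive" rule are all assumptions.

```python
import math
from typing import Sequence


def sensitivity_at_specificity(
    scores: Sequence[float],
    labels: Sequence[int],
    target_specificity: float = 0.95,
) -> float:
    """Sensitivity at the lowest threshold that reaches the target specificity.

    Assumptions: labels are 1 (case) or 0 (non-case), and a subject is
    flagged positive when its score is strictly above the threshold.
    """
    negatives = sorted(s for s, y in zip(scores, labels) if y == 0)
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one positive and one negative example")

    # Number of negatives that must fall at or below the threshold.
    # The small epsilon guards against floating-point noise in the product.
    k = math.ceil(target_specificity * len(negatives) - 1e-9)
    threshold = negatives[k - 1] if k > 0 else float("-inf")

    true_positives = sum(1 for s in positives if s > threshold)
    return true_positives / len(positives)
```

Fixing specificity at a high value reflects the clinical setting: with a rare outcome, even a small false-positive rate produces many false alarms, so models are compared on how many true cases they catch at a tolerable alarm rate.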


Author(s):  
Nahúm Cueto López ◽  
María Teresa García-Ordás ◽  
Facundo Vitelli-Storelli ◽  
Pablo Fernández-Navarro ◽  
Camilo Palazuelos ◽  
...  

This study evaluates several feature ranking techniques together with some machine learning classifiers to identify relevant factors regarding the probability of contracting breast cancer and to improve the performance of risk prediction models for breast cancer in a healthy population. The dataset, with 919 cases and 946 controls, comes from the MCC-Spain study and includes only environmental and genetic features. Breast cancer is a major public health problem. Our aim is to analyze which factors in the cancer risk prediction model are the most important for breast cancer prediction. Likewise, quantifying the stability of feature selection methods becomes essential before trying to gain insight into the data. This paper assesses several feature selection algorithms in terms of performance for a set of predictive models. Furthermore, their robustness is quantified to analyze both the similarity between the feature selection rankings and their own stability. The ranking provided by the SVM-RFE approach leads to the best performance in terms of the area under the ROC curve (AUC). The top 47 ranked features obtained with this approach, fed to the Logistic Regression classifier, achieve an AUC of 0.616. This is an improvement of 5.8% in comparison with the full feature set. Furthermore, the SVM-RFE ranking technique turned out to be highly stable (as did Random Forest), whereas Relief and the wrapper approaches are quite unstable. This study demonstrates that the stability and performance of the model should be studied together: Random Forest and SVM-RFE turned out to be the most stable algorithms, but in terms of model performance SVM-RFE outperforms Random Forest.
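One common way to quantify the stability of a feature ranking method is to rerun it on several resamples of the data and measure how much the top-ranked feature sets overlap. The sketch below uses the mean pairwise Jaccard index of the top-k sets. The abstract does not name the stability measure the authors used, so this choice, along with the function names `jaccard` and `top_k_stability`, is an assumption for illustration.

```python
from itertools import combinations
from typing import Hashable, Sequence


def jaccard(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Jaccard index of two feature collections: |A and B| / |A or B|."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty selections are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def top_k_stability(rankings: Sequence[Sequence[Hashable]], k: int) -> float:
    """Mean pairwise Jaccard index of the top-k features across rankings.

    Each ranking lists feature names from most to least important, e.g. one
    ranking per bootstrap resample. Returns 1.0 when every resample selects
    the same top-k set and 0.0 when no two resamples share a feature.
    """
    if len(rankings) < 2:
        raise ValueError("need at least two rankings to compare")
    if k < 1:
        raise ValueError("k must be at least 1")
    tops = [ranking[:k] for ranking in rankings]
    pairs = list(combinations(tops, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A method can score well on AUC while selecting very different features on each resample, which is why the abstract argues that stability and performance should be reported together.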

