Development and validation of a candidemia risk prediction (CanDETEC) model among patients with malignancy (Preprint)
BACKGROUND Appropriate empirical treatment for candidemia is associated with reduced mortality. However, timely diagnosis of candidemia in septic patients remains poor.
OBJECTIVE This study aimed to use machine learning algorithms to develop and validate a candidemia prediction model for cancer patients.
METHODS This single-center retrospective study used the cancer registry of a tertiary academic hospital. Adult patients with malignancies diagnosed from January 2010 to December 2018 were included. The outcome to be predicted was a candidemia event. A stratified under-sampling method was used to extract control groups for algorithm learning. Multiple models were developed by combining four variable groups with five algorithms (automated machine learning, deep neural network, gradient boosting, logistic regression, and random forest). The model with the highest area under the receiver operating characteristic curve (AUROC) was selected as the Candida species detection (CanDETEC) model, and its performance indexes were compared with those of the Candida score.
RESULTS Among 273,380 blood cultures from 186,404 registered cancer patients, 501 candidemia events and 2000 controls were identified. The AUROC of the developed models ranged from 0.771 to 0.889. The random forest model was selected as the CanDETEC model (AUROC = 0.889, 95% confidence interval: 0.888-0.889). It performed better than the Candida score (AUROC = 0.677).
CONCLUSIONS The CanDETEC model could predict candidemia in cancer patients with high discriminative power. This algorithm could be used for the timely diagnosis and appropriate empirical treatment of candidemia.
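The METHODS name two steps that are easy to illustrate: stratified under-sampling of controls and model comparison by AUROC. The following is a minimal sketch of both, not the study's implementation. The function names, the stratification key (a hypothetical calendar year), and the toy data are all assumptions, since the abstract does not state how strata were defined or how AUROC was computed. AUROC is computed here through its rank interpretation: the probability that a randomly chosen case scores higher than a randomly chosen control.

```python
import random
from collections import defaultdict


def stratified_undersample(controls, stratum_of, n_total, seed=0):
    """Draw about n_total controls, keeping each stratum's share of the pool.

    `stratum_of` is a hypothetical key function; the abstract does not say
    which variable the study stratified on.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for row in controls:
        by_stratum[stratum_of(row)].append(row)
    sampled = []
    for rows in by_stratum.values():
        # Proportional allocation; rounding may shift the total by a few rows.
        k = min(len(rows), round(n_total * len(rows) / len(controls)))
        sampled.extend(rng.sample(rows, k))
    return sampled


def auroc(labels, scores):
    """AUROC as P(case score > control score); tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): 300 candidate controls spread evenly over three years.
controls = [{"id": i, "year": 2010 + i % 3} for i in range(300)]
sample = stratified_undersample(controls, lambda r: r["year"], n_total=30)
print(len(sample))  # 30 (10 from each year)

# Two cases and two controls; three of the four case/control pairs are ranked correctly.
print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

The pairwise AUROC loop is quadratic in the number of samples, which is fine for a data set of this size (501 events and 2000 controls) but would usually be replaced by a sort-based computation or a library routine on larger data.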