Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring

Thomas A. Gerds; Michael W. Kattan; Martin Schumacher; Changhong Yu

doi:10.1002/sim.5681

Deep Learning-Based Survival Analysis for High-Dimensional Survival Data

Mathematics ◽

10.3390/math9111244 ◽

2021 ◽

Vol 9 (11) ◽

pp. 1244

Author(s):

Lin Hao ◽

Juncheol Kim ◽

Sookhee Kwon ◽

Il Do Ha

Keyword(s):

Survival Data ◽

Prediction Models ◽

Prediction Performance ◽

Time Dependent ◽

Tuning Parameter ◽

High Dimensional ◽

Brier Score ◽

Survival Prediction ◽

Optimal Setting ◽

Selection Of

With the development of high-throughput technologies, more and more high-dimensional or ultra-high-dimensional genomic data are being generated. Therefore, effectively analyzing such data has become a significant challenge. Machine learning (ML) algorithms have been widely applied for modeling nonlinear and complicated interactions in a variety of practical fields such as high-dimensional survival data. Recently, multilayer deep neural network (DNN) models have made remarkable achievements. Thus, a Cox-based DNN prediction survival model (DNNSurv model), which was built with Keras and TensorFlow, was developed. However, its results were only evaluated on the survival datasets with high-dimensional or large sample sizes. In this paper, we evaluated the prediction performance of the DNNSurv model using ultra-high-dimensional and high-dimensional survival datasets and compared it with three popular ML survival prediction models (i.e., random survival forest and the Cox-based LASSO and Ridge models). For this purpose, we also present the optimal setting of several hyperparameters, including the selection of a tuning parameter. The proposed method demonstrated via data analysis that the DNNSurv model performed well overall as compared with the ML models, in terms of the three main evaluation measures (i.e., concordance index, time-dependent Brier score, and the time-dependent AUC) for survival prediction performance.

Deep Learning-based Survival Analysis for High-dimensional Survival Data

10.20944/preprints202104.0529.v1 ◽

2021 ◽

Author(s):

Il Do Ha ◽

Lin Hao ◽

Juncheol Kim ◽

Sookhee Kwon

Keyword(s):

Survival Data ◽

Prediction Models ◽

Prediction Performance ◽

Time Dependent ◽

Tuning Parameter ◽

High Dimensional ◽

Brier Score ◽

Survival Prediction ◽

Optimal Setting ◽

Selection Of

With the development of high-throughput technologies, more and more high-dimensional or ultra-high-dimensional genomic data are being generated. Therefore, analyzing such data effectively has become a challenge. Machine learning (ML) algorithms have been widely applied for modelling nonlinear and complicated interactions in a variety of practical fields such as high-dimensional survival data. Recently, multilayer deep neural network (DNN) models have made remarkable achievements. Thus, a Cox-based DNN prediction survival model (DNNSurv model), which was built with Keras and TensorFlow, was developed. However, its results were only evaluated on survival datasets with high-dimensional or large sample sizes. In this paper, we evaluate the prediction performance of the DNNSurv model using ultra-high-dimensional and high-dimensional survival datasets, and compare it with three popular ML survival prediction models (i.e., random survival forest and Cox-based LASSO and Ridge models). For this purpose, we also present the optimal setting of several hyperparameters, including the selection of a tuning parameter. The data analysis demonstrates that the DNNSurv model performs well overall compared with the ML models in terms of three main evaluation measures (i.e., concordance index, time-dependent Brier score, and time-dependent AUC) for survival prediction performance.

Faculty Opinions recommendation of Assessment of survival prediction models based on microarray data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1088585.542666 ◽

2007 ◽

Author(s):

Ewout Steyerberg

Keyword(s):

Microarray Data ◽

Prediction Models ◽

Survival Prediction

U-survival for prognostic prediction of disease progression and mortality of patients with COVID-19

Scientific Reports ◽

10.1038/s41598-021-88591-z ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Janne J. Näppi ◽

Tomoki Uemura ◽

Chinatsu Watari ◽

Toru Hironaka ◽

Tohru Kamiya ◽

...

Keyword(s):

Risk Groups ◽

Healthcare Services ◽

Chest CT ◽

Prediction Performance ◽

Survival Prediction ◽

Concordance Index ◽

Survival Curves ◽

Analysis Methodology ◽

Kaplan Meier ◽

Prognostic Prediction

Abstract: The rapid increase of patients with coronavirus disease 2019 (COVID-19) has introduced major challenges to healthcare services worldwide. Therefore, fast and accurate clinical assessment of COVID-19 progression and mortality is vital for the management of COVID-19 patients. We developed an automated image-based survival prediction model, called U-survival, which combines deep learning of chest CT images with the established survival analysis methodology of an elastic-net Cox survival model. In an evaluation of 383 COVID-19 positive patients from two hospitals, the prognostic bootstrap prediction performance of U-survival was significantly higher (P < 0.0001) than those of existing laboratory and image-based reference predictors both for COVID-19 progression (maximum concordance index: 91.6% [95% confidence interval 91.5, 91.7]) and for mortality (88.7% [88.6, 88.9]), and the separation between the Kaplan–Meier survival curves of patients stratified into low- and high-risk groups was largest for U-survival (P < 3 × 10^-14). The results indicate that U-survival can be used to provide automated and objective prognostic predictions for the management of COVID-19 patients.

Association of chemotherapy with survival in stage II colon cancer patients who received radical surgery: a retrospective cohort study

BMC Cancer ◽

10.1186/s12885-021-08057-3 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Zhihao Lv ◽

Yuqi Liang ◽

Huaxi Liu ◽

Delong Mo

Keyword(s):

Colon Cancer ◽

Overall Survival ◽

Cancer Patients ◽

Prediction Models ◽

Radical Surgery ◽

Survival Rates ◽

Stage II ◽

Survival Prediction ◽

Stage II Colon Cancer ◽

Overall Survival Rates

Abstract Background: It remains controversial whether patients with stage II colon cancer would benefit from chemotherapy after radical surgery. This study aims to assess the real effectiveness of chemotherapy in patients with stage II colon cancer undergoing radical surgery and to construct survival prediction models to predict the survival benefits of chemotherapy. Methods: Data for stage II colon cancer patients with radical surgery were retrieved from the Surveillance, Epidemiology, and End Results (SEER) database. Propensity score matching (1:1) was performed according to whether or not patients received chemotherapy. Competing risk regression models were used to assess colon cancer cause-specific death (CSD) and non-colon cancer cause-specific death (NCSD). Survival prediction nomograms were constructed to predict overall survival (OS) and colon cancer cause-specific survival (CSS). The predictive abilities of the constructed models were evaluated by concordance indexes (C-indexes) and calibration curves. Results: A total of 25,110 patients were identified; 21.7% received chemotherapy and 78.3% did not. A total of 10,916 patients were extracted after propensity score matching. The estimated 3-year overall survival rate with chemotherapy was 0.7% higher than without chemotherapy. The estimated 5-year and 10-year overall survival rates without chemotherapy were 1.3% and 2.1% higher than with chemotherapy, respectively. Survival prediction models showed good discrimination (C-indexes between 0.582 and 0.757) and excellent calibration. Conclusions: Chemotherapy improves the short-term (43 months) survival benefit of stage II colon cancer patients who received radical surgery. Survival prediction models can be used to predict OS and CSS of patients receiving chemotherapy as well as OS and CSS of patients not receiving chemotherapy and to make individualized treatment recommendations for stage II colon cancer patients who received radical surgery.

Bioinformatics-Based Identification of Tumor Microenvironment-Related Prognostic Genes in Pancreatic Cancer

Frontiers in Genetics ◽

10.3389/fgene.2021.632803 ◽

2021 ◽

Vol 12 ◽

Author(s):

Shaojie Chen ◽

Feifei Huang ◽

Shangxiang Chen ◽

Yinting Chen ◽

Jiajia Li ◽

...

Keyword(s):

Pancreatic Cancer ◽

Overall Survival ◽

Regression Analysis ◽

Tumor Microenvironment ◽

Operating Characteristic ◽

Cox Regression ◽

Time Dependent ◽

Concordance Index ◽

Cox Regression Analysis ◽

Score Model

Objective: Growing evidence has highlighted that the immune and stromal cells that infiltrate the pancreatic cancer microenvironment significantly influence tumor progression. However, reliable microenvironment-related prognostic gene signatures are yet to be established. The present study aimed to elucidate tumor microenvironment-related prognostic genes in pancreatic cancer. Methods: We applied the ESTIMATE algorithm to categorize patients with pancreatic cancer from the TCGA dataset into high and low immune/stromal score groups and determined their differentially expressed genes. Then, univariate and LASSO Cox regression was performed to identify overall survival-related differentially expressed genes (DEGs). Multivariate Cox regression analysis was then used to screen independent prognostic genes and construct a risk score model. Finally, the performance of the risk score model was evaluated by Kaplan-Meier curve, time-dependent receiver operating characteristic and Harrell’s concordance index. Results: The overall survival analysis demonstrated that high immune/stromal score groups were closely associated with poor prognosis. The multivariate Cox regression analysis indicated that the signatures of four genes, including TRPC7, CXCL10, CUX2, and COL2A1, were independent prognostic factors. Subsequently, the risk prediction model constructed by those genes was superior to AJCC staging as evaluated by time-dependent receiver operating characteristic and Harrell’s concordance index, and both KRAS and TP53 mutations were closely associated with high risk scores. In addition, CXCL10 was predominantly expressed by tumor associated macrophages and its receptor CXCR3 was highly expressed in T cells at the single-cell level. Conclusions: This study comprehensively investigated the tumor microenvironment and verified immune/stromal-related biomarkers for pancreatic cancer.
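Harrell's concordance index, named in the abstract above as an evaluation measure, can be illustrated with a short sketch. This is not the authors' code: the function name `harrell_c_index`, its argument names, and the toy data are assumptions made for illustration, and the handling of tied event times (skipped here) is a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher risk score
    belongs to the subject with the shorter observed time."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed
        # event; pairs with equal times are skipped in this sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: event indicator 1 = event observed, 0 = censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

In the toy data, five pairs are comparable and four of them are ordered correctly by the risk score. A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking.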

Survival prediction models since liver transplantation - comparisons between Cox models and machine learning techniques

10.21203/rs.3.rs-22670/v3 ◽

2020 ◽

Author(s):

Georgios Kantidakis ◽

Hein Putter ◽

Carlo Lancia ◽

Jacob de Boer ◽

Andries E Braat ◽

...

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Liver Transplantation ◽

Prediction Models ◽

Machine Learning Techniques ◽

Brier Score ◽

Survival Prediction ◽

Cox Models ◽

Learning Techniques ◽

Random Survival Forest

Abstract Background: Predicting survival of recipients after liver transplantation is regarded as one of the most important challenges in contemporary medicine. Hence, improving on current prediction models is of great interest. Nowadays, there is a strong discussion in the medical field about machine learning (ML) and whether it has greater potential than traditional regression models when dealing with complex data. Criticism of ML is related to unsuitable performance measures and lack of interpretability, which is important for clinicians. Methods: In this paper, ML techniques such as random forests and neural networks are applied to large data of 62294 patients from the United States with 97 predictors, selected on clinical/statistical grounds out of more than 600, to predict survival from transplantation. Of particular interest is also the identification of potential risk factors. A comparison is performed between 3 different Cox models (with all variables, backward selection and LASSO) and 3 machine learning techniques: a random survival forest and 2 partial logistic artificial neural networks (PLANNs). For PLANNs, novel extensions to their original specification are tested. Emphasis is given on the advantages and pitfalls of each method and on the interpretability of the ML techniques. Results: Well-established predictive measures are employed from the survival field (C-index, Brier score and Integrated Brier Score) and the strongest prognostic factors are identified for each model. The clinical endpoint is overall graft-survival, defined as the time between transplantation and the date of graft-failure or death. The random survival forest shows slightly better predictive performance than Cox models based on the C-index. Neural networks show better performance than both Cox models and random survival forest based on the Integrated Brier Score at 10 years. Conclusion: In this work, it is shown that machine learning techniques can be a useful tool for both prediction and interpretation in the survival context. From the ML techniques examined here, PLANN with 1 hidden layer predicts survival probabilities the most accurately, being as calibrated as the Cox model with all variables.

Targeted Minimum Loss-Based Estimation of Causal Effects in Right-Censored Survival Data with Time-Dependent Covariates: Warfarin, Stroke, and Death in Atrial Fibrillation

Journal of Causal Inference ◽

10.1515/jci-2013-0001 ◽

2013 ◽

Vol 1 (2) ◽

pp. 235-254 ◽

Cited By ~ 4

Author(s):

Jordan C. Brooks ◽

Mark J. van der Laan ◽

Daniel E. Singer ◽

Alan S. Go

Keyword(s):

Survival Data ◽

Causal Effect ◽

Estimating Equation ◽

Dependent Censoring ◽

Time Dependent ◽

Causal Effects ◽

Consistent Estimation ◽

Censored Survival Data ◽

Inverse Probability ◽

Time Dependent Covariates

Abstract: Causal effects in right-censored survival data can be formally defined as the difference in the marginal cumulative event probabilities under particular interventions. Conventional estimators, such as the Kaplan-Meier (KM), fail to consistently estimate these marginal parameters under dependent treatment assignment or dependent censoring. Several modern estimators have been developed that reduce bias under both dependent treatment assignment and dependent censoring by incorporating information from baseline and time-dependent covariates. In the present article we describe a recently developed targeted minimum loss-based estimation (TMLE) algorithm for general longitudinal data structures and present in detail its application in right-censored survival data with time-dependent covariates. The treatment-specific marginal cumulative event probability is defined via a series of iterated conditional expectations in a time-dependent counting process framework. The TMLE involves an initial estimator of each conditional expectation and sequentially updates these such that the resulting estimator solves the efficient influence curve estimating equation in the nonparametric statistical model. We describe the assumptions required for consistent estimation of statistical parameters and additional assumptions required for consistent estimation of the causal effect parameter. Using simulated right-censored survival data, the mean squared error, bias, and 95% confidence interval coverage probability of the TMLE are compared with those of the conventional KM and the inverse probability of censoring weight estimating equation, conventional maximum likelihood substitution estimator, and the double robust augmented inverse probability of censoring weighted estimating equation. We conclude the article with estimation of the causal effect of warfarin medical therapy on the probability of “stroke or death” within a 1-year time frame using data from the ATRIA-1 observational cohort of persons with atrial fibrillation. Our results suggest that a fixed policy of warfarin treatment for all patients would result in 2% fewer deaths or strokes within 1 year as compared with a policy of withholding warfarin from all patients.
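For orientation, the conventional Kaplan-Meier estimator that the abstract above uses as a baseline can be sketched as follows. This is only the standard product-limit estimator, which assumes censoring is independent of the event time; it is not the TMLE described in the article. The function name `kaplan_meier` and the toy data are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : observed follow-up times
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (t, S(t)) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Hypothetical toy data: one censored subject at t = 2 and one at t = 4.
km_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

With the toy data the estimate steps down at times 1, 2, and 3. Censored subjects leave the risk set without producing a step, which is exactly where bias can arise when censoring depends on covariates.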

Assessment of performance of survival prediction models for cancer prognosis

BMC Medical Research Methodology ◽

10.1186/1471-2288-12-102 ◽

2012 ◽

Vol 12 (1) ◽

Cited By ~ 37

Author(s):

Hung-Chia Chen ◽

Ralph L Kodell ◽

Kuang Fu Cheng ◽

James J Chen

Keyword(s):

Prediction Models ◽

Cancer Prognosis ◽

Survival Prediction ◽

Assessment Of Performance

Time-dependent deformations of concrete columns under different construction load histories

Advances in Structural Engineering ◽

10.1177/1369433219828133 ◽

2019 ◽

Vol 22 (8) ◽

pp. 1845-1854 ◽

Cited By ~ 1

Author(s):

Dujian Zou ◽

Chengcheng Du ◽

Tiejun Liu ◽

Jun Teng ◽

Hanbin Cheng

Keyword(s):

Prediction Models ◽

Plain Concrete ◽

Concrete Columns ◽

In Situ Monitoring ◽

Time Dependent ◽

Construction Stage ◽

Inappropriate Use ◽

High Rise ◽

Multi Stage ◽

Axial Shortening

The adverse effects caused by differential axial shortening in high-rise buildings have received increasing attention with growing building height. However, axial shortening analysis still lacks accuracy compared with the in-situ monitoring results of practical high-rise buildings during the construction stage. It is imperative to identify the error sources, and the applicability of the current shortening prediction models should be verified by testing. In this study, 14 plain concrete columns were cast, and the multi-stage load method was applied to approximately simulate the loading history of axial concrete members during the construction stage. The time-dependent deformations of loaded concrete specimens were measured, and a comparative analysis was conducted between test results and numerical prediction values. It is found that the predicted results underestimate the measured deformations in all multi-stage loading cases, and this underestimation may be mainly caused by the inappropriate use of elastic modulus. It further indicates that the axial shortening analysis of high-rise buildings tends to underestimate the actual shortening value when the traditional calculation method is used. This study provides a reference for explaining the mismatch between the analytical results and the actual shortening values.
