Radiomics Analysis Using Stability Selection Supervised Principal Component Analysis for Right-censored Survival Data
Abstract
Radiomics is an emerging field that involves extracting a large number of quantitative features from biomedical images using data-characterization algorithms. It provides a noninvasive approach to personalized therapy decisions by identifying distinctive imaging features that predict prognosis and therapeutic response. So far, many published radiomics studies have identified prognostic markers from biomedical images using existing out-of-the-box algorithms that are not specific to radiomics data. To make better use of biomedical images, we propose a novel machine learning approach, stability selection supervised principal component analysis (SSSuperPCA), which identifies a set of stable features from radiomics big data and couples this selection with dimension reduction for right-censored survival outcomes. The proposed approach allows us to identify a set of stable features that are highly associated with the survival outcomes, control the per-family error rate, and predict survival in a simple yet meaningful manner. We evaluate the performance of SSSuperPCA using simulations and real data sets for non-small cell lung cancer and head and neck cancer, and compare it with other machine learning algorithms. The results demonstrate that our method has a competitive edge over existing methods in identifying prognostic markers from big biomedical imaging data for the prediction of right-censored survival outcomes. An R package, SSSuperPCA, is available at http://web.hku.hk/~herbpang/SSSuperPCA.html
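The abstract names the ingredients of the method but not its steps, so the following is only a minimal sketch of how stability selection and supervised principal component analysis could be combined for right-censored data. It is not the SSSuperPCA R package and does not reproduce its API. All function names, the use of univariate Cox score statistics for ranking, the half-sample subsampling scheme, the tuning values (q, threshold, number of subsamples), and the simulated data are assumptions made for illustration. The error bound shown is the standard stability-selection bound of Meinshausen and Bühlmann, which holds under that paper's assumptions and may differ from the bound used by the authors.

```python
import numpy as np

def cox_scores(X, time, event):
    """Univariate Cox score statistic (at beta = 0) for each column of X.
    Ties in the event times are not specially handled."""
    order = np.argsort(time)
    X, event = X[order], event[order].astype(bool)
    n = X.shape[0]
    # After sorting, the risk set of subject i is rows i..n-1.
    at_risk = np.arange(n, 0, -1)[:, None]
    mean = np.cumsum(X[::-1], axis=0)[::-1] / at_risk        # risk-set means
    mean_sq = np.cumsum(X[::-1] ** 2, axis=0)[::-1] / at_risk
    u = (X[event] - mean[event]).sum(axis=0)                 # score
    v = (mean_sq[event] - mean[event] ** 2).sum(axis=0)      # information
    return u / np.sqrt(v + 1e-12)

def stable_features(X, time, event, q=5, n_subsamples=100, threshold=0.8, seed=0):
    """Selection frequency of each feature over random half-samples.
    In each half-sample the q features with the largest |score| are selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        scores = np.abs(cox_scores(X[idx], time[idx], event[idx]))
        counts[np.argsort(scores)[-q:]] += 1
    freq = counts / n_subsamples
    return np.flatnonzero(freq >= threshold), freq

def pfer_bound(q, p, threshold):
    """Meinshausen-Buhlmann upper bound on the expected number of false selections."""
    return q ** 2 / ((2 * threshold - 1) * p)

def first_pc(X_sel):
    """Scores on the first principal component of the selected, centered features."""
    Xc = X_sel - X_sel.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[0]

# Simulated example (hypothetical data): only features 0 and 1 affect the hazard.
rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.standard_normal((n, p))
t_event = rng.exponential(1.0 / np.exp(X[:, 0] + X[:, 1]))
t_cens = rng.exponential(2.0, size=n)
time = np.minimum(t_event, t_cens)       # observed follow-up time
event = t_event <= t_cens                # True if the event was observed

selected, freq = stable_features(X, time, event, q=5, threshold=0.8)
pc1 = first_pc(X[:, selected])           # one-dimensional summary for a survival model
print("selected features:", selected)
print("PFER bound:", pfer_bound(q=5, p=p, threshold=0.8))
```

In this sketch, `pc1` plays the role of the low-dimensional predictor that would then be entered into a survival model such as Cox regression. Raising `threshold` or lowering `q` tightens the bound on the expected number of falsely selected features, at the cost of selecting fewer features.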