Survival Risk Prediction Using High-Dimensional Molecular Data

An overview of techniques for linking high-dimensional molecular data to time-to-event endpoints by risk prediction models

Biometrical Journal ◽

10.1002/bimj.201000152 ◽

2011 ◽

Vol 53 (2) ◽

pp. 170-189 ◽

Cited By ~ 11

Author(s):

Harald Binder ◽

Christine Porzelius ◽

Martin Schumacher

Keyword(s):

Risk Prediction ◽

Prediction Models ◽

Molecular Data ◽

High Dimensional ◽

Time To Event ◽

Risk Prediction Models

An evaluation of resampling methods for assessment of survival risk prediction in high-dimensional settings

Statistics in Medicine ◽

10.1002/sim.4106 ◽

2010 ◽

Vol 30 (6) ◽

pp. 642-653 ◽

Cited By ~ 21

Author(s):

Jyothi Subramanian ◽

Richard Simon

Keyword(s):

Risk Prediction ◽

High Dimensional ◽

Resampling Methods ◽

Survival Risk

Personalized dynamic prediction of death according to tumour progression and high-dimensional genetic factors: Meta-analysis with a joint model

Statistical Methods in Medical Research ◽

10.1177/0962280216688032 ◽

2017 ◽

Vol 27 (9) ◽

pp. 2842-2858 ◽

Cited By ~ 30

Author(s):

Takeshi Emura ◽

Masahiro Nakatochi ◽

Shigeyuki Matsui ◽

Hirofumi Michimae ◽

Virginie Rondeau

Keyword(s):

Risk Prediction ◽

Genetic Factors ◽

Tumour Progression ◽

Meta Analysis ◽

Cox Proportional Hazards Model ◽

High Dimensional ◽

Survival Prediction ◽

Prediction Formula ◽

Dynamic Prediction ◽

Personalized Risk

Developing a personalized risk prediction model of death is fundamental for improving patient care and touches on the realm of personalized medicine. The increasing availability of genomic information and large-scale meta-analytic data sets for clinicians has motivated the extension of traditional survival prediction based on the Cox proportional hazards model. The aim of our paper is to develop a personalized risk prediction formula for death according to genetic factors and dynamic tumour progression status based on meta-analytic data. To this end, we extend the existing joint frailty-copula model to a model allowing for high-dimensional genetic factors. In addition, we propose a dynamic prediction formula to predict death given tumour progression events possibly occurring after treatment or surgery. For clinical use, we implement the prediction formula in the joint.Cox R package. We also develop a tool to validate the performance of the prediction formula by assessing the prediction error. We illustrate the method with the meta-analysis of individual patient data on ovarian cancer patients.

Incorporating pathway information into boosting estimation of high-dimensional risk prediction models

BMC Bioinformatics ◽

10.1186/1471-2105-10-18 ◽

2009 ◽

Vol 10 (1) ◽

Cited By ~ 45

Author(s):

Harald Binder ◽

Martin Schumacher

Keyword(s):

Risk Prediction ◽

Prediction Models ◽

High Dimensional ◽

Risk Prediction Models ◽

Pathway Information

Survival Risk Prediction Models of Gliomas Based on IDH and 1p/19q

Journal of Cancer ◽

10.7150/jca.43805 ◽

2020 ◽

Vol 11 (15) ◽

pp. 4297-4307

Author(s):

Han Zou ◽

Chang Li ◽

Siyi Wanggou ◽

Xuejun Li

Keyword(s):

Risk Prediction ◽

Prediction Models ◽

Risk Prediction Models ◽

Survival Risk

Risk Prediction of Cardiovascular Events by Exploration of Molecular Data with Explainable Artificial Intelligence

International Journal of Molecular Sciences ◽

10.3390/ijms221910291 ◽

2021 ◽

Vol 22 (19) ◽

pp. 10291

Author(s):

Annie M. Westerlund ◽

Johann S. Hawe ◽

Matthias Heinig ◽

Heribert Schunkert

Keyword(s):

Artificial Intelligence ◽

Risk Prediction ◽

Cardiovascular Events ◽

Regulatory Networks ◽

Recurrent Events ◽

Large Scale ◽

Treatment Strategies ◽

Molecular Data ◽

Holistic View ◽

Prediction And Prevention

Cardiovascular diseases (CVD) annually take almost 18 million lives worldwide. Most lethal events occur months or years after the initial presentation. Indeed, many patients experience repeated complications or require multiple interventions (recurrent events). Apart from affecting the individual, this leads to high medical costs for society. Personalized treatment strategies aiming at prediction and prevention of recurrent events rely on early diagnosis and precise prognosis. Complementing the traditional environmental and clinical risk factors, multi-omics data provide a holistic view of the patient and disease progression, enabling studies to probe novel angles in risk stratification. Specifically, predictive molecular markers allow insights into regulatory networks, pathways, and mechanisms underlying disease. Moreover, artificial intelligence (AI) represents a powerful, yet adaptive, framework able to recognize complex patterns in large-scale clinical and molecular data with the potential to improve risk prediction. Here, we review the most recent advances in risk prediction of recurrent cardiovascular events, and discuss the value of molecular data and biomarkers for understanding patient risk in a systems biology context. Finally, we introduce explainable AI which may improve clinical decision systems by making predictions transparent to the medical practitioner.

Multi-kernel linear mixed model with adaptive lasso for prediction analysis on high-dimensional multi-omics data

Bioinformatics ◽

10.1093/bioinformatics/btz822 ◽

2019 ◽

Vol 36 (6) ◽

pp. 1785-1794

Author(s):

Jun Li ◽

Qing Lu ◽

Yalu Wen

Keyword(s):

Risk Prediction ◽

Mixed Model ◽

Linear Mixed Model ◽

R Package ◽

Kernel Functions ◽

Adaptive Lasso ◽

Supplementary Information ◽

High Dimensional ◽

Omics Data ◽

Modeling Framework

Motivation: The use of human genome discoveries and other established factors to build an accurate risk prediction model is an essential step toward precision medicine. While multi-layer high-dimensional omics data provide unprecedented data resources for prediction studies, their corresponding analytical methods are much less developed. Results: We present a multi-kernel penalized linear mixed model with adaptive lasso (MKpLMM), a predictive modeling framework that extends the standard linear mixed models widely used in genomic risk prediction, for multi-omics data analysis. MKpLMM can capture not only the predictive effects from each layer of omics data but also their interactions by using multiple kernel functions. It adopts a data-driven approach to select predictive regions as well as predictive layers of omics data, and achieves robust selection performance. Through extensive simulation studies, the analyses of PET-imaging outcomes from the Alzheimer’s Disease Neuroimaging Initiative study, and the analyses of 64 drug responses, we demonstrate that MKpLMM consistently outperforms competing methods in phenotype prediction. Availability and implementation: The R package is available at https://github.com/YaluWen/OmicPred. Supplementary information: Supplementary data are available at Bioinformatics online.

Improvement Screening for Ultra-High Dimensional Data with Censored Survival Outcomes and Varying Coefficients

The International Journal of Biostatistics ◽

10.1515/ijb-2017-0024 ◽

2017 ◽

Vol 13 (1) ◽

Cited By ~ 3

Author(s):

Mu Yue ◽

Jialiang Li

Keyword(s):

Risk Prediction ◽

Survival Data ◽

Proportional Hazards ◽

Cox Proportional Hazards ◽

Added Value ◽

Patient Treatment ◽

High Dimensional ◽

Varying Coefficients ◽

Net Reclassification Improvement ◽

Cox Proportional Hazards Models

Motivated by risk prediction studies with ultra-high dimensional biomarkers, we propose a novel improvement screening methodology. Accurate risk prediction can be quite useful for patient treatment selection, prevention strategy or disease management in evidence-based medicine. The question of how to choose new markers in addition to the conventional ones is especially important. In the past decade, a number of new measures for quantifying the added value from the new markers were proposed, among which the integrated discrimination improvement (IDI) and net reclassification improvement (NRI) stand out. Meanwhile, C-statistics are routinely used to quantify the capacity of the estimated risk score in discriminating among subjects with different event times. In this paper, we will examine these improvement statistics as well as the norm-based approach for evaluating the incremental value of new markers and compare these four measures by analyzing ultra-high dimensional censored survival data. In particular, we consider Cox proportional hazards models with varying coefficients. All measures perform very well in simulations and we illustrate our methods in an application to a lung cancer study.
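As a rough illustration of the C-statistic mentioned in this abstract, the sketch below computes a Harrell-type concordance index for right-censored data in plain Python. It is not the authors' screening procedure: the function name `concordance_index`, the convention that a higher score means higher risk, and the choice to ignore tied event times are all assumptions made for this example.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-type C-statistic for right-censored data (illustrative sketch).

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- higher score is assumed to mean higher risk (earlier event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's observed time (event or censoring).
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
c_stat = concordance_index(times, events, [0.9, 0.1, 0.5, 0.3])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect discrimination. Comparing this statistic for a model with and without a candidate marker is one simple way to express the marker's added value, although the paper's own measures may be defined differently.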

A statistical model for survival risk prediction in patients with advanced hepatocellular carcinoma undergoing sorafenib treatment

Journal of Hepatology ◽

10.1016/s0168-8278(18)30607-x ◽

2018 ◽

Vol 68 ◽

pp. S197-S198

Author(s):

S. Berhane ◽

R. Fox ◽

S.L. Chan ◽

T. Yau ◽

P. Johnson

Keyword(s):

Hepatocellular Carcinoma ◽

Statistical Model ◽

Risk Prediction ◽

Advanced Hepatocellular Carcinoma ◽

Sorafenib Treatment ◽

Survival Risk

IPF-LASSO: Integrative L1-Penalized Regression with Penalty Factors for Prediction Based on Multi-Omics Data

Computational and Mathematical Methods in Medicine ◽

10.1155/2017/7691937 ◽

2017 ◽

Vol 2017 ◽

pp. 1-14 ◽

Cited By ~ 25

Author(s):

Anne-Laure Boulesteix ◽

Riccardo De Bin ◽

Xiaoyu Jiang ◽

Mathias Fuchs

Keyword(s):

Cross Validation ◽

Real Life ◽

Penalized Regression ◽

R Package ◽

Molecular Data ◽

Regression Method ◽

Data Driven ◽

High Dimensional ◽

Omics Data ◽

Simulation Studies

As modern biotechnologies advance, it has become increasingly common for different modalities of high-dimensional molecular data (termed “omics” data in this paper), such as gene expression, methylation, and copy number, to be collected from the same patient cohort to predict the clinical outcome. While prediction based on omics data has been widely studied in the last fifteen years, little has been done in the statistical literature on the integration of multiple omics modalities to select a subset of variables for prediction, which is a critical task in personalized medicine. In this paper, we propose a simple penalized regression method to address this problem by assigning different penalty factors to different data modalities for feature selection and prediction. The penalty factors can be chosen in a fully data-driven fashion by cross-validation or by taking practical considerations into account. In simulation studies, we compare the prediction performance of our approach, called IPF-LASSO (Integrative LASSO with Penalty Factors) and implemented in the R package ipflasso, with the standard LASSO and sparse group LASSO. The use of IPF-LASSO is also illustrated through applications to two real-life cancer datasets. All data and code are available on the companion website to ensure reproducibility.
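The idea of giving each data modality its own penalty factor can be sketched with a small weighted lasso. The code below is a minimal illustration in plain Python, assuming a continuous outcome with squared-error loss, centred data without an intercept, and a fixed penalty level. It is not the ipflasso implementation; the names `weighted_lasso` and `block_penalty_factors`, the simulated data, and the chosen factors are hypothetical.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, lam, penalty_factors, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam * sum_j pf_j * |b_j|
    by cyclic coordinate descent (illustrative, no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, starting from b = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (b_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * penalty_factors[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def block_penalty_factors(block_sizes, factors):
    """Expand one penalty factor per modality into one factor per column."""
    return [f for size, f in zip(block_sizes, factors) for _ in range(size)]


# Hypothetical data: 2 "clinical" columns followed by 6 "omics" columns.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(8)] for _ in range(60)]
y = [2.0 * row[0] + 1.0 * row[2] + random.gauss(0, 0.5) for row in X]

pf_equal = block_penalty_factors([2, 6], [1.0, 1.0])  # ordinary lasso
pf_ipf = block_penalty_factors([2, 6], [1.0, 4.0])    # omics penalised 4x harder
beta_equal = weighted_lasso(X, y, 0.1, pf_equal)
beta_ipf = weighted_lasso(X, y, 0.1, pf_ipf)
```

With equal factors this reduces to the standard lasso. Raising the factor for one modality shrinks that block more strongly, so fewer of its variables are selected. In practice the penalty level and the factors would be tuned, for example by cross-validation as the abstract describes, which this sketch omits.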
