Variable Selection Methods for Right-Censored Time-to-Event Data with High-Dimensional Covariates

2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Keivan Sadeghzadeh ◽  
Nasser Fard

Advances in technology have led to greater accessibility of massive and complex data in many fields, such as quality and reliability. The proper management and utilization of valuable data could significantly increase knowledge and reduce cost through preventive actions, whereas erroneous and misinterpreted data could lead to poor inference and decision making. On the other hand, it has become more difficult to process streaming high-dimensional time-to-event data with traditional approaches, specifically in the presence of censored observations. This paper presents a multipurpose analytic model and practical nonparametric methods to analyze right-censored time-to-event data with high-dimensional covariates. In order to reduce redundant information and to facilitate practical interpretation, variable inefficiency in failure time is determined for the specific field of application. The proposed methods are compared with recent relevant approaches through numerical experiments and simulations.

F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 1039
Author(s):  
Xinyan Zhang ◽  
Manali Rupji ◽  
Jeanne Kowalski

We present GAC, a Shiny-based R tool for interactive visualization of clinical associations based on high-dimensional data. The tool provides a web-based suite to perform supervised principal component analysis (SuperPC), an approach that combines high-dimensional data, such as gene expression, with clinical data to infer clinical associations. We extended the approach in our package to address binary outcomes, in addition to continuous and time-to-event data, thereby increasing the use and flexibility of SuperPC. Additionally, the tool provides an interactive visualization that summarizes results in a forest plot for both binary and time-to-event data. In summary, the GAC suite of tools provides a one-stop shop for conducting statistical analysis to identify and visualize the association between a clinical outcome of interest and high-dimensional data types, such as genomic data. Our GAC package has been implemented in R and is available via http://shinygispa.winship.emory.edu/GAC/. The developmental repository is available at https://github.com/manalirupji/GAC.
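
The general idea behind supervised principal component analysis can be sketched as follows. This is a minimal illustration for a continuous outcome only, not the GAC package's code: the function name `supervised_pc`, the use of Pearson correlation as the univariate score, and the fixed `threshold` value are assumptions made for the example (in practice the cutoff is usually tuned by cross-validation), and the simulated data are synthetic.

```python
import numpy as np

def supervised_pc(X, y, threshold):
    """Hypothetical helper: first supervised principal component.

    X: (n_samples, n_features) array; y: (n_samples,) continuous outcome.
    threshold: illustrative cutoff on |correlation| (an assumption here).
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Step 1: univariate score per feature (Pearson correlation with y).
    scores = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    # Step 2: keep only features whose score passes the cutoff.
    keep = np.abs(scores) > threshold
    if not keep.any():
        raise ValueError("no feature passes the threshold")
    # Step 3: first principal component of the reduced, centered matrix.
    U, s, _ = np.linalg.svd(Xc[:, keep], full_matrices=False)
    component = U[:, 0] * s[0]
    return keep, component

# Synthetic data: the first 10 of 200 features carry a shared latent signal.
rng = np.random.default_rng(0)
n, p = 100, 200
latent = rng.normal(size=n)
X = rng.normal(size=(n, p))
X[:, :10] += 2.0 * latent[:, None]
y = latent + 0.1 * rng.normal(size=n)

keep, pc = supervised_pc(X, y, threshold=0.5)
print("features kept:", int(keep.sum()))
print("|corr(pc, y)|:", abs(np.corrcoef(pc, y)[0, 1]))
```

The resulting component would then serve as a single predictor in an outcome model; for binary or time-to-event outcomes the univariate score in step 1 would be replaced by one suited to that outcome type.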


Author(s):  
Milind A. Phadnis

Aim: To propose an updated algorithm that adds an extra step to the Newton-type algorithm used in robust rank-based non-parametric regression for minimizing the dispersion function associated with Wilcoxon scores, in order to account for the effect of covariates. Methodology: The proposed accelerated failure time approach is aimed at incorporating right random censoring in survival data sets for low to moderate levels of censoring. The existing Newton algorithm is modified to account for the effect of one or more covariates. This is done by first applying Mantel scores to residuals obtained from a regression model, and second by minimizing the dispersion function of these scored residuals. A diagnostic check of the model fit is performed by observing the distribution of the residuals, and suitable Bent scores are considered in the case of skewed residuals. To demonstrate the efficacy of this method, a simulation study is conducted to compare the power of this method under three different scenarios: non-proportional hazard, proportional and constant hazard, and proportional but non-constant hazard. Results: In most situations, this method yielded reasonable estimates of power for detecting an association of the covariate with the response as compared to popular parametric and semi-parametric approaches. The estimates of the regression coefficient obtained from this method were evaluated and were found to have low bias, low mean square error, and adequate coverage. In a real-life example from a pancreatic cancer study, the proposed method performed well and provided a more realistic interpretation of the effect of covariates (age and Karnofsky score) compared to a standard parametric (lognormal) model.
Conclusion: In situations where there is no clear best parametric fit for time-to-event data with a moderate level of censoring, the proposed method provides a robust alternative for obtaining regression coefficients (both adjusted and unadjusted) with a performance comparable to that of a proportional hazards model.
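
To make the notion of a dispersion function with Wilcoxon scores concrete, the sketch below evaluates the standard rank-based (Jaeckel-type) dispersion, D(beta) = sum of a(R(e_i)) * e_i with a(i) = sqrt(12) * (i/(n+1) - 1/2), for a single covariate and uncensored data. It is not the algorithm proposed in the abstract: censoring, Mantel scores, Bent scores, and the Newton-type step are all omitted, a plain grid search stands in for the optimizer, and the function name and simulated data are assumptions made for illustration.

```python
import numpy as np

def wilcoxon_dispersion(beta, x, y):
    """Rank-based dispersion of residuals with Wilcoxon scores.

    Hypothetical helper for one covariate and uncensored data;
    ties in the residuals are not handled.
    """
    e = y - beta * x
    n = e.size
    ranks = np.argsort(np.argsort(e)) + 1          # ranks 1..n
    scores = np.sqrt(12.0) * (ranks / (n + 1) - 0.5)
    return float(np.sum(scores * e))

# Synthetic data with heavy-tailed errors and a true slope of 1.5.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 1.5 * x + rng.standard_t(df=3, size=200)

# The dispersion is convex in beta, so a coarse grid search is enough here.
grid = np.linspace(0.0, 3.0, 301)
values = [wilcoxon_dispersion(b, x, y) for b in grid]
beta_hat = grid[int(np.argmin(values))]
print("estimated slope:", beta_hat)
```

Because the scores sum to zero, the dispersion does not depend on an intercept, which is why only the slope appears in the minimization.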


2016 ◽  
Vol 25 (6) ◽  
pp. 2714-2732
Author(s):  
Arindom Chakraborty

A common objective in longitudinal studies is to characterize the relationship between a longitudinal response process and time-to-event data. The ordinal nature of the response and possible missing information on covariates complicate the joint model. In such circumstances, influential observations that are often present in the data may distort the analysis. In this paper, a joint model based on an ordinal partial mixed model and an accelerated failure time model is used to account for the repeated ordered response and the time-to-event data, respectively. We propose an influence function-based robust estimation method. A Monte Carlo expectation maximization algorithm is used for parameter estimation. A detailed simulation study evaluates the performance of the proposed method. As an application, a data set on muscular dystrophy among children is analyzed. The robust estimates are then compared with classical maximum likelihood estimates.

