Pathway-Structured Predictive Model for Cancer Survival Prediction: A Two-Stage Approach

2016 ◽  
Author(s):  
Xinyan Zhang ◽  
Yan Li ◽  
Tomi Akinyemiju ◽  
Akinyemi Ojesina ◽  
Phillip Buckhaults ◽  
...  

Heterogeneity in terms of tumor characteristics, prognosis, and survival among cancer patients has been a persistent problem for many decades. Currently, prognosis and outcome predictions are made based on clinical factors and/or by incorporating molecular profiling data. However, using only clinical or molecular information directly may yield inaccurate prognosis and prediction. One of the main shortcomings of past studies is the failure to incorporate prior biological information into the predictive model, despite strong evidence of the pathway-based genetic nature of cancer, i.e., the potential for oncogenes to be grouped into pathways based on biological functions such as cell survival, proliferation, and metastatic dissemination. To address this problem, we propose a two-stage procedure to incorporate pathway information into prognostic modeling using large-scale gene expression data. In the first stage, we fit all predictors within each pathway using penalized Cox models (Lasso, Ridge, and Elastic Net) and a Bayesian hierarchical Cox model. In the second stage, we combine the cross-validated prognostic scores of all pathways obtained in the first stage as new predictors to build an integrated prognostic model for prediction. We apply the proposed method to analyze breast cancer data from The Cancer Genome Atlas (TCGA), predicting overall survival using clinical data and gene expression profiling. The data include ~20,000 genes mapped into 109 pathways for 505 patients. The results show that the proposed approach not only improves survival prediction compared with the alternative analysis that ignores the pathway information, but also identifies significant biological pathways.
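
The data flow of the two-stage procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the interleaved fold assignment, and the `centered_sum_fitter` placeholder are all assumptions. In particular, the placeholder stands in for the penalized or Bayesian hierarchical Cox fit and ignores the survival outcome entirely; only the out-of-fold scoring and the stacking of pathway scores are meant to mirror the abstract.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Sequence[float]
Scorer = Callable[[Row], float]
# Hypothetical interface: a fitter sees training rows and their outcomes
# and returns a function that scores one new row.
Fitter = Callable[[List[Row], list], Scorer]


def out_of_fold_scores(rows: List[Row], outcome: list,
                       fit: Fitter, n_folds: int = 5) -> List[float]:
    """Score every sample with a model that did not see it during fitting."""
    n = len(rows)
    scores = [0.0] * n
    for fold in range(n_folds):
        # Assumption: simple interleaved folds (sample i goes to fold i % n_folds).
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        scorer = fit([rows[i] for i in train_idx],
                     [outcome[i] for i in train_idx])
        for i in test_idx:
            scores[i] = scorer(rows[i])
    return scores


def stage_one_scores(expr: Dict[str, List[float]],
                     pathways: Dict[str, List[str]],
                     outcome: list, fit: Fitter,
                     n_folds: int = 5) -> Dict[str, List[float]]:
    """Stage 1: one cross-validated prognostic score per pathway."""
    n = len(outcome)
    result = {}
    for name, genes in pathways.items():
        rows = [[expr[g][i] for g in genes] for i in range(n)]
        result[name] = out_of_fold_scores(rows, outcome, fit, n_folds)
    return result


def stage_two_design(scores: Dict[str, List[float]]
                     ) -> Tuple[List[str], List[List[float]]]:
    """Stage 2 input: pathway scores stacked as columns, one row per patient."""
    names = sorted(scores)
    n = len(scores[names[0]])
    return names, [[scores[p][i] for p in names] for i in range(n)]


def centered_sum_fitter(train_rows: List[Row], train_outcome: list) -> Scorer:
    """Placeholder learner (NOT a Cox model): sum of training-centered values."""
    n_cols = len(train_rows[0])
    means = [sum(r[j] for r in train_rows) / len(train_rows)
             for j in range(n_cols)]
    return lambda row: sum(row[j] - means[j] for j in range(n_cols))
```

In practice the placeholder would be replaced by a survival learner, and the stacked matrix returned by `stage_two_design` would be passed to a second survival model together with the clinical covariates.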

Genetics ◽  
2016 ◽  
Vol 205 (1) ◽  
pp. 89-100 ◽  
Author(s):  
Xinyan Zhang ◽  
Yan Li ◽  
Tomi Akinyemiju ◽  
Akinyemi I. Ojesina ◽  
Phillip Buckhaults ◽  
...  

2018 ◽  
Author(s):  
Soyeon Kim ◽  
Hyun Jung Park ◽  
Xiangqin Cui ◽  
Degui Zhi

Abstract DNA methylation of various genomic regions plays an important role in regulating gene expression in diverse biological contexts. However, most genome-wide studies have focused on the effect of 1) methylation in cis, not in trans, and 2) a single CpG, not the collective effects of multiple CpGs, on gene expression. In this study, we developed a statistical machine learning model, geneEXPLORER (gene expression prediction by long-range epigenetic regulation), that quantifies the collective effects of both cis- and trans-methylation on gene expression. By applying geneEXPLORER to The Cancer Genome Atlas (TCGA) breast and lung cancer data, we found that most genes are affected by methylation as far as 10 Mb or more from promoter regions, and that long-range methylation explains 50% of the variation in gene expression on average, far more than cis-methylation. The highly predictive genes are related to breast cancer, especially oncogenes and suppressor genes. Further, the predicted gene expression levels could predict clinical phenotypes such as breast tumor status and estrogen receptor status (AUC = 0.999 and 0.94, respectively) as accurately as the measured gene expression levels. These results suggest that geneEXPLORER provides a means for accurate imputation of gene expression, which can be further used to predict clinical phenotypes.


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Maryam Farhadian ◽  
Paulo J. G. Lisboa ◽  
Abbas Moghimbeigi ◽  
Jalal Poorolajal ◽  
Hossein Mahjub

In microarray studies, the number of samples is relatively small compared to the number of genes per sample. An important aspect of microarray studies is the prediction of patient survival based on their gene expression profile. This naturally calls for the use of a dimension reduction procedure together with the survival prediction model. In this study, a new method based on combining wavelet approximation coefficients and Cox regression was presented. The proposed method was compared with supervised principal component and supervised partial least squares methods. Cox models fitted on supervised wavelet approximation coefficients, on the top supervised principal components, and on partial least squares components were applied to the data. The results showed that the prediction performance of the Cox model based on supervised wavelet feature extraction was superior to that of the models based on supervised principal components and partial least squares components. The results suggested the possibility of developing new wavelet-based tools for the dimensionality reduction of microarray data sets in the context of survival analysis.
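
To make the notion of wavelet approximation coefficients concrete, the sketch below computes them with the Haar wavelet. The abstract names neither the wavelet family nor the gene ordering, so both are assumptions here, as are the function name and the padding rule for odd lengths. Each level halves the number of features by replacing neighbouring pairs with their scaled sum; the reduced vector is what would be passed to a Cox model.

```python
import math
from typing import List, Sequence


def haar_approximation(signal: Sequence[float], levels: int = 1) -> List[float]:
    """Return Haar approximation coefficients after `levels` decompositions.

    One level maps neighbouring pairs (x0, x1), (x2, x3), ... to
    (x0 + x1) / sqrt(2), (x2 + x3) / sqrt(2), ...
    """
    coeffs = [float(v) for v in signal]
    for _ in range(levels):
        if len(coeffs) < 2:
            break  # nothing left to pair up
        if len(coeffs) % 2 == 1:
            # Assumption: pad odd-length input by repeating the last value.
            coeffs.append(coeffs[-1])
        coeffs = [(coeffs[k] + coeffs[k + 1]) / math.sqrt(2.0)
                  for k in range(0, len(coeffs), 2)]
    return coeffs


# Hypothetical expression profile of one patient over four ordered genes.
profile = [1.0, 3.0, 5.0, 7.0]
level_one = haar_approximation(profile, levels=1)   # two features
level_two = haar_approximation(profile, levels=2)   # one feature
```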


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Xin Xu ◽  
Yida Lu ◽  
Youliang Wu ◽  
Mingliang Wang ◽  
Xiaodong Wang ◽  
...  

Abstract Background: Gastric cancer (GC) has a high mortality rate and is one of the most fatal malignant tumours. Male sex has been proven as an independent risk factor for GC. This study aimed to identify immune-related genes (IRGs) associated with the prognosis of male GC. Methods: RNA sequencing and clinical data were obtained from The Cancer Genome Atlas (TCGA) database. Differentially expressed IRGs between male GC and normal tissues were identified by integrated bioinformatics analysis. Univariate and multivariate Cox regression analyses were applied to screen survival-associated IRGs. Then, GC patients were separated into high- and low-risk groups based on the median risk score. Furthermore, a nomogram was constructed based on the TCGA dataset. The prognostic value of the risk signature model was evaluated by Kaplan-Meier curve, receiver operating characteristic (ROC), Harrell’s concordance index and calibration curves. In addition, the gene expression dataset from the Gene Expression Omnibus (GEO) was also downloaded for external validation. The relative proportions of 22 types of infiltrating immune cells in each male GC sample were evaluated using CIBERSORT. Results: A total of 276 differentially expressed IRGs were screened, including 189 up-regulated and 87 down-regulated genes. Subsequently, a seven-IRG signature (LCN12, CCL21, RNASE2, CGB5, NRG4, AGTR1 and NPR3) was identified to be significantly associated with the overall survival (OS) of male GC patients. Survival analysis indicated that patients in the high-risk group exhibited a poor clinical outcome. The results of multivariate analysis revealed that the risk score was an independent prognostic factor. The established nomogram could be used to evaluate the prognosis of individual male GC patients. Further analysis showed that the prognostic model had excellent predictive performance in both the TCGA and validation cohorts. In addition, the results of tumour-infiltrating immune cell analysis indicated that the seven-IRG signature could reflect the status of the tumour immune microenvironment. Conclusions: Our study developed a novel seven-IRG risk signature for individualized survival prediction of male GC patients.
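
Two of the steps named in this abstract, the median split into risk groups and Harrell's concordance index, are simple enough to sketch directly. The code below is illustrative only: the function names are made up, the coefficients passed to `risk_scores` would come from a fitted Cox model that is not shown, assigning scores equal to the median to the low-risk group is an assumption, and pairs with tied survival times are skipped for simplicity.

```python
import statistics
from typing import List, Sequence


def risk_scores(expr_rows: Sequence[Sequence[float]],
                coefs: Sequence[float]) -> List[float]:
    """Linear risk score: sum of coefficient * expression over signature genes."""
    return [sum(c * x for c, x in zip(coefs, row)) for row in expr_rows]


def median_split(scores: Sequence[float]) -> List[str]:
    """Label each patient 'high' or 'low' relative to the median risk score."""
    cut = statistics.median(scores)
    # Assumption: a score exactly equal to the median counts as low risk.
    return ["high" if s > cut else "low" for s in scores]


def harrell_c_index(times: Sequence[float], events: Sequence[int],
                    scores: Sequence[float]) -> float:
    """Fraction of comparable pairs in which the higher-risk patient fails first.

    A pair (i, j) is comparable when patient i had an observed event and a
    strictly shorter time than patient j. Tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.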


10.29007/v7qj ◽  
2020 ◽  
Author(s):  
Magali Champion ◽  
Julien Chiquet ◽  
Pierre Neuvial ◽  
Mohamed Elati ◽  
François Radvanyi ◽  
...  

Comparison between tumoral and healthy cells may reveal abnormal regulation behaviors between a transcription factor and the genes it regulates, without exhibiting differential expression of the former genes. We propose a methodology for the identification of transcription factors involved in the deregulation of genes in tumoral cells. This strategy is based on the inference of a reference gene regulatory network that connects transcription factors to their downstream targets using gene expression data. Gene expression levels in tumor samples are then carefully compared to this reference network to detect deregulated target genes. A linear model is finally used to measure the ability of each transcription factor to explain these deregulations. We assess the performance of our method by numerical experiments on a public bladder cancer data set derived from the Cancer Genome Atlas project. We identify genes known for their implication in the development of specific bladder cancer subtypes as well as new potential biomarkers.


2019 ◽  
Vol 20 (2) ◽  
pp. 263 ◽  
Author(s):  
Xingyu Xu ◽  
Haixia Long ◽  
Baohang Xi ◽  
Binbin Ji ◽  
Zejun Li ◽  
...  

Thyroid cancer is a common malignant tumor that lacks effective preventive and therapeutic drugs. Thus, it is crucial to provide an effective drug selection method for thyroid cancer patients. The connectivity map (CMAP) project provides an experimentally validated strategy to repurpose and optimize cancer drugs; its rationale is to select drugs that reverse the gene expression variations induced by cancer. However, it has a few limitations. Firstly, CMAP was performed on cell lines, which usually differ from human tissues. Secondly, only gene expression information was considered, while information about gene regulation and modules/pathways was more or less ignored. In this study, we first comprehensively measured the perturbations that thyroid cancer induces in patients, including variations at the gene expression, gene co-expression, and gene module levels. After that, we provided a drug selection pipeline to reverse the perturbations based on drug signatures derived from tissue studies. We applied the analysis pipeline to The Cancer Genome Atlas (TCGA) thyroid cancer data, consisting of 56 normal and 500 cancer samples. As a result, we obtained 812 up-regulated and 213 down-regulated genes, whose functions are significantly enriched in extracellular matrix and receptor localization to synapses. In addition, a total of 33,778 significantly differentially co-expressed gene pairs were found, which form a larger module associated with impaired immune function and low immunity. Finally, we predicted drugs and gene perturbations that could reverse the gene expression and co-expression changes incurred by the development of thyroid cancer using Fisher’s exact test. Top predicted drugs included validated drugs such as baclofen, nevirapine, glucocorticoid, and formaldehyde. Combining our analyses with literature mining, we inferred that the regulation of thyroid hormone secretion might be closely related to the inhibition of the proliferation of thyroid cancer cells.
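
The Fisher's exact test mentioned in this abstract can be sketched as a one-sided hypergeometric tail sum. How the authors built their contingency tables is not stated, so the layout used here is an assumption: rows split genes by whether they are up-regulated in cancer, columns by whether a drug down-regulates them, and a small p-value suggests the drug reverses more cancer-induced changes than chance would predict. The function name is hypothetical.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X is the top-left cell under the hypergeometric
    null distribution with all row and column totals held fixed.

    Assumed (hypothetical) layout:
                             drug lowers gene   drug does not
        up in cancer                a                 b
        not up in cancer            c                 d
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)
    upper = min(row1, col1)  # largest feasible value of the top-left cell
    tail = sum(comb(row1, k) * comb(row2, col1 - k)
               for k in range(a, upper + 1))
    return tail / total


# Toy table: 3 of the 4 cancer-up genes are lowered by the drug.
p_value = fisher_exact_greater(3, 1, 1, 3)
```

With many candidate drugs, the resulting p-values would normally be corrected for multiple testing before ranking; the abstract does not say which correction, if any, was used.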


2021 ◽  
Author(s):  
Xin Xu ◽  
Yida Lu ◽  
Youliang Wu ◽  
Mingliang Wang ◽  
Xiaodong Wang ◽  
...  

Abstract Background: Gastric cancer (GC) has a high mortality rate and is one of the most fatal malignant tumours. Male sex has been proven as an independent risk factor for GC. This study aimed to identify immune-related genes (IRGs) associated with the prognosis of male GC. Methods: RNA sequencing and clinical data were obtained from The Cancer Genome Atlas (TCGA) database. Differentially expressed IRGs between male GC and normal tissues were identified by integrated bioinformatics analysis. Univariate and multivariate Cox regression analyses were applied to screen survival-associated IRGs. Then, GC patients were separated into high- and low-risk groups based on the median risk score. Furthermore, a nomogram was constructed based on the TCGA dataset. The prognostic value of the risk signature model was evaluated by Kaplan-Meier curve, receiver operating characteristic (ROC), Harrell’s concordance index and calibration curves. In addition, the gene expression dataset from the Gene Expression Omnibus (GEO) was also downloaded for external validation. The relative proportions of 22 types of infiltrating immune cells in each male GC sample were evaluated using CIBERSORT. Results: A total of 276 differentially expressed IRGs were screened, including 189 up-regulated and 87 down-regulated genes. Subsequently, a seven-IRG signature (LCN12, CCL21, RNASE2, CGB5, NRG4, AGTR1 and NPR3) was identified to be significantly associated with the overall survival (OS) of male GC patients. Survival analysis indicated that patients in the high-risk group exhibited a poor clinical outcome. The results of multivariate analysis revealed that the risk score was an independent prognostic factor. The established nomogram could be used to evaluate the prognosis of individual male GC patients. Further analysis showed that the prognostic model had excellent predictive performance in both the TCGA and validation cohorts. In addition, the results of tumour-infiltrating immune cell analysis indicated that the seven-IRG signature could reflect the status of the tumour immune microenvironment. Conclusions: Our study developed a novel seven-IRG risk signature for individualized survival prediction of male GC patients.


2021 ◽  
Author(s):  
Jing Bian ◽  
Xi Chen ◽  
Mingyan Jiang ◽  
Xinghua Gao

Abstract Liver cancer is one of the most common malignant tumors in the world, of which hepatocellular carcinoma (HCC) is the most common histological subtype. Although thousands of biomarkers related to HCC survival and prognosis have been found through database mining, the predictive effects of single-gene biomarkers are not specific enough. Therefore, we aimed to construct a pathway-related signature that could effectively forecast HCC prognosis. We obtained gene expression data and clinical patient information from The Cancer Genome Atlas (TCGA) database. Univariate and multivariate Cox regression analyses were applied to genes in the E2F target gene pathway, which was identified as enriched by Gene Set Enrichment Analysis. In the training set, NBN, PHF5A, CDCA8, AK2, and EXOSC8 were significantly associated with overall survival. They were validated in the test and entire groups, confirmed by Gene Expression Omnibus (GEO), and compared with two known prognostic signatures for HCC. Overall, we demonstrated a novel five-mRNA prognostic signature based on E2F targets that successfully predicted the survival of HCC patients, is independent of clinicopathological data, and displayed superior prediction performance in HCC prognosis. Our study elucidates the cell cycle mechanism in identifying patients with poor HCC prognosis. The application of our five-mRNA prognostic signature may improve risk stratification in HCC patients and existing methods for survival prediction.


2020 ◽  
Author(s):  
Yuliang Li ◽  
Zhirui Liu ◽  
Qian Wang

Abstract Background: Hepatocellular carcinoma (HCC) is a common malignant tumor with high morbidity and mortality. Despite advances in early diagnosis, disease management, and treatment of HCC, outcomes remain unsatisfactory. This study aimed to identify reliable prognostic biomarkers based on integrated bioinformatics analysis to predict and improve the survival of HCC patients. Methods: The gene expression or transcriptome profiles and survival data of HCC patients were acquired from the Gene Expression Omnibus (GEO) database and The Cancer Genome Atlas (TCGA) database. Differentially expressed genes (DEGs) were screened out by the limma or edgeR package in the R software. Univariate, LASSO and multivariate Cox regression analyses were conducted to explore a survival-related signature. Subsequently, a prognostic model and a nomogram composed of the prognostic signature were constructed for assessing overall survival (OS). Kaplan-Meier analysis, receiver operating characteristic (ROC) curve and stratified analysis were performed to confirm the prognostic performance of the prognostic model. Results: Compared with nontumor samples, 451 reliable DEGs were identified using robust rank aggregation and overlap validation. Eleven survival-related DEGs were selected for the construction of a risk evaluation model, which could efficiently distinguish high-risk patients from low-risk patients and remained feasible in subgroups defined by stage and age. Further analyses suggested the positive and independent prognostic performance of the model compared to other clinical characteristics (P < 0.05, ROC > 0.7). Finally, a prognostic nomogram composed of the model was constructed for assessing overall survival, and Harrell’s concordance index and calibration curves demonstrated its efficient predictive performance. Conclusion: The predictive model and nomogram will contribute directly to further clinical applications in individualized survival prediction, the improvement of treatment strategies, and more accurate management for patients with HCC.
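
The LASSO Cox step named in the Methods selects genes by minimizing the negative Cox partial log-likelihood plus an L1 penalty on the coefficients. The sketch below only evaluates that objective for a given coefficient vector; it does not perform the optimization, and the function name, argument layout, and unscaled form of the penalty are assumptions (software packages may scale the two terms differently). Tied event times are handled in the Breslow manner, with every patient at or beyond the event time kept in the risk set.

```python
import math
from typing import Sequence


def lasso_cox_objective(beta: Sequence[float],
                        rows: Sequence[Sequence[float]],
                        times: Sequence[float],
                        events: Sequence[int],
                        lam: float) -> float:
    """Negative Cox partial log-likelihood plus lam * sum(|beta_k|)."""
    # Linear predictor eta_i = beta . x_i for every patient.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in rows]
    n = len(rows)
    loglik = 0.0
    for i in range(n):
        if not events[i]:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at patient i's event time.
        denom = sum(math.exp(eta[j]) for j in range(n) if times[j] >= times[i])
        loglik += eta[i] - math.log(denom)
    penalty = lam * sum(abs(b) for b in beta)
    return -loglik + penalty


# Toy data: one covariate, two patients, both with observed events.
value = lasso_cox_objective([1.0], [[1.0], [0.0]], [1.0, 2.0], [1, 1], lam=0.5)
```

Larger values of `lam` push more coefficients to exactly zero at the minimizer, which is what makes the penalty useful for shortlisting genes before the multivariate Cox step.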


2021 ◽  
Vol 19 (01) ◽  
pp. 2140003
Author(s):  
Magali Champion ◽  
Julien Chiquet ◽  
Pierre Neuvial ◽  
Mohamed Elati ◽  
François Radvanyi ◽  
...  

In many cancers, mechanisms of gene regulation can be severely altered. Identifying deregulated genes, which do not follow the regulation processes that exist between transcription factors and their target genes, is important for better understanding the development of the disease. We propose a methodology to detect deregulation mechanisms with a particular focus on cancer subtypes. This strategy is based on the comparison between tumoral and healthy cells. First, we use gene expression data from healthy cells to infer a reference gene regulatory network. Then, we compare it with gene expression levels in tumor samples to detect deregulated target genes. We finally measure the ability of each transcription factor to explain these deregulations. We apply our method to a public bladder cancer data set derived from The Cancer Genome Atlas project and confirm that it captures hallmarks of cancer subtypes. We also show that it enables the discovery of new potential biomarkers.
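
The compare-against-a-reference idea in this abstract can be illustrated with a deliberately reduced example: a single transcription factor, a single target gene, and an ordinary least-squares line fitted on healthy samples. This is not the authors' network inference method; the function names, the linear form, and the threshold of 3 standard deviations are all assumptions chosen only to show how a tumor sample can be flagged when its target expression departs from what the healthy reference predicts.

```python
import math
from typing import Sequence, Tuple

Model = Tuple[float, float, float]  # (slope, intercept, residual sd)


def fit_reference(tf: Sequence[float], target: Sequence[float]) -> Model:
    """Least-squares fit of target on TF expression in healthy samples."""
    n = len(tf)
    mean_x = sum(tf) / n
    mean_y = sum(target) / n
    sxx = sum((x - mean_x) ** 2 for x in tf)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(tf, target))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(tf, target))
    # Residual standard deviation with n - 2 degrees of freedom (needs n > 2).
    return slope, intercept, math.sqrt(sse / (n - 2))


def deregulation_z(model: Model, tf_value: float, target_value: float) -> float:
    """Standardized gap between observed and reference-predicted expression."""
    slope, intercept, sd = model
    return (target_value - (intercept + slope * tf_value)) / sd


def is_deregulated(model: Model, tf_value: float, target_value: float,
                   threshold: float = 3.0) -> bool:
    """Flag a tumor sample whose target gene departs from the reference."""
    # Assumption: an absolute z-score above `threshold` counts as deregulated.
    return abs(deregulation_z(model, tf_value, target_value)) > threshold


# Hypothetical healthy samples, then one tumor sample checked against them.
reference = fit_reference([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.1, 7.9])
flag = is_deregulated(reference, tf_value=3.0, target_value=2.0)
```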

