COSMONET: An R Package for Survival Analysis Using Screening-Network Methods

Mathematics ◽  
2021 ◽  
Vol 9 (24) ◽  
pp. 3262
Author(s):  
Antonella Iuliano  ◽  
Annalisa Occhipinti  ◽  
Claudia Angelini  ◽  
Italia De Feis  ◽  
Pietro Liò 

Identifying relevant genomic features that can act as prognostic markers for building predictive survival models is one of the central themes in medical research, affecting the future of personalized medicine and omics technologies. However, the high dimension of genome-wide omic data, the strong correlation among the features, and the low sample size significantly increase the complexity of cancer survival analysis, demanding the development of specific statistical methods and software. Here, we present a novel R package, COSMONET (COx Survival Methods based On NETworks), that provides a complete workflow from the pre-processing of omics data to the selection of gene signatures and prediction of survival outcomes. In particular, COSMONET implements (i) three different screening approaches to reduce the initial dimension of the data from a high-dimensional space p to a moderate scale d, (ii) a network-penalized Cox regression algorithm to identify the gene signature, (iii) several approaches to determine an optimal cut-off on the prognostic index (PI) to separate high- and low-risk patients, and (iv) a prediction step for patients’ risk class based on the evaluation of PIs. Moreover, COSMONET provides functions for data pre-processing, visualization, survival prediction, and gene enrichment analysis. We illustrate COSMONET through a step-by-step R vignette using two cancer datasets.
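
Steps (iii) and (iv) above, computing a prognostic index and splitting patients at a cut-off, can be illustrated with a minimal Python sketch. This is not the COSMONET API (COSMONET is an R package): the function names, the toy data, and the choice of the median as cut-off are all assumptions made for illustration, and the coefficients are taken as given rather than fitted.

```python
import numpy as np

def prognostic_index(X, beta):
    """Return PI_i = x_i^T beta for each patient (one row of X per patient)."""
    return np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)

def risk_groups(pi, cutoff=None):
    """Label patients 'high' if PI > cutoff, else 'low'.

    The median cut-off used by default is an assumption; it is only one
    of several possible ways to choose a threshold on the PI.
    """
    pi = np.asarray(pi, dtype=float)
    if cutoff is None:
        cutoff = float(np.median(pi))
    labels = np.where(pi > cutoff, "high", "low")
    return labels, cutoff

# Hypothetical expression values (4 patients x 2 genes) and coefficients.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 1.0],
              [0.5, 0.5]])
beta = np.array([0.8, -0.4])

pi = prognostic_index(X, beta)
labels, cutoff = risk_groups(pi)
print(pi, cutoff, labels)
```

To classify new patients, the same coefficients and the cut-off learned on the training set would be reused, for example `risk_groups(prognostic_index(X_new, beta), cutoff)`.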

2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Jie Liu ◽  
Shiqiang Hou ◽  
Jinyi Wang ◽  
Zhengjun Chai ◽  
Xuan Hong ◽  
...  

Background. Lung adenocarcinoma (LUAD), a major and fatal subtype of lung cancer, causes high mortality and shows heterogeneous prognostic outcomes. This study aimed to assess key genes and to develop a prognostic signature to guide therapy for patients with LUAD. Method. RNA expression profiles and clinical data from 522 LUAD patients were downloaded from The Cancer Genome Atlas (TCGA) database. Differentially expressed genes (DEGs) between normal tissues and LUAD samples were extracted and analyzed. Then, a 14-DEG signature for survival prediction in LUAD patients was developed by means of univariate and multivariate Cox regression analyses. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed to predict the potential biological functions and pathways of these DEGs. Results. Twenty-two of the 5924 DEGs in the TCGA dataset were screened and found to be associated with the overall survival (OS) of LUAD patients. Fourteen DEGs were finally selected and included in our development and validation model by risk score analysis. The ROC analysis indicated that the specificity and sensitivity of this signature were high. Further functional enrichment analyses indicated that these DEGs might regulate genes that affect the release of sequestered calcium ion into the cytosol and pathways associated with Vibrio cholerae infection. Conclusion. Our study developed a novel 14-DEG signature that provides more efficient and persuasive prognostic information beyond conventional clinicopathological factors for survival prediction of LUAD patients.
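
The sensitivity and specificity mentioned in the ROC analysis can be made concrete with a short Python sketch. It assumes a simplified setting that the abstract does not specify: each patient has a binary outcome at a fixed time horizon (censoring is ignored) and is called high-risk when the risk score exceeds a threshold. The function name and the toy values are hypothetical.

```python
def sensitivity_specificity(scores, outcomes, threshold):
    """Sensitivity and specificity of the rule 'score > threshold => event'.

    outcomes: 1 if the patient had the event by the horizon, 0 otherwise.
    """
    pairs = list(zip(scores, outcomes))
    tp = sum(1 for s, y in pairs if y == 1 and s > threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s <= threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s <= threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s > threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical risk scores and outcomes for five patients.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
outcomes = [1, 0, 1, 0, 0]

sens, spec = sensitivity_specificity(scores, outcomes, threshold=0.5)
print(sens, spec)
```

Sweeping the threshold over all observed scores and plotting sensitivity against 1 - specificity traces out the ROC curve.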


Author(s):  
Tianhua Li ◽  
Yiguang Chen ◽  
Yongjian Chen ◽  
Guangjie Liu ◽  
Shisheng Zou ◽  
...  

Glioma accounts for the highest proportion of primary intracranial malignant tumors. The microenvironment strongly influences glioma progression. Our study aimed to establish an individualized prognostic nomogram for glioma patients based on a microenvironment signature. Glioma samples from the Chinese Glioma Genome Atlas (CGGA) were grouped by immune and stromal scores computed with the ESTIMATE algorithm. Microenvironment-related genes (MRGs) in glioma were analyzed in R. To determine the genes most strongly correlated with prognosis, univariate and multivariate Cox regression analyses were applied to the MRGs. Using the selected genes (CHI3L1, SOCS3, SLC47A2, COL3A1, SRPX2 and SERPINA3), we established a prognostic risk score model (the microenvironment signature) and validated it. Gene Set Enrichment Analysis (GSEA) showed that the high-risk group was mainly enriched in immune and stromal function KEGG pathways. Finally, the nomogram was constructed and evaluated. The receiver operating characteristic (ROC) curves, calibration plots and decision curve analysis (DCA) of the training and validation sets indicated excellent predictive performance of the nomogram. In conclusion, the 6-gene microenvironment signature can not only provide directions for basic research on glioma, but can also be included as an independent prognostic index in a nomogram for individual prediction to guide clinical treatment.


2020 ◽  
Author(s):  
Yeping Lina Qiu ◽  
Hong Zheng ◽  
Arnout Devos ◽  
Olivier Gevaert

Abstract: RNA sequencing has emerged as a promising approach in cancer prognosis as sequencing data becomes more easily and affordably accessible. However, it remains challenging to build good predictive models especially when the sample size is limited and the number of features is high, which is a common situation in biomedical settings. To address these limitations, we propose a meta-learning framework based on neural networks for survival analysis and evaluate it in a genomic cancer research setting. We demonstrate that, compared to regular transfer-learning, meta-learning is a significantly more effective paradigm to leverage high-dimensional data that is relevant but not directly related to the problem of interest. Specifically, meta-learning explicitly constructs a model, from abundant data of relevant tasks, to learn a new task with few samples effectively. For the application of predicting cancer survival outcome, we also show that the meta-learning framework with a few samples is able to achieve competitive performance with learning from scratch with a significantly larger number of samples. Finally, we demonstrate that the meta-learning model implicitly prioritizes genes based on their contribution to survival prediction and allows us to identify important pathways in cancer.


Cancers ◽  
2019 ◽  
Vol 11 (11) ◽  
pp. 1732
Author(s):  
Chenjin Ma ◽  
Yuan Xue ◽  
Shuangge Ma

In cancer research, population-based survival analysis has played an important role. In this article, we conduct survival analysis on patients with brain tumors using the SEER (Surveillance, Epidemiology, and End Results) database from the NCI (National Cancer Institute). It has been recognized that cancer survival models have spatial and temporal variations which are caused by multiple factors, but such variations are usually not “abrupt” (that is, they should be smooth). As such, spatially and temporally pooling all data and analyzing each spatial/temporal point separately are either inappropriate or ineffective. In this article, we develop and implement a spatial- and temporal-smoothing technique, which can effectively accommodate spatial/temporal variations and realize information borrowing across spatial/temporal points. Simulation demonstrates effectiveness of the proposed approach in improving estimation. Data on a total of 123,571 patients with brain tumors diagnosed between 1911 and 2010 from 16 SEER sites is analyzed. Findings different from separate estimation and simple pooling are made. Overall, this study may provide a practically useful way for modeling the survival of brain tumor (and other cancers) using population data.


2021 ◽  
Vol 12 ◽  
Author(s):  
Honghao Cao ◽  
Hang Tong ◽  
Junlong Zhu ◽  
Chenchen Xie ◽  
Zijia Qin ◽  
...  

Background: The prognosis of renal cell carcinoma (RCC) varies greatly among different risk groups, and traditional indicators have limited value in identifying the risk grade of patients with RCC. The purpose of our study is to explore a glycolysis-based long non-coding RNA (lncRNA) signature and verify its potential clinical significance in prognostic prediction for RCC patients. Methods: In this study, RNA data and clinical information were downloaded from The Cancer Genome Atlas (TCGA) database. Univariate and multivariate Cox regression identified six significantly related lncRNAs (AC124854.1, AC078778.1, EMX2OS, DLGAP1-AS2, AC084876.1, and AC026401.3), which were used to construct a risk score by a formula. The accuracy of the risk score was verified by a series of statistical methods such as receiver operating characteristic (ROC) curves, a nomogram and Kaplan-Meier curves. Its potential clinical significance was explored by gene enrichment analysis. Results: Kaplan-Meier curves and ROC curves showed the reliability of the risk score in predicting the prognosis of RCC patients. Stratification analysis indicated that the risk score was an independent predictor compared with other traditional clinical parameters. The clinical nomogram was highly rigorous, with an index of 0.73, and precisely predicted the 1-, 3-, and 5-year survival of RCC patients. Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Set Enrichment Analysis (GSEA) depicted the top ten correlated pathways in both the high-risk and low-risk groups. The lncRNA-mRNA co-expression network contains 6 lncRNAs and 25 related mRNAs, with 36 lncRNA-mRNA links. Conclusion: This research demonstrated that glycolysis-based lncRNAs have important value in survival prediction for RCC patients and could be a potential target for future treatment.
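
The Kaplan-Meier curves used to compare risk groups rest on the product-limit estimator, which can be sketched in a few lines of Python. The function name and the toy follow-up data below are hypothetical; this is an illustrative implementation, not the code used in the study.

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct event time.

    times:  follow-up time per patient
    events: 1 if the event was observed, 0 if the patient was censored
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])  # sorted distinct event times
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)                    # still under observation at t
        deaths = np.sum((times == t) & (events == 1))   # events exactly at t
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Hypothetical follow-up data for five patients (two censored).
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]

km_times, km_surv = kaplan_meier(times, events)
print(km_times, km_surv)
```

Running the function separately on the high-risk and low-risk groups gives the two curves that are typically plotted against each other.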


2021 ◽  
Vol 2021 ◽  
pp. 1-19
Author(s):  
Bin Wang ◽  
Fachun Tong ◽  
Chengxi Zhai ◽  
Long Wang ◽  
Yunzan Liu ◽  
...  

Background. Aging is an essential risk factor for cancer. However, aging-related genes (ARGs) have not been comprehensively analyzed in bladder cancer (BC). Therefore, this study aimed at deriving a risk stratification system for BC patients based on ARGs. Methods. Public databases were used to acquire ARG sets, transcriptome files, and clinical data. The “limma” package was then used to screen for differential ARGs, and univariate Cox regression analysis was used to identify prognostic ARGs. The “ConsensusClusterPlus” package was used to identify aging patterns in BC patients based on these prognostic ARGs. Subsequently, the aging patterns were investigated with respect to survival prediction, mutation landscape, immunotherapy, immunological checkpoints, and immune microenvironment. We likewise used gene enrichment analysis to explore the biological functions behind the findings. To construct a risk signature and nomogram for prognostic prediction, we used LASSO and Cox regression analysis based on differential genes between aging patterns. In addition, we plotted a nomogram and validated the accuracy of the risk signature in the GEO and TCGA cohorts. We explored the possible biological mechanism using GSEA and preliminarily identified a hub gene using a PPI network. Finally, we validated the expression of the hub gene in BC cell lines. Results. We screened 84 downregulated ARGs, 74 upregulated ARGs, and 32 prognostic ARGs in the human aging genome resource. The aging patterns based on prognostic genes had excellent survival prediction (p < 0.001) and discriminatory ability in 405 BC patients. In addition, we found no significant differences between aging patterns in mutation analysis; all were characterized by TP53, TTN, and KMT2D mutations. It is worth noting that cluster B in the aging patterns had a better response to immunotherapy and a more active immune microenvironment (p < 0.05). In addition, gene enrichment analysis showed that aging patterns may be related to biological processes such as Staphylococcus aureus infection, phagosome, and cytokine-cytokine receptor interaction. Subsequently, we constructed a risk signature based on 16 differential genes from different aging patterns, which had good survival prediction ability in both the GEO and TCGA cohorts. Specifically, survival analysis revealed a significantly shorter survival time in the high-risk group than in the low-risk group (TCGA and GEO, p < 0.001). In the ROC analysis, the AUC values for 1-, 3-, and 5-year survival were 0.713, 0.714, and 0.738, respectively, in the TCGA cohort and 0.606, 0.663, and 0.718, respectively, in the GEO cohort. Multivariate Cox regression analysis showed that the risk score was an independent prognostic factor in BC patients (p < 0.001). There were also significant differences in immune cell infiltration, immune checkpoints, and immune score between the two groups (p < 0.05), although the correlation with HLA expression was weak. Finally, we identified and validated CLIC3 as a hub gene that may be involved in the Wnt signaling pathway, among others. Conclusion. We provided robust evidence that aging patterns based on ARGs can guide targeted therapy and survival prediction in BC patients.
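
An AUC value such as those reported above can be read as the probability that a randomly chosen patient with the event has a higher risk score than a randomly chosen patient without it. The Python sketch below computes that quantity directly. It assumes a simplification not stated in the abstract: a binary status at a fixed horizon, with censoring ignored, whereas time-dependent ROC methods handle censoring explicitly. The function name and data are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties between an event score and a non-event score count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical risk scores and 3-year event status for five patients.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)
print(auc)
```

The pairwise loop is quadratic in the number of patients, which is fine for a small illustration; a rank-based formula gives the same value more efficiently on large cohorts.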


2020 ◽  
Author(s):  
Luis Andre Vale-Silva ◽  
Karl Rohr

The age of precision medicine demands powerful computational techniques to handle high-dimensional patient data. We present MultiSurv, a multimodal deep learning method for long-term pan-cancer survival prediction. MultiSurv is composed of three main modules. A feature representation module includes a dedicated submodel for each input data modality. A data fusion layer aggregates the multimodal representations. Finally, a prediction submodel yields conditional survival probabilities for a predefined set of follow-up time intervals. We trained MultiSurv on clinical, imaging, and four different high-dimensional omics data modalities from patients diagnosed with one of 33 different cancer types. We evaluated unimodal input configurations against several previous methods and different multimodal data combinations. MultiSurv achieved the best results according to different time-dependent metrics and delivered highly accurate long-term patient survival curves. The best performance was obtained when combining clinical information with either gene expression or DNA methylation data, depending on the evaluation metric. Additionally, MultiSurv can handle missing data, including missing values and complete data modalities. Interestingly, for unimodal data we found that simpler modeling approaches, including the classical Cox proportional hazards method, can achieve results rivaling those of more complex methods for certain data modalities. We also show how the learned feature representations of MultiSurv can be used to visualize relationships between cancer types and individual patients, after embedding into a low-dimensional space.
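
The step from conditional survival probabilities per follow-up interval to a patient survival curve can be sketched as a cumulative product. This is an assumption about how such outputs are commonly assembled in discrete-time survival models, not a description of the MultiSurv code; the function name and the example probabilities are hypothetical.

```python
import numpy as np

def survival_curve(conditional_survival):
    """Turn per-interval conditional survival probabilities into S(t).

    conditional_survival[k] is P(survive interval k | alive at its start),
    so S at the end of interval k is the product of entries 0..k.
    """
    p = np.asarray(conditional_survival, dtype=float)
    if np.any((p < 0.0) | (p > 1.0)):
        raise ValueError("probabilities must lie in [0, 1]")
    return np.cumprod(p)

# Hypothetical model output for three consecutive follow-up intervals.
curve = survival_curve([0.9, 0.8, 0.5])
print(curve)
```

Because every factor is at most 1, the resulting curve is non-increasing by construction, which is one reason this parameterization is convenient.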


2020 ◽  
Author(s):  
Qiang Zhang ◽  
Qiongyun Chen ◽  
Yinyin Lv ◽  
Xuan Dong ◽  
Xiaoqing Huang ◽  
...  

Abstract Background: The global incidence of gastric cancer (GC) ranks fourth among cancers, and its 5-year survival is less than 25%. LncRNAs are vital regulators involved in the pathological processes of cancer. There is an urgent need to screen for prognostic lncRNAs in GC. Method: Expression files and clinical data of GC were downloaded from TCGA. Differentially expressed lncRNAs were identified with the edgeR R package, followed by prognosis analysis. Cox analysis was conducted to identify independent prognostic factors of GC. Potential signaling pathways in which the screened lncRNAs were enriched were evaluated by gene set enrichment analysis (GSEA). Finally, Pearson analysis was conducted to predict the possible mechanism of the lncRNA in GC progression. Result: According to Cox regression analysis, ENSG00000224363 was an unfavorable prognostic factor for OS (overall survival) and DFS (disease-free survival) in GC. GSEA indicated that ENSG00000224363 may regulate the cell cycle, apoptosis and autophagy of GC cells. Conclusion: LncRNA ENSG00000224363 is overexpressed in GC and serves as an independent unfavorable prognostic factor.


2021 ◽  
Author(s):  
Zhixing Wang ◽  
Fudan Qiu ◽  
Peilin Shen

Abstract Background: Ferroptosis is a new form of regulated cell death (RCD) that plays a crucial role in the genesis and prognosis of tumors. Nevertheless, the relationship between ferroptosis and the prognosis of thyroid carcinoma (THCA) remains unclear and needs to be explored. Methods: By analyzing data from the THCA cohort in the TCGA database, ferroptosis-related differentially expressed genes (DEGs) with prognostic value were identified and used to establish a prognostic signature based on Lasso-penalized Cox regression analysis. Then, the model was tested with Kaplan-Meier survival, Cox regression and receiver operating characteristic (ROC) analyses based on overall survival (OS). Finally, DEGs between the low-risk and high-risk groups were identified and used to conduct GO enrichment analysis, KEGG pathway analysis and immune infiltration analysis. Results: A 6-gene signature was constructed, including DPP4, GPX4, GSS, HMGCR, TFRC and PGD. The area under the curve (AUC) values were 0.890 (1 year), 0.863 (2 years) and 0.883 (3 years), which validated the prominent predictive capacity of the model. Multivariate Cox regression confirmed the model as an independent prognostic predictor for OS. Conclusion: In this study, we established an innovative prognostic signature of 6 ferroptosis-related genes that can serve as an independent prognostic predictor for OS in THCA, while the potential mechanisms remain unclear and need further exploration.
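
For reference, Lasso-penalized Cox regression is usually written as maximizing the Cox partial log-likelihood minus an L1 penalty. The formulation below is the standard textbook form (no tied event times, no 1/n scaling), given here as an assumption about the method rather than the exact objective used in the study. Here x_i is the expression vector of patient i, \delta_i is the event indicator, R(t_i) is the set of patients still at risk at time t_i, and \lambda \ge 0 is the penalty weight.

```latex
\hat{\beta} = \arg\max_{\beta} \left\{
  \sum_{i:\,\delta_i = 1} \left[ x_i^{\top}\beta
    - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right) \right]
  - \lambda \lVert \beta \rVert_1
\right\}
```

Larger values of \lambda shrink more coefficients exactly to zero, which is how a small signature such as six genes emerges from a larger candidate set; \lambda is typically chosen by cross-validation.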


2021 ◽  
Author(s):  
Dan-Dan Wang ◽  
Wen-Xiu Xu ◽  
Wen-Quan Chen ◽  
Su-Jin Yang ◽  
Jian Zhang ◽  
...  

Abstract Background: Tissue inhibitor of metalloproteinase-2 (TIMP2), an endogenous inhibitor of matrix metalloproteinases, has been reported to participate in the development and carcinogenesis of multiple malignancies. However, the prognostic value of TIMP2 in different cancers and its correlation with the tumor microenvironment and immunity have not been clarified. Methods: In this study, we conducted a comprehensive bioinformatics analysis to evaluate the prognostic and therapeutic value of TIMP2 in cancer patients by using a series of databases and tools, including ONCOMINE, GEPIA, cBioPortal, GeneMANIA, Metascape, and the Sangerbox online tool. The expression of TIMP2 in different cancers was analyzed using the Oncomine, TCGA and GTEx databases, and the mutation status of TIMP2 in cancers was then verified using the cBioPortal database. The protein-protein interaction (PPI) network of the TIMP family was generated with GeneMANIA. The prognostic analysis of TIMP2 in cancers was performed through the GEPIA database and Cox regression. Additionally, the correlations between TIMP2 expression and immunity (immune cells, gene markers of immune cells, TMB, MSI, and neoantigen) were explored using the Sangerbox online tool. Results: The transcriptional level of TIMP2 was significantly elevated in most cancerous tissues. Survival analysis revealed that elevated expression of TIMP2 was associated with unfavorable survival outcomes in multiple cancers. Enrichment analysis suggested that TIMPs and their associated genes are mainly involved in pathways including extracellular matrix (ECM) regulators, degradation of ECM and ECM disassembly, and several other signaling pathways. Conclusions: Our findings systematically showed that TIMP2 is a potential prognostic marker in various cancers and that using a TIMP2 inhibitor may be an effective strategy for cancer therapy to improve poor cancer survival and prognostic accuracy, but the concrete mechanisms need to be validated by subsequent experiments.

