A novel network controllability algorithm to target personalized driver genes for discovering combinational drugs of individual cancer patient

Mapping Intimacies ◽

10.1101/571620 ◽

2019 ◽

Author(s):

Wei-Feng Guo ◽

Shao-Wu Zhang ◽

Tao Zeng ◽

Luonan Chen

Keyword(s):

Cancer Patient ◽

Side Effect ◽

Target Genes ◽

Driver Gene ◽

Omics Data ◽

Driver Genes ◽

Data Set ◽

Cancer Data ◽

Personalized Risk ◽

Network Controllability

Abstract In precision cancer medicine, it is important to identify personalized combinational drugs that account for individual heterogeneity. Many bioinformatics tools for personalized driver gene identification have provided promising clues for determining candidate personalized drug targets for personalized drug discovery. However, how to bridge the gap between personalized driver gene identification and personalized combinational drug discovery has not been studied. In this work, we developed a novel structure network Controllability-based Personalized driver Genes and combinational Drug identification algorithm (CPGD), which aims to mine the personalized driver genes and identify the combinational drugs of an individual cancer patient. On two benchmark cancer datasets, CPGD outperforms other state-of-the-art driver gene-focused algorithms in terms of precision when predicting clinically efficacious combinational drugs. In particular, by quantifying and referring to the relationships between the target genes of pairwise combinatorial drugs and disease module genes on a breast cancer data set, CPGD can significantly divide patients into discriminative high-risk and low-risk groups for risk assessment in combination therapy. In addition, CPGD can further enhance cancer subtyping by providing computationally derived personalized side effect signatures for individual patients. Collectively, CPGD provides a new and efficient bioinformatics tool, from a structure network controllability perspective, for discovering personalized combinational drugs with personalized side effects taken into consideration, so as to effectively support personalized risk assessment and disease subtyping. Significance It is quite challenging to predict personalized combinational drugs, rather than patient-cohort drugs, based on cancer omics data. 
In this work, a novel structure network controllability-based algorithm (CPGD), built from a feedback vertex set control perspective, was developed for discovering efficacious combinational drugs for an individual cancer patient by targeting the personalized driver genes. CPGD makes three methodological advances by exploring more precise mathematical models on high-throughput personalized multi-omics data. First, a proper network structure is constructed to characterize the gene regulatory mechanism of an individual patient. Second, considering the weight information of network edges/relations improves the performance in predicting clinically efficacious combinational drugs compared with other driver-focused methods. Third, proper evaluation metrics for personalized combinational drug prioritization, personalized risk assessment and disease subtyping are designed for evaluating the performance of CPGD.
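
The feedback vertex set idea mentioned in this abstract can be illustrated with a small sketch. A feedback vertex set is a set of nodes whose removal leaves a directed graph without cycles. The code below is a simple greedy heuristic on an unweighted toy graph, not the CPGD algorithm itself: the function name `greedy_feedback_vertex_set`, the degree-product selection rule, and the example edges are all assumptions made for illustration, and the edge weights that CPGD is said to use are ignored here.

```python
from collections import defaultdict


def greedy_feedback_vertex_set(edges):
    """Return a set of nodes whose removal leaves the directed graph acyclic.

    Hypothetical greedy heuristic for illustration only (not CPGD).
    """
    succ = defaultdict(set)  # node -> nodes it points to
    pred = defaultdict(set)  # node -> nodes pointing to it
    nodes = set()
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
        nodes.update((u, v))

    def remove(node):
        # Detach the node from all of its neighbours, then drop it.
        for v in succ.pop(node, set()):
            pred[v].discard(node)
        for u in pred.pop(node, set()):
            succ[u].discard(node)
        nodes.discard(node)

    selected = set()
    while nodes:
        # A node with no incoming or no outgoing edge cannot lie on a cycle.
        prunable = [n for n in nodes if not succ[n] or not pred[n]]
        if prunable:
            for n in prunable:
                remove(n)
            continue
        # Every remaining node has both an in-edge and an out-edge, so a
        # cycle exists; pick the node with the largest in*out degree product.
        best = max(sorted(nodes), key=lambda n: len(succ[n]) * len(pred[n]))
        selected.add(best)
        remove(best)
    return selected


# Toy regulatory graph (made-up gene labels): A -> B -> C -> A, plus C -> D.
toy_edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(greedy_feedback_vertex_set(toy_edges))
```

The heuristic does not guarantee a minimum-size set; finding a minimum feedback vertex set is NP-hard in general, which is why practical methods rely on approximations of some kind.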


Identification of Deregulated Transcription Factors Involved in Specific Bladder Cancer Subtypes

10.29007/v7qj ◽

2020 ◽

Author(s):

Magali Champion ◽

Julien Chiquet ◽

Pierre Neuvial ◽

Mohamed Elati ◽

François Radvanyi ◽

...

Keyword(s):

Gene Expression ◽

Bladder Cancer ◽

Transcription Factor ◽

Transcription Factors ◽

Target Genes ◽

The Cancer Genome Atlas ◽

Reference Network ◽

Data Set ◽

Cancer Subtypes ◽

Cancer Data

Comparison between tumoral and healthy cells may reveal abnormal regulation behaviors between a transcription factor and the genes it regulates, without exhibiting differential expression of the former genes. We propose a methodology for the identification of transcription factors involved in the deregulation of genes in tumoral cells. This strategy is based on the inference of a reference gene regulatory network that connects transcription factors to their downstream targets using gene expression data. Gene expression levels in tumor samples are then carefully compared to this reference network to detect deregulated target genes. A linear model is finally used to measure the ability of each transcription factor to explain these deregulations. We assess the performance of our method by numerical experiments on a public bladder cancer data set derived from the Cancer Genome Atlas project. We identify genes known for their implication in the development of specific bladder cancer subtypes as well as new potential biomarkers.
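
The comparison step described here (tumor expression checked against a reference learned from healthy cells) can be sketched in a very reduced form. The example below is a toy stand-in, assuming a single transcription factor and a single target gene: it fits a one-variable least-squares line on healthy samples and scores tumor samples by how far they fall from that line. It is not the paper's network inference or its final linear model; the function names and the numbers are made up for illustration.

```python
from statistics import mean, stdev


def fit_reference(tf_healthy, target_healthy):
    """Least-squares line target ~ intercept + slope * tf on healthy samples."""
    mx, my = mean(tf_healthy), mean(target_healthy)
    sxx = sum((x - mx) ** 2 for x in tf_healthy)
    sxy = sum((x - mx) * (y - my) for x, y in zip(tf_healthy, target_healthy))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x)
                 for x, y in zip(tf_healthy, target_healthy)]
    # Sample standard deviation of the healthy residuals sets the scale.
    return slope, intercept, stdev(residuals)


def deregulation_scores(model, tf_tumor, target_tumor):
    """Standardized distance of each tumor sample from the healthy reference."""
    slope, intercept, scale = model
    return [(y - (intercept + slope * x)) / scale
            for x, y in zip(tf_tumor, target_tumor)]


# Made-up expression values: the target roughly doubles with the TF.
model = fit_reference([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.0])
scores = deregulation_scores(model, [3, 3], [6.0, 12.0])
print(scores)  # first sample follows the reference, second one does not
```

A large absolute score flags a target whose expression is not explained by its regulator, even when the regulator itself is not differentially expressed.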


Module analysis captures pancancer genetically and epigenetically deregulated cancer driver genes for smoking and antiviral response

10.1101/216754 ◽

2017 ◽

Author(s):

Magali Champion ◽

Kevin Brennan ◽

Tom Croonenborghs ◽

Andrew J. Gentles ◽

Nathalie Pochet ◽

...

Keyword(s):

Cell Line ◽

Target Genes ◽

Lung Cancer Cell Line ◽

Driver Gene ◽

List Type ◽

Multiple Sources ◽

Driver Genes ◽

Cancer Driver ◽

Main Challenge ◽

Cancer Driver Genes

Abstract The availability of increasing volumes of multi-omics profiles across many cancers promises to improve our understanding of the regulatory mechanisms underlying cancer. The main challenge is to integrate these multiple levels of omics profiles and especially to analyze them across many cancers. Here we present AMARETTO, an algorithm that addresses both challenges in three steps. First, AMARETTO identifies potential cancer driver genes through integration of copy number, DNA methylation and gene expression data. Then AMARETTO connects these driver genes with co-expressed target genes that they control, defined as regulatory modules. Thirdly, we connect AMARETTO modules identified from different cancer sites into a pancancer network to identify cancer driver genes. Here we applied AMARETTO in a pancancer study comprising eleven cancer sites and confirmed that AMARETTO captures hallmarks of cancer. We also demonstrated that AMARETTO enables the identification of novel pancancer driver genes. In particular, our analysis led to the identification of pancancer driver genes of smoking-induced cancers and ‘antiviral’ interferon-modulated innate immune response. Software availability AMARETTO is available as an R package at https://bitbucket.org/gevaertlab/pancanceramaretto Highlights We present an algorithm for pancancer identification of cancer driver genes based on multiomics data fusion. GPX2 is a novel driver gene in smoking induced cancers and validated using knockdown of GPX2 in the A549 cell line. OAS2 is a novel driver gene defining cancers with an antiviral signature supported by increased infiltration of tumor-associated macrophages. Research in context We present an algorithm that combines multiple sources of molecular data to identify novel genes that are involved in cancer development. We applied this algorithm on multiple cancers in a combined fashion and identified a network of pancancer driver genes. We highlighted two genes in detail: GPX2 and OAS2. 
We showed that GPX2 is an important cancer gene in smoking induced cancers, and validated our predictions using experimental data where GPX2 was inactivated in a lung cancer cell line. Similarly we showed that OAS2 is an important cancer driver gene in cancers that show an antiviral signature.
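
The first of the three steps (flagging candidate drivers by combining copy number, DNA methylation and gene expression) can be pictured with a minimal sketch. The rule below is an assumption chosen for illustration, not AMARETTO's actual criterion: a gene is kept when its expression correlates positively with its copy number or negatively with its methylation beyond a hypothetical `threshold`. The function names and the toy values are likewise invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def candidate_drivers(expression, copy_number, methylation, threshold=0.7):
    """Genes whose expression tracks copy number or opposes methylation."""
    drivers = []
    for gene, expr in expression.items():
        cn = copy_number.get(gene)
        meth = methylation.get(gene)
        cn_driven = cn is not None and pearson(expr, cn) >= threshold
        meth_driven = meth is not None and pearson(expr, meth) <= -threshold
        if cn_driven or meth_driven:
            drivers.append(gene)
    return sorted(drivers)


# Toy per-sample profiles for three placeholder genes.
expression = {"G1": [1, 2, 3, 4], "G2": [4, 3, 2, 1], "G3": [1, 3, 2, 4]}
copy_number = {"G1": [0, 1, 2, 3], "G3": [2, 2, 2, 2]}
methylation = {"G2": [0.1, 0.2, 0.3, 0.4], "G3": [0.3, 0.1, 0.4, 0.2]}
print(candidate_drivers(expression, copy_number, methylation))
```

In this toy input, G1 is amplified in step with its expression and G2 is silenced as methylation rises, while G3 shows neither pattern strongly enough to pass the threshold.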


Improving existing analysis pipeline to identify and analyze cancer driver genes using multi-omics data

Scientific Reports ◽

10.1038/s41598-020-77318-1 ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Quang-Huy Nguyen ◽

Duc-Hau Le

Keyword(s):

Drug Targets ◽

Expression Profiles ◽

Prognostic Index ◽

Enrichment Analysis ◽

Tumor Stage ◽

Clinical Feature ◽

Driver Gene ◽

Omics Data ◽

Analysis Pipeline ◽

Driver Genes

Abstract The accumulation of genes carrying mutations is vital for the establishment and development of cancer. However, research on driver gene discovery has selected and used analysis tools and models unsystematically and in isolation. Also, previous studies may have neglected low-frequency drivers and seldom predicted subgroup specificities of identified driver genes. In this study, we present an improved driver gene identification and analysis pipeline that comprises the four most widely used analyses for driver genes: enrichment analysis, association of clinical features with the expression profiles of identified driver genes as well as with their functional modules, and patient stratification, using existing advanced computational tools that integrate multi-omics data. The general usability of the improved pipeline was demonstrated for breast cancer and validated against several independent databases. Accordingly, 31 validated driver genes, including four novel ones, were discovered. Subsequently, we detected cancer-related significantly enriched gene ontology terms and pathways, probable drug targets, two co-expressed modules associated significantly with several clinical features, such as number of positive lymph nodes, Nottingham prognostic index, and tumor stage, and two biologically distinct groups of BRCA patients. Data and source code of the case study can be downloaded at https://github.com/hauldhut/drivergene.


Evaluation of integrative clustering methods for the analysis of multi-omics data

Briefings in Bioinformatics ◽

10.1093/bib/bbz015 ◽

2019 ◽

Vol 21 (2) ◽

pp. 541-552 ◽

Cited By ~ 9

Author(s):

Cécile Chauvel ◽

Alexei Novoloaca ◽

Pierre Veyre ◽

Frédéric Reynier ◽

Jérémie Becker

Keyword(s):

Matrix Factorization ◽

Large Scale ◽

The Cancer Genome Atlas ◽

Added Value ◽

Joint Analysis ◽

Omics Data ◽

Clustering Methods ◽

Data Set ◽

Cancer Data ◽

Opposite Behavior

Abstract Recent advances in sequencing, mass spectrometry and cytometry technologies have enabled researchers to collect large-scale omics data from the same set of biological samples. The joint analysis of multiple omics offers the opportunity to uncover coordinated cellular processes acting across different omic layers. In this work, we present a thorough comparison of a selection of recent integrative clustering approaches, including Bayesian (BCC and MDI) and matrix factorization approaches (iCluster, moCluster, JIVE and iNMF). Based on simulations, the methods were evaluated on their sensitivity and their ability to recover both the correct number of clusters and the simulated clustering at the common and data-specific levels. Standard non-integrative approaches were also included to quantify the added value of integrative methods. For most matrix factorization methods and one Bayesian approach (BCC), the shared and specific structures were successfully recovered with high and moderate accuracy, respectively. An opposite behavior was observed on non-integrative approaches, i.e. high performances on specific structures only. Finally, we applied the methods on the Cancer Genome Atlas breast cancer data set to check whether results based on experimental data were consistent with those obtained in the simulations.
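
Evaluating whether a method recovers "the simulated clustering" requires a score that compares two partitions regardless of how clusters are labelled. The abstract does not name the agreement measure it uses, so the adjusted Rand index below is an assumption: it is one common choice for this kind of comparison. The sketch uses only the standard library, and the function name is illustrative.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, predicted):
    """Chance-corrected agreement between two partitions of the same items."""
    if len(truth) != len(predicted):
        raise ValueError("label vectors must have the same length")
    n = len(truth)
    if n < 2:
        raise ValueError("need at least two items")
    # Pairs of items placed together in both, in truth, and in predicted.
    together_both = sum(comb(c, 2) for c in Counter(zip(truth, predicted)).values())
    together_truth = sum(comb(c, 2) for c in Counter(truth).values())
    together_pred = sum(comb(c, 2) for c in Counter(predicted).values())
    expected = together_truth * together_pred / comb(n, 2)
    maximum = (together_truth + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are one cluster
    return (together_both - expected) / (maximum - expected)


# Same grouping with swapped labels scores 1; a crossed grouping scores below 0.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

A value near 1 means the simulated structure was recovered, a value near 0 means agreement is no better than chance.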


An Efficient and Easy-to-Use Network-Based Integrative Method of Multi-Omics Data for Cancer Genes Discovery

Frontiers in Genetics ◽

10.3389/fgene.2020.613033 ◽

2021 ◽

Vol 11 ◽

Author(s):

Ting Wei ◽

Botao Fa ◽

Chengwen Luo ◽

Luke Johnston ◽

Yue Zhang ◽

...

Keyword(s):

Area Under The Curve ◽

Interaction Network ◽

Population Level ◽

R Package ◽

Marker Genes ◽

Driver Gene ◽

Omics Data ◽

Cancer Genes ◽

Driver Genes ◽

Different Types

Identifying personalized driver genes is essential for discovering critical biomarkers and developing effective personalized therapies for cancers. However, few methods consider weights for different types of mutations and efficiently distinguish driver genes from a larger number of passenger genes. We propose MinNetRank (Minimum used for Network-based Ranking), a new method for prioritizing cancer genes that sets weights for different types of mutations, considers the incoming and outgoing degrees of the interaction network simultaneously, and uses a minimum strategy to integrate multi-omics data. MinNetRank prioritizes cancer genes among multi-omics data for each sample. The sample-specific rankings of genes are then integrated into a population-level ranking. When evaluating the accuracy and robustness of prioritizing driver genes, our method almost always significantly outperforms other methods in terms of precision, F1 score, and partial area under the curve (AUC) on six cancer datasets. Importantly, MinNetRank is efficient in discovering novel driver genes. SP1 is selected as a candidate driver gene only by our method (ranked top three), and SP1 RNA and protein differential expression between tumor and normal samples are statistically significant in liver hepatocellular carcinoma. The top seven genes stratify patients into two subtypes exhibiting statistically significant survival differences in five cancer types. These top seven genes are associated with overall survival, as illustrated by previous researchers. MinNetRank can be very useful for identifying cancer driver genes, and these biologically relevant marker genes are associated with clinical outcome. The R package of MinNetRank is available at https://github.com/weitinging/MinNetRank.
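
The two ideas named in this abstract, a minimum strategy for combining omics layers and the merging of sample-specific rankings into a population-level ranking, can be pictured with a short sketch. This is not the MinNetRank implementation: the network-propagation step is omitted, the per-layer scores are assumed to be already computed, summing scores across samples is just one plausible way to aggregate, and all names and numbers are placeholders.

```python
def sample_scores(mutation_scores, expression_scores):
    """Minimum strategy: a gene scores high only if both layers agree."""
    shared = mutation_scores.keys() & expression_scores.keys()
    return {g: min(mutation_scores[g], expression_scores[g]) for g in shared}


def population_ranking(per_sample_scores):
    """Sum each gene's per-sample score and rank genes from high to low."""
    totals = {}
    for scores in per_sample_scores:
        for gene, score in scores.items():
            totals[gene] = totals.get(gene, 0.0) + score
    return sorted(totals, key=lambda g: (-totals[g], g))


# Two hypothetical patients with made-up per-layer scores in [0, 1].
patient_1 = sample_scores({"geneA": 0.9, "geneB": 0.6, "geneC": 0.8},
                          {"geneA": 0.7, "geneB": 0.5, "geneC": 0.1})
patient_2 = sample_scores({"geneA": 0.2, "geneB": 0.9},
                          {"geneA": 0.4, "geneB": 0.8})
print(population_ranking([patient_1, patient_2]))
```

Taking the minimum penalizes genes that look important in only one data type, such as `geneC` above, which has a high mutation score but a low expression score.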


OmicsOne: associate omics data with phenotypes in one-click

Clinical Proteomics ◽

10.1186/s12014-021-09334-w ◽

2021 ◽

Vol 18 (1) ◽

Author(s):

Hui Zhang ◽

Minghui Ao ◽

Arianna Boja ◽

Michael Schnaubelt ◽

Yingwei Hu

Keyword(s):

Quality Control ◽

Data Analysis ◽

Association Analysis ◽

Lung Squamous Cell Carcinoma ◽

Data Sets ◽

Omics Data ◽

Data Set ◽

Cancer Data ◽

Public Data ◽

Potential Biomarkers

Abstract Background The rapid advancements of high throughput “omics” technologies have brought a massive amount of data to process during and after experiments. Multi-omic analysis facilitates a deeper interrogation of a dataset and the discovery of interesting genes, proteins, lipids, glycans, metabolites, or pathways related to the corresponding phenotypes in a study. Many individual software tools have been developed for data analysis and visualization. However, an efficient way to investigate phenotypes with multiple omics data is still lacking. Here, we present OmicsOne as an interactive web-based framework for rapid phenotype association analysis of multi-omic data by integrating quality control, statistical analysis, and interactive data visualization in ‘one click’. Materials and methods OmicsOne was applied to the previously published proteomic and glycoproteomic data sets of high-grade serous ovarian carcinoma (HGSOC) and the published proteome data set of lung squamous cell carcinoma (LSCC) to confirm its performance. The data were analyzed through six main functional modules implemented in OmicsOne: (1) phenotype profiling, (2) data preprocessing and quality control, (3) knowledge annotation, (4) phenotype associated features discovery, (5) correlation and regression model analysis for phenotype association analysis on individual features, and (6) enrichment analysis for phenotype association analysis on feature sets of interest. Results We developed an integrated software solution, OmicsOne, for the phenotype association analysis on multi-omics data sets. The application of OmicsOne to the public ovarian cancer data set showed that the software could confirm the previous observations consistently and discover new evidence for HNRNPU and a glycopeptide of HYOU1 as potential biomarkers for HGSOC data sets. The performance of OmicsOne was further demonstrated in the Tumor and NAT comparison study on the proteome data set of LSCC. 
Conclusions OmicsOne can effectively simplify data analysis and reveal the significant associations between phenotypes and potential biomarkers, including genes, proteins, and glycopeptides, in minutes to assist users to understand aberrant biological processes.


Identification of deregulation mechanisms specific to cancer subtypes

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720021400035 ◽

2021 ◽

Vol 19 (01) ◽

pp. 2140003

Author(s):

Magali Champion ◽

Julien Chiquet ◽

Pierre Neuvial ◽

Mohamed Elati ◽

François Radvanyi ◽

...

Keyword(s):

Gene Expression ◽

Target Genes ◽

The Cancer Genome Atlas ◽

Expression Data ◽

Data Set ◽

Cancer Subtypes ◽

Cancer Data ◽

Cancer Genome Atlas ◽

Gene Regulatory ◽

Gene Expression Levels

In many cancers, mechanisms of gene regulation can be severely altered. Identification of deregulated genes, which do not follow the regulation processes that exist between transcription factors and their target genes, is of importance to better understand the development of the disease. We propose a methodology to detect deregulation mechanisms with a particular focus on cancer subtypes. This strategy is based on the comparison between tumoral and healthy cells. First, we use gene expression data from healthy cells to infer a reference gene regulatory network. Then, we compare it with gene expression levels in tumor samples to detect deregulated target genes. We finally measure the ability of each transcription factor to explain these deregulations. We apply our method on a public bladder cancer data set derived from The Cancer Genome Atlas project and confirm that it captures hallmarks of cancer subtypes. We also show that it enables the discovery of new potential biomarkers.
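
The last step, measuring how well each transcription factor explains the observed deregulations, can be approximated in a very simple way once a reference network and a list of deregulated targets are available. The sketch below uses the fraction of a factor's targets that are deregulated as a stand-in score. This counting rule is an assumption for illustration and is not the measure used in the paper; the network, gene labels and function name are invented.

```python
def rank_transcription_factors(network, deregulated):
    """Rank TFs by the share of their targets found in the deregulated set.

    network: dict mapping each TF to an iterable of its target genes.
    deregulated: set of target genes flagged as deregulated in tumors.
    """
    scores = {}
    for tf, targets in network.items():
        targets = set(targets)
        if targets:  # TFs without targets cannot be scored
            scores[tf] = len(targets & deregulated) / len(targets)
    # Highest share first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy reference network and deregulated targets (placeholder names).
reference_network = {
    "TF1": ["g1", "g2", "g3", "g4"],
    "TF2": ["g3", "g5"],
    "TF3": [],
}
print(rank_transcription_factors(reference_network, {"g3", "g5"}))
```

Here `TF2` ranks first because both of its targets are deregulated, whereas only one of the four targets of `TF1` is.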


A Case of Malignant Lymphoma that Healed Completely after Oral Administrations of 4-Hydroxybenzaldehyde

Journal of Oncology Research ◽

10.31829/2637-6148/jor2018-1(1)-104 ◽

2018 ◽

Vol 1 (1) ◽

pp. 01-02

Keyword(s):

Cancer Patient ◽

Malignant Lymphoma ◽

Large Dose ◽

Side Effect ◽

Small Dose ◽

Severe Hemorrhage ◽

Tumor Agent

In 1969, Mutsuyuki Kochi [1, 2] developed 4-Hydroxybenzaldehyde for use as a novel anti-tumor agent without side effects and patented it. Accordingly, this medicine is capable of preventing carcinogenesis when used in sufficient quantity. To treat advanced cancers, an oncologist should start by giving the cancer patient a small dose of 4-Hydroxybenzaldehyde to avoid the possible severe hemorrhage of a tumor caused by excessive necrosis. Therefore, it has useful applications in treating lymphomas and leukemias. Consequently, those who have these diseases can receive a considerably large dose of the medicine.


driveR: a novel method for prioritizing cancer driver genes using somatic genomics data

BMC Bioinformatics ◽

10.1186/s12859-021-04203-7 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Ege Ülgen ◽

O. Uğur Sezerman

Keyword(s):

Biological Knowledge ◽

Driver Gene ◽

Driver Genes ◽

Cancer Driver ◽

Prior Biological Knowledge ◽

Wilcoxon Rank Sum Test ◽

Cancer Genomes ◽

Novel Method ◽

Cancer Driver Genes ◽

Batch Analysis

Abstract Background Cancer develops due to “driver” alterations. Numerous approaches exist for predicting cancer drivers from cohort-scale genomics data. However, methods for personalized analysis of driver genes are underdeveloped. In this study, we developed a novel personalized/batch analysis approach for driver gene prioritization utilizing somatic genomics data, called driveR. Results Combining genomics information and prior biological knowledge, driveR accurately prioritizes cancer driver genes via a multi-task learning model. Testing on 28 different datasets, this study demonstrates that driveR performs adequately, achieving a median AUC of 0.684 (range 0.651–0.861) on the 28 batch analysis test datasets, and a median AUC of 0.773 (range 0–1) on the 5157 personalized analysis test samples. Moreover, it outperforms existing approaches, achieving a significantly higher median AUC than all of MutSigCV (Wilcoxon rank-sum test p < 0.001), DriverNet (p < 0.001), OncodriveFML (p < 0.001) and MutPanning (p < 0.001) on batch analysis test datasets, and a significantly higher median AUC than DawnRank (p < 0.001) and PRODIGY (p < 0.001) on personalized analysis datasets. Conclusions This study demonstrates that the proposed method is an accurate and easy-to-utilize approach for prioritizing driver genes in cancer genomes in personalized or batch analyses. driveR is available on CRAN: https://cran.r-project.org/package=driveR.
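
The AUC figures quoted here summarize how often a known driver gene receives a higher score than a non-driver. The sketch below computes that quantity directly from pairwise comparisons, which is equivalent to the normalized Mann-Whitney U statistic, and then takes a median over several test sets as the abstract does. It is a generic illustration rather than driveR's evaluation code, and the scores and labels are made up.

```python
from statistics import median


def auc_from_scores(scores, labels):
    """Probability that a positive outranks a negative (ties count half)."""
    positives = [s for s, is_driver in zip(scores, labels) if is_driver]
    negatives = [s for s, is_driver in zip(scores, labels) if not is_driver]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Three hypothetical test sets: (driverness scores, 1 = known driver).
test_sets = [
    ([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]),
    ([0.9, 0.7, 0.4, 0.1], [1, 1, 0, 0]),
    ([0.5, 0.5], [1, 0]),
]
aucs = [auc_from_scores(scores, labels) for scores, labels in test_sets]
print(aucs, median(aucs))
```

The quadratic pairwise loop is fine for small gene lists; a rank-based formula gives the same value more efficiently on large ones.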


ZRSR2 overexpression is a frequent and early event in castration-resistant prostate cancer development

Prostate Cancer and Prostatic Diseases ◽

10.1038/s41391-021-00322-7 ◽

2021 ◽

Author(s):

Haiqing He ◽

Jun Hao ◽

Xin Dong ◽

Yu Wang ◽

Hui Xue ◽

...

Keyword(s):

Prostate Cancer ◽

Cell Cycle Progression ◽

Locally Advanced ◽

Patient Treatment ◽

Early Event ◽

Driver Gene ◽

Castration Resistant Prostate Cancer ◽

Driver Genes ◽

Castration Resistant ◽

Pdx Models

Abstract Background Androgen deprivation therapy (ADT) remains the leading systemic therapy for locally advanced and metastatic prostate cancers (PCa). While a majority of PCa patients initially respond to ADT, the durability of response is variable and most patients will eventually develop incurable castration-resistant prostate cancer (CRPC). Our research objective is to identify potential early driver genes responsible for CRPC development. Methods We have developed a unique panel of hormone-naïve PCa (HNPC) patient-derived xenograft (PDX) models at the Living Tumor Laboratory. The PDXs provide a unique platform for driver gene discovery as they allow for the analysis of differentially expressed genes via transcriptomic profiling at various time points after mouse host castration. In the present study, we focused on genes with expression changes shortly after castration but before CRPC has fully developed. These are likely to be potential early drivers of CRPC development. Such genes were further validated for their clinical relevance using data from PCa patient databases. ZRSR2 was identified as a top gene candidate and selected for further functional studies. Results ZRSR2 is significantly upregulated in our PDX models during the early phases of CRPC development after mouse host castration and remains consistently high in fully developed CRPC PDX models. Moreover, high ZRSR2 expression is also observed in clinical CRPC samples. Importantly, elevated ZRSR2 in PCa samples is correlated with poor patient treatment outcomes. ZRSR2 knockdown reduced PCa cell proliferation and delayed cell cycle progression at least partially through inhibition of the Cyclin D1 (CCND1) pathway. Conclusion Using our unique HNPC PDX models that develop into CRPC after host castration, we identified ZRSR2 as a potential early driver of CRPC development.
