Classifying Breast Cancer Subtypes Using Deep Neural Networks Based on Multi-Omics Data

Genes ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 888
Author(s):  
Yuqi Lin ◽  
Wen Zhang ◽  
Huanshen Cao ◽  
Gaoyang Li ◽  
Wei Du

Given the high prevalence of breast cancer, there is an urgent need to identify the intrinsic differences among its subtypes in order to infer the underlying mechanisms. Proper integration of the available multi-omics data can improve the accuracy of breast cancer subtype recognition. In this study, DeepMO, a model using deep neural networks based on multi-omics data, was employed for classifying breast cancer subtypes. Three types of omics data, namely mRNA data, DNA methylation data, and copy number variation (CNV) data, were collected from The Cancer Genome Atlas (TCGA). After data preprocessing and feature selection, each type of omics data was input into the deep neural network, which consists of an encoding subnetwork and a classification subnetwork. On binary classification, DeepMO based on multi-omics data outperformed other methods in terms of accuracy and area under the curve (AUC). Moreover, compared with other methods using single omics data and multi-omics data, DeepMO also had a higher prediction accuracy on multi-classification. We also validated the effect of feature selection on DeepMO. Finally, we analyzed the enriched gene ontology (GO) terms and biological pathways of the significant genes discovered during the feature selection process. We believe that the proposed model is useful for multi-omics data analysis.
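The two-stage layout described above (an encoding subnetwork per omics type feeding a classification subnetwork) can be illustrated with a small forward pass. This is only a sketch under stated assumptions, not the DeepMO implementation: the abstract does not say how the encodings are combined, so concatenation is assumed here, and the layer sizes, random weights, and all function names are hypothetical.

```python
import math
import random

def dense(x, weights, bias):
    """Affine layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    m = max(x)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    """Small random weights; a real model would learn these by training."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(sample, encoders, classifier):
    # Encoding subnetwork: each omics type has its own encoder.
    encoded = []
    for name, (w, b) in encoders.items():
        encoded.extend(relu(dense(sample[name], w, b)))
    # Classification subnetwork on the concatenated encodings
    # (concatenation is an assumption of this sketch).
    w, b = classifier
    return softmax(dense(encoded, w, b))

rng = random.Random(0)
n_features = {"mrna": 6, "methylation": 5, "cnv": 4}  # toy sizes, not from the paper
hidden = 3
encoders = {k: init_layer(n, hidden, rng) for k, n in n_features.items()}
classifier = init_layer(hidden * len(n_features), 2, rng)  # binary classification
sample = {k: [rng.random() for _ in range(n)] for k, n in n_features.items()}
probs = forward(sample, encoders, classifier)
print(probs)
```

The output is a probability per class; with untrained weights the values carry no meaning beyond showing the data flow.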

Genes ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 200 ◽  
Author(s):  
Mingxin Tao ◽  
Tianci Song ◽  
Wei Du ◽  
Siyu Han ◽  
Chunman Zuo ◽  
...  

It is important to explore the intrinsic differences among breast cancer subtypes. These intrinsic differences are closely related to clinical diagnosis and the design of treatment plans. With the accumulation of biological and medical datasets, many different omics data are available that describe the disease from different aspects. Combining these multiple omics data can improve the accuracy of prediction. Meanwhile, many different databases are available from which different types of omics data can be downloaded. In this article, we use estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) to define breast cancer subtypes and classify any two breast cancer subtypes using the SMO-MKL algorithm. We collected mRNA data, methylation data and copy number variation (CNV) data from TCGA to classify breast cancer subtypes. Multiple Kernel Learning (MKL) is employed to use these omics data distinctly. The result of using three omics data with multiple kernels is better than that of using single omics data with multiple kernels. Furthermore, the significant genes and pathways discovered in the feature selection process are also analyzed. In experiments, the proposed method outperforms other state-of-the-art methods and has abundant biological interpretations.
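The core idea of multiple kernel learning, one kernel per omics type combined into a single kernel, can be sketched as follows. This is a minimal illustration and not the SMO-MKL algorithm itself: the RBF kernel choice, the toy data, and the fixed weights are all assumptions, whereas an MKL solver would learn the weights together with the classifier.

```python
import math

def rbf_kernel(rows, gamma):
    """Gram matrix of the RBF kernel over the samples in `rows`."""
    n = len(rows)
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(rows[i], rows[j])))
         for j in range(n)]
        for i in range(n)
    ]

def combine_kernels(kernels, weights):
    """Weighted sum of Gram matrices, one per omics type."""
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]

# Toy data: three samples, each described by three omics views (made-up values).
mrna = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]]
methylation = [[0.5], [0.4], [0.1]]
cnv = [[1.0, 0.0, -1.0], [1.0, 0.0, 0.0], [-1.0, 1.0, 0.0]]

kernels = [rbf_kernel(view, gamma=1.0) for view in (mrna, methylation, cnv)]
weights = [0.5, 0.3, 0.2]  # hypothetical fixed weights; MKL would learn these
combined = combine_kernels(kernels, weights)
print(combined)
```

The combined matrix can then be passed to any kernel classifier that accepts a precomputed kernel.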


2012 ◽  
Vol 30 (15_suppl) ◽  
pp. 1041-1041
Author(s):  
Joaquina Martínez-Galan ◽  
Sandra Rios ◽  
Juan Ramon Delgado ◽  
Blanca Torres-Torres ◽  
Jesus Lopez-Peñalver ◽  
...  

1041 Background: Identification of gene expression-based breast cancer subtypes is considered a critical means of prognostication. Genetic mutations along with epigenetic alterations contribute to gene-expression changes occurring in breast cancer. However, the reproducibility of differential DNA methylation discoveries for cancer and the relationship between DNA methylation and aberrant gene expression have not been systematically analysed. The present study was undertaken to dissect the breast cancer methylome and to deliver specific epigenotypes associated with particular breast cancer subtypes. Methods: Using real-time QMSPCR with SYBR Green, we analyzed DNA methylation in regulatory regions in 107 patients (pts) with breast cancer, and analyzed the association between prognostic factors in triple-negative breast cancer and promoter methylation of ESR1, APC, E-Cadherin, Rar B and 14-3-3 sigma. Results: We identified novel subtype-specific epigenotypes that clearly demonstrate the differences in the methylation profiles of basal-like and human epidermal growth factor 2 (HER2)-overexpressing tumors. Of the cases, 37 pts (40%) were Luminal A (LA), 32 pts (33%) Luminal B (LB), 14 pts (15%) triple-negative (TN), and 9 pts (10%) HER2+. DNA hypermethylation was highly inversely correlated with the down-regulation of gene expression. Methylation of this panel of promoters was found more frequently in the triple-negative and HER2 phenotypes. ESR1 methylation was preferentially associated with the TN (80%) and HER2+ (60%) subtypes. With a median follow-up of 6 years, we found worse overall survival (OS) with more frequent ESR1 gene methylation (p>0.05). Luminal A: OS at 5 years was 81% with ESR1 methylation vs 93% with unmethylated ESR1. Luminal B: OS at 5 years was 86% with ESR1 methylation vs 92% with unmethylated ESR1. Triple-negative: OS at 5 years was 75% with ESR1 methylation vs 80% with unmethylated ESR1. HER2: OS at 5 years was 66.7% with ESR1 methylation vs 75% with unmethylated ESR1. 
Conclusions: Our results provide evidence that well-defined DNA methylation profiles enable breast cancer subtype prediction and support the utilization of this biomarker for prognostication and therapeutic stratification of patients with breast cancer.


2016 ◽  
Vol 14 (05) ◽  
pp. 1644002 ◽  
Author(s):  
Jinwoo Park ◽  
Benjamin Hur ◽  
Sungmin Rhee ◽  
Sangsoo Lim ◽  
Min-Su Kim ◽  
...  

A breast cancer subtype classification scheme, PAM50, based on genetic information is widely accepted for clinical applications. On the other hand, experimental cancer biology studies have been successful in revealing the mechanisms of breast cancer, and the hallmarks of cancer have now been determined to explain the core mechanisms of tumorigenesis. Thus, it is important to understand how the breast cancer subtypes are related to the cancer core mechanisms, but studies have yet to address the hallmarks of breast cancer subtypes. Therefore, a new approach that can explain the differences among breast cancer subtypes in terms of cancer hallmarks is needed. We developed an information theoretic sub-network mining algorithm, differentially expressed sub-network and pathway analysis (DeSPA), that retrieves tumor-related genes by mining a gene regulatory network (GRN) of transcription factors and miRNAs. With extensive experiments on The Cancer Genome Atlas (TCGA) breast cancer sequencing data, we showed that our approach was able to select genes that belong to cancer core pathways such as DNA replication, cell cycle, and p53 pathways while keeping the accuracy of breast cancer subtype classification comparable to that of PAM50. In addition, our method produces a regulatory network of TFs, miRNAs, and their target genes that distinguishes breast cancer subtypes, which is confirmed by experimental studies in the literature.


2021 ◽  
Vol 23 (1) ◽  
Author(s):  
Nicole J. Chew ◽  
Terry C. C. Lim Kam Sian ◽  
Elizabeth V. Nguyen ◽  
Sung-Young Shin ◽  
Jessica Yang ◽  
...  

Background: Particular breast cancer subtypes pose a clinical challenge due to limited targeted therapeutic options and/or poor responses to the existing targeted therapies. While cell lines provide useful pre-clinical models, patient-derived xenografts (PDX) and organoids (PDO) provide significant advantages, including maintenance of genetic and phenotypic heterogeneity, 3D architecture and for PDX, tumor–stroma interactions. In this study, we applied an integrated multi-omic approach across panels of breast cancer PDXs and PDOs in order to identify candidate therapeutic targets, with a major focus on specific FGFRs. Methods: MS-based phosphoproteomics, RNAseq, WES and Western blotting were used to characterize aberrantly activated protein kinases and effects of specific FGFR inhibitors. PDX and PDO were treated with the selective tyrosine kinase inhibitors AZD4547 (FGFR1-3) and BLU9931 (FGFR4). FGFR4 expression in cancer tissue samples and PDOs was assessed by immunohistochemistry. METABRIC and TCGA datasets were interrogated to identify specific FGFR alterations and their association with breast cancer subtype and patient survival. Results: Phosphoproteomic profiling across 18 triple-negative breast cancers (TNBC) and 1 luminal B PDX revealed considerable heterogeneity in kinase activation, but 1/3 of PDX exhibited enhanced phosphorylation of FGFR1, FGFR2 or FGFR4. One TNBC PDX with high FGFR2 activation was exquisitely sensitive to AZD4547. Integrated ‘omic analysis revealed a novel FGFR2-SKI fusion that comprised the majority of FGFR2 joined to the C-terminal region of SKI containing the coiled-coil domains. High FGFR4 phosphorylation characterized a luminal B PDX model and treatment with BLU9931 significantly decreased tumor growth. 
Phosphoproteomic and transcriptomic analyses confirmed on-target action of the two anti-FGFR drugs and also revealed novel effects on the spliceosome, metabolism and extracellular matrix (AZD4547) and RIG-I-like and NOD-like receptor signaling (BLU9931). Interrogation of public datasets revealed FGFR2 amplification, fusion or mutation in TNBC and other breast cancer subtypes, while FGFR4 overexpression and amplification occurred in all breast cancer subtypes and were associated with poor prognosis. Characterization of a PDO panel identified a luminal A PDO with high FGFR4 expression that was sensitive to BLU9931 treatment, further highlighting FGFR4 as a potential therapeutic target. Conclusions: This work highlights how patient-derived models of human breast cancer provide powerful platforms for therapeutic target identification and analysis of drug action, and also the potential of specific FGFRs, including FGFR4, as targets for precision treatment.


2021 ◽  
pp. 1-14
Author(s):  
S. Raja Sree ◽  
A. Kunthavai

BACKGROUND: Breast cancer is a major disease and a cause of serious concern among women worldwide. Since gene mutations are the root cause of cancer development, analyzing gene expression can give more insight into the various cancer phenotypes and their treatment. Breast cancer subtype prediction from gene expression data can provide more information for cancer treatment decisions. OBJECTIVE: Gene expression data are difficult to analyze because of their high dimensionality. Machine learning algorithms such as k-Nearest Neighbors, Support Vector Machine (SVM) and Random Forest are used with feature selection for prediction of breast cancer subtypes. The prediction accuracy of the existing methods is affected by the high-dimensional nature of gene expression data. The objective of the work is to propose an efficient algorithm for the prediction of breast cancer subtypes from gene expression. METHODS: For subtype prediction, a novel Hubness Weighted Support Vector Machine algorithm (HWSVM) has been proposed, which uses the bad hubness score as a weight measure to handle outliers in the data. Based on the various subtypes, features are projected into seven different feature sets, and an Ensemble-based Hubness Aware Weighted Support Vector Machine (HWSVMEns) is implemented for breast cancer subtype prediction. RESULTS: The proposed algorithms have been compared with the classical SVM, with other traditional algorithms such as Random Forest and k-Nearest Neighbor, and with various gene selection methods. CONCLUSIONS: Experimental results show that the proposed HWSVM outperforms other algorithms in terms of accuracy, precision, recall and F1 score due to the hubness weightage scheme and the ensemble approach. The experiments have shown an average accuracy of 92% across various gene expression datasets.
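For readers unfamiliar with the term, a bad hubness score is commonly defined as the number of times a sample appears among the k nearest neighbors of samples carrying a different label. The sketch below computes that count on toy data. It assumes this common definition and squared Euclidean distance; the abstract does not state how HWSVM turns the score into a sample weight, so that step is deliberately left out, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bad_hubness(points, labels, k):
    """Count, for each sample, how often it is a k-nearest neighbor
    of a sample with a different label ("bad" occurrences)."""
    n = len(points)
    scores = [0] * n
    for i in range(n):
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sq_dist(points[i], points[j]),
        )[:k]
        for j in neighbors:
            if labels[j] != labels[i]:
                scores[j] += 1
    return scores

# Toy 1-D data: the last sample (label "B") sits inside the "A" cluster.
points = [[0.0], [1.0], [2.0], [10.0], [11.0], [1.6]]
labels = ["A", "A", "A", "B", "B", "B"]
scores = bad_hubness(points, labels, k=1)
print(scores)  # [0, 0, 1, 0, 0, 2]
```

The misplaced sample receives the highest score, which is the kind of point a hubness-based weighting scheme would be expected to down-weight.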


2020 ◽  
pp. 480-490 ◽  
Author(s):  
Zixiao Lu ◽  
Siwen Xu ◽  
Wei Shao ◽  
Yi Wu ◽  
Jie Zhang ◽  
...  

PURPOSE: Tumor-infiltrating lymphocytes (TILs) and their spatial characterizations on whole-slide images (WSIs) of histopathology sections have become crucial in diagnosis, prognosis, and treatment response prediction for different cancers. However, fully automatic assessment of TILs on WSIs currently remains a great challenge because of the heterogeneity and large size of WSIs. We present an automatic pipeline based on a cascade-training U-net to generate high-resolution TIL maps on WSIs. METHODS: We present global cell-level TIL maps and 43 quantitative TIL spatial image features for 1,000 WSIs of The Cancer Genome Atlas patients with breast cancer. For more specific analysis, all the patients were divided into three subtypes, namely, estrogen receptor (ER)–positive, ER-negative, and triple-negative groups. The associations between TIL scores and gene expression and somatic mutation were examined separately in three breast cancer subtypes. Both univariate and multivariate survival analyses were performed on 43 TIL image features to examine the prognostic value of TIL spatial patterns in different breast cancer subtypes. RESULTS: The TIL score was strongly associated with immune response pathways and genes (eg, programmed death-1 and CTLA4). In different breast cancer subtypes, the TIL score was associated with mutations in different genes, suggesting that different genetic alterations may lead to similar phenotypes. Spatial TIL features that represent density and distribution of TIL clusters were important indicators of the patient outcomes. CONCLUSION: Our pipeline can facilitate computational pathology-based discovery in cancer immunology and research on immunotherapy. Our analysis results are available for the research community to generate new hypotheses and insights on breast cancer immunology and development.


2019 ◽  
Vol 17 (6) ◽  
pp. 676-686 ◽  
Author(s):  
Mei-Chin Hsieh ◽  
Lu Zhang ◽  
Xiao-Cheng Wu ◽  
Mary B. Davidson ◽  
Michelle Loch ◽  
...  

Background: Breast cancer subtype is a key determinant in treatment decision-making, and also affects survival outcome. In this population-based study, in-depth analyses were performed to examine the impact that breast cancer subtype and receipt of guideline-concordant adjuvant systemic therapy (AST) have on survival using a population-based cancer registry’s data. Methods: Women aged ≥20 years with microscopically confirmed stage I–III breast cancer diagnosed in 2011 were identified from the Louisiana Tumor Registry. Breast cancer subtypes were categorized based on hormone receptor (HR) and HER2 status. Guideline-concordant treatment was defined using the NCCN Guidelines for Breast Cancer. Logistic regression was applied to identify factors associated with guideline-concordant AST receipt. Kaplan-Meier survival curves were generated to compare survival among subtypes by AST receipt status, and a semiparametric additive hazard model was used to verify the factors impacting survival outcome. Results: Of 2,214 eligible patients, most (70.8%) were HR+/HER2– followed by HR–/HER2– (14.4%), and 78.6% received guideline-concordant AST. Compared with patients with the HR+/HER2+ subtype, women with other subtypes were more likely to be guideline-concordant after adjusting for sociodemographic and clinical variables. Women with the HR–/HER2+ or HR–/HER2– subtype had a higher risk of any-cause and breast cancer–specific death than those with the HR+/HER2+ subtype. Those who did not receive AST had an additional adjusted hazard of 0.0191 (P=.0001) in overall survival and 0.0126 (P=.0011) in cause-specific survival compared with those who received AST. Conclusions: Most patients received guideline-concordant AST, except for those with the HR+/HER2+ subtype. Patients receiving guideline-adherent adjuvant therapy had better survival outcomes across all breast cancer subtypes.
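The Kaplan-Meier curves mentioned above come from the product-limit estimator: at each observed death time the running survival probability is multiplied by (1 - deaths / number at risk). A minimal sketch follows, using made-up follow-up times rather than registry data; the function name and the input layout are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit estimate.

    times:  follow-up time per patient
    events: 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients both leave the risk set
    return curve

# Hypothetical follow-up (years) for five patients; 0 marks censoring.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Comparing subtypes or treatment groups amounts to computing one such curve per group.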


2021 ◽  
Author(s):  
Surbhi Bansil ◽  
Anthony Silva ◽  
Corinne Jones ◽  
Elena Hidalgo ◽  
Ian Pagano ◽  
...  

Purpose: Differences in incidence of breast cancer subtypes among racial/ethnic groups have been evaluated as a contributing factor in the disparities seen in breast cancer prognosis. We evaluated new breast cancer cases in Hawaii to determine if there were subtype differences according to race/ethnicity that may contribute to known disparities. Methods: We reviewed 4,318 cases of women diagnosed with breast cancer from two large tumor registries between 2013-2020. We evaluated the new breast cancer cases according to age at diagnosis, self-reported race, and breast cancer subtype (ER, PR, and HER2 receptor status). Results: We found both premenopausal and postmenopausal Native Hawaiian women were less likely to be diagnosed with triple negative breast cancer (OR=0.33, P=0.009; OR=0.62, P=0.03 respectively). Premenopausal Japanese women were 71% less likely to be diagnosed with triple positive (ER+/PR+/HER2+) breast cancer (OR=0.29, P=0.0003). Postmenopausal Filipino women were 89% more likely to be diagnosed with ER-/PR-/HER2+ breast cancer (OR=1.89, P=0.02). Conclusions: Results of our study support that there are racial/ethnic differences in breast cancer subtypes among our population which may contribute to the differences in outcome seen. Further evaluation of other clinical and pathological features in each breast cancer subtype may inform potential mechanisms for outcome disparities seen among different racial/ethnic groups.
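The percentages quoted with the odds ratios above follow from simple arithmetic: an OR of 0.29 is a 71% reduction and an OR of 1.89 an 89% increase relative to 1. Strictly, these are changes in odds rather than in probability. The sketch below shows the conversion, together with an odds ratio computed from a 2x2 table whose counts are invented for illustration and are not from the study.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio from a 2x2 table: (a / b) / (c / d) = (a * d) / (b * c)."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)

def percent_change_in_odds(or_value):
    """Express an odds ratio as a percentage change relative to OR = 1."""
    return (or_value - 1.0) * 100.0

# Odds ratios reported in the abstract.
print(round(percent_change_in_odds(0.29)))  # -71, i.e. 71% lower odds
print(round(percent_change_in_odds(1.89)))  # 89, i.e. 89% higher odds

# Hypothetical counts, for illustration only.
example_or = odds_ratio(10, 90, 30, 70)
print(round(example_or, 3))
```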


2018 ◽  
Author(s):  
Diana Diaz ◽  
Aliccia Bollig-Fischer ◽  
Alexander Kotov

Objective: To investigate application of non-negative tensor decomposition for disease subtype discovery based on joint analysis of clinical and genomic data. Data and Methods: Somatic mutation profiles including 11,996 genes of 503 breast cancer patients from The Cancer Genome Atlas (TCGA) along with 11 clinical variables and markers of these patients were used to construct a binary third-order tensor. CANDECOMP/PARAFAC method was applied to decompose the constructed tensor into rank-one component tensors. Definitions of breast cancer verotypes were constructed from the patient, gene and clinical vectors corresponding to each component tensor. Patient membership proportions in the identified verotypes were utilized in a Cox proportional hazards model to predict their survival. Results: Qualitative evaluation of the verotypes obtained by tensor factorization indicates that they correspond to clinically meaningful breast cancer subtypes. While some components correspond to the known HER2- or ER-positive breast cancer subtypes, other components correspond to a variant of triple negative subtype and a cohort of patients with high mutation load of tumor suppressor genes. Quantitative evaluation indicates that the Cox model utilizing computationally discovered breast cancer verotypes is more accurate (AUC=0.5796) at predicting patient survival than the Cox models utilizing random patient membership proportions in cancer subtypes (AUC=0.4056) as well as patient membership proportions in genotypes (AUC=0.4731) and phenotypes (AUC=0.5047) obtained by non-negative factorization of the somatic mutation and clinical matrices. Conclusion: Non-negative factorization of a binary tensor constructed from clinical and genomic data enables high-throughput discovery of breast cancer verotypes that are effective at predicting patient survival.
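In a CANDECOMP/PARAFAC model, a third-order tensor is written as a sum of rank-one tensors, each the outer product of a patient vector, a gene vector and a clinical vector. The sketch below only shows that reconstruction step on toy factors; it does not fit a decomposition, and the factor values and the reading of each component as a "verotype" are illustrative assumptions.

```python
def cp_reconstruct(components):
    """Sum of rank-one tensors: entry (i, j, k) is the sum over
    components of a[i] * b[j] * c[k]."""
    a0, b0, c0 = components[0]
    tensor = [[[0.0 for _ in c0] for _ in b0] for _ in a0]
    for a, b, c in components:
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                for k, ck in enumerate(c):
                    tensor[i][j][k] += ai * bj * ck
    return tensor

# Two toy components over 3 patients, 2 genes and 2 clinical variables.
# Each tuple is (patient vector, gene vector, clinical vector).
components = [
    ([1.0, 1.0, 0.0], [1.0, 0.0], [1.0, 0.0]),  # patients 0 and 1, gene 0, marker 0
    ([0.0, 0.0, 1.0], [0.0, 1.0], [0.0, 1.0]),  # patient 2, gene 1, marker 1
]
tensor = cp_reconstruct(components)
print(tensor)
```

Because the two components have disjoint support, the reconstructed tensor is binary, which mirrors the kind of patient-by-gene-by-clinical tensor the abstract describes.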

