A Deep Learning Approach to the Screening of Oncogenic Gene Fusions in Humans

2019 ◽  
Vol 20 (7) ◽  
pp. 1645 ◽  
Author(s):  
Marta Lovino ◽  
Gianvito Urgese ◽  
Enrico Macii ◽  
Santa Di Cataldo ◽  
Elisa Ficarra

Gene fusions play an important role in the study of cancer development. Predicting the probability that a protein fusion transcript will lead to cancer is a challenging and not yet fully explored research problem. To date, all the available approaches in the literature try to explain the oncogenic potential of gene fusions through protein domain analysis, which is cancer-specific and not easy to adapt to newly available information. In our work, we take the raw protein sequences as the input and propose the use of deep learning, specifically Convolutional Neural Networks, to infer the oncogenicity probability score of gene fusion transcripts and to group them into categories (e.g., oncogenic/not oncogenic). This is an inherently flexible methodology that, unlike previous approaches, can be retrained with very little effort on newly available data (for example, from a different cancer). In experiments on a large dataset of pre-annotated gene fusions, our method predicts the oncogenic potential of gene fusion transcripts with an accuracy of about 72%, which increases to 86% if we consider only the instances that are classified with a high confidence level.
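The two ideas in this abstract, feeding raw protein sequences to a network and reporting accuracy only on confidently classified instances, can be illustrated with a minimal sketch. The abstract does not give the encoding scheme or the confidence criterion, so the one-hot encoding, the `max(p, 1 - p)` confidence measure, the function names and the toy numbers below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence: str, max_len: int) -> List[List[int]]:
    """Encode a protein sequence as a max_len x 20 binary matrix.

    Assumption: sequences are truncated or zero-padded to max_len, and
    unknown residues become all-zero rows.
    """
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, residue in enumerate(sequence[:max_len].upper()):
        idx = AA_INDEX.get(residue)
        if idx is not None:
            matrix[pos][idx] = 1
    return matrix


def accuracy_above_confidence(
    probs: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (accuracy, coverage) over predictions whose confidence,
    taken here as max(p, 1 - p), is at least `threshold`."""
    kept = [(p, y) for p, y in zip(probs, labels) if max(p, 1 - p) >= threshold]
    if not kept:
        return float("nan"), 0.0
    correct = sum((p >= 0.5) == bool(y) for p, y in kept)
    return correct / len(kept), len(kept) / len(probs)


# Toy predicted probabilities of "oncogenic" and true labels (made up).
toy_probs = [0.95, 0.60, 0.10, 0.55, 0.85]
toy_labels = [1, 0, 0, 1, 0]
all_acc, all_cov = accuracy_above_confidence(toy_probs, toy_labels, 0.5)
conf_acc, conf_cov = accuracy_above_confidence(toy_probs, toy_labels, 0.8)
```

Raising the threshold trades coverage for accuracy: fewer instances receive a prediction, but those that do are more often correct, which is the pattern the 72% versus 86% figures describe.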

2002 ◽  
Vol 20 (11) ◽  
pp. 2672-2679 ◽  
Author(s):  
Poul H.B. Sorensen ◽  
James C. Lynch ◽  
Stephen J. Qualman ◽  
Roberto Tirabosco ◽  
Jerian F. Lim ◽  
...  

PURPOSE: Alveolar rhabdomyosarcoma (ARMS) is an aggressive soft tissue malignancy of children and adolescents. Most ARMS patients express PAX3-FKHR or PAX7-FKHR gene fusions resulting from t(2;13) or t(1;13) translocations, respectively. We wished to confirm the diagnostic specificity of gene fusion detection in a large cohort of RMS patients and to evaluate whether these alterations influence clinical outcome in ARMS. PATIENTS AND METHODS: We determined PAX3-FKHR or PAX7-FKHR fusion status in 171 childhood rhabdomyosarcoma (RMS) patients entered onto the Intergroup Rhabdomyosarcoma Study IV, including 78 ARMS patients, using established reverse transcriptase polymerase chain reaction assays. All patients received central pathologic review and were treated using uniform protocols, allowing for meaningful outcome analysis. We examined the relationship between gene fusion status and clinical outcome in the ARMS cohort. RESULTS: PAX3-FKHR and PAX7-FKHR fusion transcripts were detected in 55% and 22% of ARMS patients, respectively; 23% were fusion-negative. All other RMS patients lacked transcripts, confirming the specificity of these alterations for ARMS. Fusion status was not associated with outcome differences in patients with locoregional ARMS. However, in patients presenting with metastatic disease, there was a striking difference in outcome between PAX7-FKHR and PAX3-FKHR patient groups (estimated 4-year overall survival rate of 75% for PAX7-FKHR v 8% for PAX3-FKHR; P = .0015). Multivariate analysis demonstrated a significantly increased risk of failure (P = .025) and death (P = .019) in patients with metastatic disease if their tumors expressed PAX3-FKHR. Among metastatic ARMS, bone marrow involvement was significantly higher in PAX3-FKHR–positive patients. 
CONCLUSION: Not only are PAX-FKHR fusion transcripts specific for ARMS, but expression of PAX3-FKHR and PAX7-FKHR identifies a very high-risk subgroup and a favorable outcome subgroup, respectively, among patients presenting with metastatic ARMS.


2020 ◽  
Vol 36 (10) ◽  
pp. 3248-3250
Author(s):  
Marta Lovino ◽  
Maria Serena Ciaburri ◽  
Gianvito Urgese ◽  
Santa Di Cataldo ◽  
Elisa Ficarra

Abstract Summary: In the last decade, increasing attention has been paid to the study of gene fusions. However, determining whether a gene fusion is a cancer driver or just a passenger mutation is still an open issue. Here we present DEEPrior, an inherently flexible deep learning tool with two modes (Inference and Retraining). Inference mode predicts the probability of a gene fusion being involved in an oncogenic process by directly exploiting the amino acid sequence of the fused protein. Retraining mode produces a custom prediction model that includes new data provided by the user. Availability and implementation: Both DEEPrior and the protein fusions dataset are freely available from GitHub at https://github.com/bioinformatics-polito/DEEPrior. The tool was designed to operate in Python 3.7, with minimal additional libraries. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Author(s):  
Silvia R. Vitale ◽  
Kirsten Ruigrok-Ritstier ◽  
A. Mieke Timmermans ◽  
Renée Foekens ◽  
Anita M.A.C. Trapman-Jansen ◽  
...  

Abstract Background: In breast cancer (BC), recurrent fusion genes of estrogen receptor alpha (ESR1) with AKAP12, ARMT1 and CCDC170 have been reported. In these gene fusions the ligand binding domain of ESR1 has been replaced by the transactivation domain of the fusion partner, constitutively activating the receptor. As a result, these gene fusions can drive tumor growth hormone-independently, as has been shown in preclinical models, but the clinical value of these fusions has not been reported. Here, we studied the prognostic and predictive value of different frequently reported ESR1 fusion transcripts in primary BC. Methods: We evaluated 732 patients with primary BC (131 ESR1-negative and 601 ESR1-positive cases), including two ER-positive BC patient cohorts: one cohort of 322 patients with advanced disease who received first-line endocrine therapy (ET) (predictive cohort), and a second cohort of 279 patients with lymph node negative disease (LNN) who received no adjuvant systemic treatment (prognostic cohort). Fusion gene transcript levels were measured by reverse transcriptase quantitative PCR. The presence of the different fusion transcripts was associated, in uni- and multivariable Cox regression analyses that included current clinicopathological characteristics, with progression-free survival (PFS) during first-line endocrine therapy in the predictive cohort, and with disease-free survival (DFS) and overall survival (OS) in the prognostic cohort. Results: The ESR1-CCDC170 fusion transcript was present in 27.6% of the ESR1-positive BC subjects and in 2.3% of the ESR1-negative cases. In the predictive cohort, none of the fusion transcripts were associated with response to first-line ET. In the prognostic cohort, the median DFS and OS were 37 and 93 months, respectively, for patients with an ESR1-CCDC170 exon 8 gene fusion transcript and 91 and 212 months, respectively, for patients without this fusion transcript. 
In a multivariable analysis, this ESR1-CCDC170 fusion transcript was an independent prognostic factor for DFS (hazard ratio (HR) 1.8, 95% confidence interval (CI) 1.2–2.8, P=0.005) and OS (HR 1.7, 95% CI 1.1–2.7, P=0.023). Conclusions: Our study shows that in primary BC only the ESR1-CCDC170 exon 8 gene fusion transcript carries prognostic value. None of the ESR1 fusion transcripts, which are considered to have constitutive ER activity, was predictive for outcome in advanced BC treated with endocrine therapy.
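As a rough plausibility check on the reported hazard ratios, a Wald-type p-value can be approximated from a hazard ratio and its 95% confidence interval by working on the log scale. The sketch below assumes a symmetric normal-approximation interval on log(HR), which is the usual form for Cox regression output but is not stated in the abstract. Because the published values are rounded to one decimal, only approximate agreement with the reported P values should be expected.

```python
import math


def wald_p_from_hr_ci(hr: float, lower: float, upper: float,
                      z_crit: float = 1.959964):
    """Approximate the standard error of log(HR), the z statistic and the
    two-sided p-value from a hazard ratio and its 95% CI.

    Assumes the CI was built as exp(log(HR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return se, z, p


# Rounded values quoted in the abstract.
se_dfs, z_dfs, p_dfs = wald_p_from_hr_ci(1.8, 1.2, 2.8)  # reported P=0.005
se_os, z_os, p_os = wald_p_from_hr_ci(1.7, 1.1, 2.7)     # reported P=0.023
```

Both approximations land in the same range as the published P values; the remaining gap is consistent with rounding of the HR and CI bounds.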


2020 ◽  
Author(s):  
Zijie Jin ◽  
Wenjian Huang ◽  
Ning Shen ◽  
Juan Li ◽  
Xiaochen Wang ◽  
...  

Abstract: Gene fusions are widespread in tumor cells and can play important roles in tumor initiation and progression. Using full-length single cell RNA sequencing (scRNA-seq), gene fusions can now be detected at the single cell level by analyzing chimeric reads in scRNA-seq data. However, scRNA-seq data has a high noise level and contains various technical artefacts. Direct application of fusion detection tools developed for bulk data can lead to spurious fusion discoveries and leave some true fusions undetected. In this paper, we present a computational tool, scFusion, for gene fusion detection based on scRNA-seq. scFusion is composed of a statistical model and a deep learning model, both of which are designed to control for potential false discoveries. The statistical model treats the background noise as zero-inflated negative binomial and uses a statistical testing procedure to control for false positives. The deep learning model is trained to recognize technical chimeric artefacts and filter out false fusion candidates generated by these artefacts. We compared scFusion with bulk fusion detection methods using simulation data created from real scRNA-seq data and found that scFusion had superior performance. Applied to a T cell dataset, scFusion successfully detected the invariant TCR gene recombinations in mucosal-associated invariant T cells that many bulk methods failed to detect. In a multiple myeloma dataset, scFusion detected the known recurrent fusion IgH-WHSC1, which was associated with overexpression of the WHSC1 oncogene. Significance: A critical challenge for fusion detection based on full-length single cell RNA sequencing (scRNA-seq) is to identify the needles, or the true fusions, in a large haystack of false positives. We developed a fusion detection tool, scFusion, for scRNA-seq. scFusion is computationally more efficient and has far fewer false discoveries while achieving similar detection power compared with fusion detection tools developed for bulk data. 
Application of scFusion to a multiple myeloma dataset identified subclones with the fusion IgH-WHSC1 and revealed that overexpression of the oncogene WHSC1 was strongly associated with the fusion. The models developed in this work may also be generalized to other single cell analyses such as structural variation detection and alternative splicing analysis.
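To make the zero-inflated negative binomial (ZINB) background model concrete, the following sketch evaluates a ZINB probability mass function and an upper-tail probability of the kind a testing procedure could use. The mean/size parametrization, the parameter names and the example values are assumptions chosen for illustration; the abstract does not specify how scFusion parametrizes or fits its model.

```python
import math


def nb_pmf(k: int, mu: float, size: float) -> float:
    """Negative binomial pmf with mean `mu` > 0 and dispersion `size` > 0."""
    log_p = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k: int, pi_zero: float, mu: float, size: float) -> float:
    """ZINB pmf: with probability `pi_zero` the count is a structural zero,
    otherwise it is drawn from the negative binomial."""
    base = (1.0 - pi_zero) * nb_pmf(k, mu, size)
    return pi_zero + base if k == 0 else base


def zinb_upper_tail(k_obs: int, pi_zero: float, mu: float, size: float) -> float:
    """P(X >= k_obs) under the ZINB background, usable as a p-value for an
    observed count of supporting chimeric reads."""
    below = sum(zinb_pmf(k, pi_zero, mu, size) for k in range(k_obs))
    return max(0.0, 1.0 - below)


# Hypothetical background parameters, for illustration only.
PI_ZERO, MU, SIZE = 0.3, 2.0, 1.5
p_three_reads = zinb_upper_tail(3, PI_ZERO, MU, SIZE)
p_ten_reads = zinb_upper_tail(10, PI_ZERO, MU, SIZE)
```

A candidate supported by many more reads than the background predicts gets a small tail probability; candidates whose support is typical of noise do not.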


Cancers ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 2013
Author(s):  
Edian F. Franco ◽  
Pratip Rana ◽  
Aline Cruz ◽  
Víctor V. Calderón ◽  
Vasco Azevedo ◽  
...  

A heterogeneous disease such as cancer is activated through multiple pathways and different perturbations. Depending upon the activated pathway(s), the survival of the patients varies significantly and shows different efficacy to various drugs. Therefore, cancer subtype detection using genomics level data is a significant research problem. Subtype detection is often a complex problem, and in most cases, needs multi-omics data fusion to achieve accurate subtyping. Different data fusion and subtyping approaches have been proposed over the years, such as kernel-based fusion, matrix factorization, and deep learning autoencoders. In this paper, we compared the performance of different deep learning autoencoders for cancer subtype detection. We performed cancer subtype detection on four different cancer types from The Cancer Genome Atlas (TCGA) datasets using four autoencoder implementations. We also predicted the optimal number of subtypes in a cancer type using the silhouette score and found that the detected subtypes exhibit significant differences in survival profiles. Furthermore, we compared the effect of feature selection and similarity measures for subtype detection. For further evaluation, we used the Glioblastoma multiforme (GBM) dataset and identified the differentially expressed genes in each of the subtypes. The results obtained are consistent with other genomic studies and can be corroborated with the involved pathways and biological functions. Thus, it shows that the results from the autoencoders, obtained through the interaction of different datatypes of cancer, can be used for the prediction and characterization of patient subgroups and survival profiles.
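The silhouette-based choice of the number of subtypes can be sketched as follows. The paper clusters patients in an autoencoder's latent space; here, as an explicit simplification, toy two-dimensional points and hand-written candidate labelings stand in for the learned embeddings and the clustering step. All names and data are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def silhouette_score(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Mean silhouette coefficient with Euclidean distance.

    For each point: a = mean distance to its own cluster, b = smallest mean
    distance to another cluster, s = (b - a) / max(a, b). Points in
    singleton clusters contribute 0.
    """
    clusters: Dict[int, List[int]] = defaultdict(list)
    for idx, lab in enumerate(labels):
        clusters[lab].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Toy embedding with two well-separated groups.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
candidate_labelings = {
    2: [0, 0, 0, 1, 1, 1],  # matches the two groups
    3: [0, 0, 1, 2, 2, 2],  # artificially splits the first group
}
scores = {k: silhouette_score(toy_points, lab) for k, lab in candidate_labelings.items()}
best_k = max(scores, key=scores.get)
```

The labeling with the highest mean silhouette is taken as the preferred number of clusters; in practice each candidate labeling would come from running the clustering algorithm with that number of clusters.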


2021 ◽  
Author(s):  
Simon Haefliger ◽  
Muriel Genevay ◽  
Michel Bihl ◽  
Romina Marone ◽  
Daniel Baumhoer ◽  
...  

Abstract: Myoepithelial neoplasms of soft tissue are rare tumors with clinical, morphological, immunohistochemical, and genetic heterogeneity. The morphological spectrum of these tumors is broad, and the diagnosis often requires immunostaining to confirm myoepithelial differentiation. Rarely, tumors show a morphology that is typical for myoepithelial neoplasms, while the immunophenotype fails to confirm myoepithelial differentiation. For such lesions, the term “myoepithelioma-like” tumor was introduced. Recently, two cases of myoepithelioma-like tumors of the hands and one case of the foot were described with previously unreported OGT-FOXO gene fusions. Here, we report a 50-year-old woman with a myoepithelioma-like tumor localized in the soft tissue of the forearm and carrying an OGT-FOXO1 fusion gene. Our findings extend the spectrum of mesenchymal tumors involving members of the FOXO family of transcription factors and point to the existence of a family of soft tissue tumors that carry gene fusions of the OGT-FOXO family.


1996 ◽  
Vol 92 (4) ◽  
pp. 866-871 ◽  
Author(s):  
Ingrid Simonitsch ◽  
Eva Renate Panzer‐Gruemayer ◽  
Daniel W. Ghali ◽  
Andreas Zoubek ◽  
Thaddäus Radaszkiewicz ◽  
...  

2021 ◽  
pp. jclinpath-2021-207825
Author(s):  
Umberto Malapelle ◽  
Francesco Pepe ◽  
Pasquale Pisapia ◽  
Annalisa Altimari ◽  
Claudio Bellevicine ◽  
...  

Aims: Gene fusion assays are key for personalised treatments of advanced human cancers. Their implementation on cytological material requires a preliminary validation that may make use of cell line slides mimicking cytological samples. In this international multi-institutional study, gene fusion reference standards were developed and validated. Methods: Cell lines harbouring EML4(13)–ALK(20) and SLC34A2(4)–ROS1(32) gene fusions were adopted to prepare reference standards. Eight laboratories (five adopting amplicon-based and three hybridisation-based platforms) received two sets of slides at different dilution points (slide A 50.0%, slide B 25.0%, slide C 12.5% and slide D wild type), stained by Papanicolaou (Pap) and May Grunwald Giemsa (MGG). Analysis was carried out on a total of 64 slides. Results: Four (50.0%) out of eight laboratories reported results on all slides and dilution points. While 12 (37.5%) out of 32 MGG slides were inadequate, 27 (84.4%) out of 32 Pap slides produced libraries adequate for variant calling. The laboratories using hybridisation-based platforms showed the highest rate of inadequate results (13/24 slides, 54.2%). Conversely, only 10.0% (4/40 slides) of inadequate results were reported by laboratories adopting amplicon-based platforms. Conclusions: Reference standards in cytological format yield better results when Pap stained and processed by amplicon-based assays. Further investigation is required to optimise these standards for MGG stained cells and for hybridisation-based approaches.
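The slide counts reported above can be cross-checked with a few lines of arithmetic. The sketch assumes that "inadequate" means the same thing in the per-stain and per-platform breakdowns, which the abstract implies but does not state; the variable names are illustrative.

```python
def percent(part: int, total: int) -> float:
    """Percentage rounded to one decimal, as reported in the abstract."""
    return round(100.0 * part / total, 1)


# Counts quoted in the abstract.
mgg_inadequate, mgg_total = 12, 32
pap_adequate, pap_total = 27, 32
hybrid_inadequate, hybrid_total = 13, 24
amplicon_inadequate, amplicon_total = 4, 40

# Inadequate slides tallied by stain and by platform should agree.
inadequate_by_stain = mgg_inadequate + (pap_total - pap_adequate)
inadequate_by_platform = hybrid_inadequate + amplicon_inadequate

rates = {
    "MGG inadequate": percent(mgg_inadequate, mgg_total),
    "Pap adequate": percent(pap_adequate, pap_total),
    "hybridisation inadequate": percent(hybrid_inadequate, hybrid_total),
    "amplicon inadequate": percent(amplicon_inadequate, amplicon_total),
}
```

Under that assumption both tallies give 17 inadequate slides out of 64, and the recomputed percentages match the published figures.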

