scholarly journals Scanning window analysis of non-coding regions within normal-tumor whole-genome sequence samples

Author(s):  
J P Torcivia ◽  
R Mazumder

Abstract Genomics has benefited from an explosion in affordable high-throughput technology for whole-genome sequencing. The regulatory and functional aspects in non-coding regions may be an important contributor to oncogenesis. Whole-genome tumor-normal paired alignments were used to examine the non-coding regions in five cancer types and two races. Both a sliding window and a binning strategy were introduced to uncover areas of higher than expected variation for additional study. We show that the majority of cancer associated mutations in 154 whole-genome sequences covering breast invasive carcinoma, colon adenocarcinoma, kidney renal papillary cell carcinoma, lung adenocarcinoma and uterine corpus endometrial carcinoma cancers and two races are found outside of the coding region (4 432 885 in non-gene regions versus 1 412 731 in gene regions). A pan-cancer analysis found significantly mutated windows (292 to 3881 in count) demonstrating that there are significant numbers of large mutated regions in the non-coding genome. The 59 significantly mutated windows were found in all studied races and cancers. These offer 16 regions ripe for additional study within 12 different chromosomes—2, 4, 5, 7, 10, 11, 16, 18, 20, 21 and X. Many of these regions were found in centromeric locations. The X chromosome had the largest set of universal windows that cluster almost exclusively in Xq11.1—an area linked to chromosomal instability and oncogenesis. Large consecutive clusters (super windows) were found (19 to 114 in count) providing further evidence that large mutated regions in the genome are influencing cancer development. We show remarkable similarity in highly mutated non-coding regions across both cancer and race.

2017 ◽  
Author(s):  
Ya Cui ◽  
Yiwei Niu ◽  
Xueyi Teng ◽  
Dan Wang ◽  
Huaxia Luo ◽  
...  

AbstractWhole genome sequencing technology has facilitated the discovery of a large number of somatic mutations in enhancers (SMEs), whereas the utility of SMEs in tumorigenesis has not been fully explored. Here we present Ennet, a method to comprehensively investigate SMEs enriched networks (SME-networks) in cancer by integrating SMEs, enhancer-gene interactions and gene-gene interactions. Using Ennet, we performed a pan-cancer analysis in 2004 samples from 8 cancer types and found many well-known cancer drivers were involved in the SME-networks, includingESR1,SMAD3,MYC,EGFR,BCL2andPAX5. Meanwhile, Ennet also identified many new networks with less characterization but have potentially important roles in cancer, including a large SME-network in medulloblastoma (MB), which contains genes enriched in the glutamate receptor and neural development pathways. Interestingly, SME-networks are specific across cancer types, and the vast majority of the genes identified by Ennet have few mutations in gene bodies. Collectively, our work suggests that using enhancer-only somatic mutations can be an effective way to discover potential cancer-driving networks. Ennet provides a new perspective to explore new mechanisms for tumor progression from SMEs.


2018 ◽  
Author(s):  
Viraj Deshpande ◽  
Jens Luebeck ◽  
Mehrdad Bakhtiari ◽  
Nam-Phuong D Nguyen ◽  
Kristen M Turner ◽  
...  

AbstractFocal oncogene amplification and rearrangements drive tumor growth and evolution in multiple cancer types. We developed a tool, AmpliconArchitect (AA), which can robustly reconstruct the fine structure of focally amplified regions using whole genome sequencing. AA-reconstructed amplicons in pan-cancer data and in virus-driven cervical cancer samples revealed many novel insights about focal amplifications. Specifically, the findings lend support to extrachromosomally mediated mechanisms for copy number expansion, and oncoviral pathogenesis.


2020 ◽  
Author(s):  
Eric Minwei Liu ◽  
Alexander Martinez-Fundichely ◽  
Rajesh Bollapragada ◽  
Maurice Spiewack ◽  
Ekta Khurana

ABSTRACTMost mutations in cancer genomes occur in the non-coding regions with unknown impact to tumor development. Although the increase in number of cancer whole-genome sequences has revealed numerous putative non-coding cancer drivers, their information is dispersed across multiple studies and thus it is difficult to bridge the understanding of non-coding alterations, the genes they impact and the supporting evidence for their role in tumorigenesis across multiple cancer types. To address this gap, we have developed CNCDatabase, Cornell Non-Coding Cancer driver Database (https://cncdatabase.med.cornell.edu/) that contains detailed information about predicted non-coding drivers at gene promoters, 5’ and 3’ UTRs (untranslated regions), enhancers, CTCF insulators and non-coding RNAs. CNCDatabase documents 1,111 protein-coding genes and 90 non-coding RNAs with reported drivers in their non-coding regions from 32 cancer types by computational predictions of positive selection in whole-genome sequences; differential gene expression in samples with and without mutations; or another set of experimental validations including luciferase reporter assays and genome editing. The database can be easily modified and scaled as lists of non-coding drivers are revised in the community with larger whole-genome sequencing studies, CRISPR screens and further experimental validations. Overall, CNCDatabase provides a helpful resource for researchers to explore the pathological role of non-coding alterations and their associations with gene expression in human cancers.


Cancers ◽  
2021 ◽  
Vol 13 (16) ◽  
pp. 4197
Author(s):  
Roni Rasnic ◽  
Michal Linial

During the past decade, whole-genome sequencing of tumor biopsies and individuals with congenital disorders highlighted the phenomenon of chromoanagenesis, a single chaotic event of chromosomal rearrangement. Chromoanagenesis was shown to be frequent in many types of cancers, to occur in early stages of cancer development, and significantly impact the tumor’s nature. However, an in-depth, cancer-type dependent analysis has been somewhat incomplete due to the shortage in whole genome sequencing of cancerous samples. In this study, we extracted data from The Pan-Cancer Analysis of Whole Genome (PCAWG) and The Cancer Genome Atlas (TCGA) to construct and test a machine learning algorithm that can detect chromoanagenesis with high accuracy (86%). The algorithm was applied to ~10,000 unlabeled TCGA cancer patients. We utilize the chromoanagenesis assignment results, to analyze cancer-type specific chromoanagenesis characteristics in 20 TCGA cancer types. Our results unveil prominent genes affected in either chromoanagenesis or non-chromoanagenesis tumorigenesis. The analysis reveals a mutual exclusivity relationship between the genes impaired in chromoanagenesis versus non-chromoanagenesis cases. We offer the discovered characteristics as possible targets for cancer diagnostic and therapeutic purposes.


Viruses ◽  
2021 ◽  
Vol 13 (10) ◽  
pp. 2104
Author(s):  
Tyler Doerksen ◽  
Andrea Lu ◽  
Lance Noll ◽  
Kelli Almes ◽  
Jianfa Bai ◽  
...  

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) descriptions of infection and transmission have been increasing in companion animals in the past year. Although canine susceptibility is generally considered low, their role in the COVID-19 disease cycle remains unknown. In this study, we detected and sequenced a delta variant (AY.3) from a 12-year-old Collie living with owners that previously tested positive for SARS-CoV-2. It is unclear if the dogs’ symptoms were related to SARS-CoV-2 infection or underlying conditions. The whole genome sequence obtained from the dog sample had several unique consensus level changes not previously identified in a SARS-CoV-2 genome that may play a role in the rapid adaptation from humans to dogs. Within the spike coding region, 5/7 of the subconsensus variants identified in the dog sequence were also identified in the closest in-house human reference case. Taken together, the whole genome sequence, and phylogenetic and subconsensus variant analyses indicate the virus infecting the animal originated from a local outbreak cluster. The results of these analyses emphasize the importance of rapid detection and characterization of SARS-CoV-2 variants of concern in companion animals.


2020 ◽  
Vol 49 (D1) ◽  
pp. D1094-D1101
Author(s):  
Eric Minwei Liu ◽  
Alexander Martinez-Fundichely ◽  
Rajesh Bollapragada ◽  
Maurice Spiewack ◽  
Ekta Khurana

Abstract Most mutations in cancer genomes occur in the non-coding regions with unknown impact on tumor development. Although the increase in the number of cancer whole-genome sequences has revealed numerous putative non-coding cancer drivers, their information is dispersed across multiple studies making it difficult to understand their roles in tumorigenesis of different cancer types. We have developed CNCDatabase, Cornell Non-coding Cancer driver Database (https://cncdatabase.med.cornell.edu/) that contains detailed information about predicted non-coding drivers at gene promoters, 5′ and 3′ UTRs (untranslated regions), enhancers, CTCF insulators and non-coding RNAs. CNCDatabase documents 1111 protein-coding genes and 90 non-coding RNAs with reported drivers in their non-coding regions from 32 cancer types by computational predictions of positive selection using whole-genome sequences; differential gene expression in samples with and without mutations; or another set of experimental validations including luciferase reporter assays and genome editing. The database can be easily modified and scaled as lists of non-coding drivers are revised in the community with larger whole-genome sequencing studies, CRISPR screens and further experimental validations. Overall, CNCDatabase provides a helpful resource for researchers to explore the pathological role of non-coding alterations in human cancers.


2021 ◽  
Author(s):  
Roni Rasnic ◽  
Michal Linial

During the past decade, whole-genome sequencing of tumor biopsies and individuals with congenital disorders highlighted the phenomenon of chromoanagenesis, a single chaotic event of chromosomal rearrangement. Chromoanagenesis was shown to be frequent in many types of cancers, to occur in early stages of cancer development, and significantly impact the tumors nature. However, an in-depth, cancer-type dependent analysis has been somewhat incomplete due to the shortage in whole genome sequencing of cancerous samples. In this study, we extracted data from The Pan-Cancer Analysis of Whole Genome (PCAWG) and The Cancer Genome Atlas (TCGA) to construct a machine learning algorithm that can detect chromoanagenesis with high accuracy (86%). The algorithm was applied to ~10,000 TCGA cancer patients. We utilize the chromoanagenesis assignment results, to analyze cancer-type specific chromoanagenesis characteristics in 20 TCGA cancer types. Our results unveil prominent genes affected in either chromoanagenesis or non-chromoanagenesis tumorigenesis. The analysis reveals a mutual exclusivity relationship between the genes impaired in chromoanagenesis versus non-chromoanagenesis cases. We offer the discovered characteristics as possible targets for cancer diagnostic and therapeutic purposes.


Sign in / Sign up

Export Citation Format

Share Document