Passenger Mutations in More Than 2,500 Cancer Genomes: Overall Molecular Functional Impact and Consequences

Sushant Kumar; Jonathan Warrell; Shantao Li; Patrick D. McGillivray; William Meyerson; Leonidas Salichos; Arif Harmanci; Alexander Martinez-Fundichely; Calvin W.Y. Chan; Morten Muhlig Nielsen; Lucas Lochovsky; Yan Zhang; Xiaotong Li; Shaoke Lou; Jakob Skou Pedersen; Carl Herrmann; Gad Getz; Ekta Khurana; Mark B. Gerstein

doi:10.1016/j.cell.2020.01.032

Passenger mutations in 2500 cancer genomes: Overall molecular functional impact and consequences

10.1101/280446 ◽

2018 ◽

Cited By ~ 5

Author(s):

Sushant Kumar ◽

Jonathan Warrell ◽

Shantao Li ◽

Patrick D. McGillivray ◽

William Meyerson ◽

...

Keyword(s):

Cancer Progression ◽

Complex Trait ◽

Driver Mutations ◽

Additive Variance ◽

Functional Impact ◽

Cancer Genomes ◽

Classical Models ◽

Cancer Phenotypes ◽

Passenger Mutations ◽

The Impact

Abstract: The Pan-cancer Analysis of Whole Genomes (PCAWG) project provides an unprecedented opportunity to comprehensively characterize a vast set of uniformly annotated coding and non-coding mutations present in thousands of cancer genomes. Classical models of cancer progression posit that only a small number of these mutations strongly drive tumor progression and that the remaining ones (termed “putative passengers”) are inconsequential for tumorigenesis. In this study, we leveraged the comprehensive variant data from PCAWG to ascertain the molecular functional impact of each variant. The impact distribution of PCAWG mutations shows that, in addition to high- and low-impact mutations, there is a group of medium-impact putative passengers predicted to influence gene activity. Moreover, the predicted impact relates to the underlying mutational signature: different signatures confer divergent impact, differentially affecting distinct regulatory subsystems and gene categories. We also find that impact varies based on subclonal architecture (i.e., early vs. late mutations) and can be related to patient survival. Finally, we note that insufficient power due to limited cohort sizes precludes identification of weak drivers using standard recurrence-based approaches. To address this, we adapted an additive effects model derived from complex trait studies to show that aggregating the impact of putative passenger variants (i.e., including yet undetected weak drivers) provides significant predictability for cancer phenotypes beyond the PCAWG-identified driver mutations (12.5% additive variance). Furthermore, this framework allowed us to estimate the frequency of potential weak driver mutations in the subset of PCAWG samples lacking well-characterized driver alterations.


Distribution and functional impact of short tandem duplications in cancer genomes

10.5353/th_991044101378803414 ◽

2018 ◽

Author(s):

Kwok-wai Ng

Keyword(s):

Functional Impact ◽

Cancer Genomes


DriverPower: Combined burden and functional impact tests for cancer driver discovery

10.1101/215244 ◽

2017 ◽

Cited By ~ 4

Author(s):

Shimin Shuai ◽

Steven Gallinger ◽

Lincoln Stein ◽

Keyword(s):

Driver Mutations ◽

Functional Interpretation ◽

Functional Impact ◽

Genomic Features ◽

Cancer Driver ◽

Mutational Burden ◽

Mutation Model ◽

Whole Genomes ◽

Cancer Genomes ◽

Pan Cancer

Abstract: We describe DriverPower, a software package that uses mutational burden and functional impact evidence to identify cancer driver mutations in coding and non-coding sites within cancer whole genomes. Using a total of 1,373 genomic features derived from public sources, DriverPower’s background mutation model explains up to 93% of the regional variance in the mutation rate across a variety of tumour types. By incorporating functional impact scores, we are able to further increase the accuracy of driver discovery. Testing across a collection of 2,583 cancer genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) project, DriverPower identifies 217 coding and 95 non-coding driver candidates. Compared with six published methods used by the PCAWG Drivers and Functional Interpretation Group, DriverPower has the highest F1-score for both coding and non-coding driver discovery. This demonstrates that DriverPower is an effective framework for computational driver discovery.
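The F1-score used in this comparison is the harmonic mean of precision and recall. The following is a minimal sketch of how such a score could be computed for a set of driver calls against a reference set; the function name `f1_score` and the placeholder element names are assumptions for illustration and are not taken from DriverPower or from the PCAWG benchmark.

```python
def f1_score(predicted, truth):
    """F1 of a predicted set of driver candidates against a reference set."""
    predicted, truth = set(predicted), set(truth)
    true_pos = len(predicted & truth)
    if true_pos == 0:
        # No overlap (or an empty set): precision and recall are both zero.
        return 0.0
    precision = true_pos / len(predicted)  # fraction of calls that are correct
    recall = true_pos / len(truth)         # fraction of reference recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy sets; the names are placeholders, not real results.
calls = {"ELEMENT_A", "ELEMENT_B", "ELEMENT_X"}
reference = {"ELEMENT_A", "ELEMENT_B", "ELEMENT_C", "ELEMENT_D"}
score = f1_score(calls, reference)
```

In this toy case two of three calls are correct (precision 2/3) and two of four reference elements are recovered (recall 1/2), so a method is rewarded only when it balances both.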


Sequence Neighborhoods Enable Reliable Prediction of Pathogenic Mutations in Cancer Genomes

Cancers ◽

10.3390/cancers13102366 ◽

2021 ◽

Vol 13 (10) ◽

pp. 2366

Author(s):

Shayantan Banerjee ◽

Karthik Raman ◽

Balaraman Ravindran

Keyword(s):

Cancer Progression ◽

Nucleotide Sequences ◽

Feature Representation ◽

Estimation Methods ◽

Driver Mutations ◽

Cancer Mutation ◽

Evolutionary Features ◽

Cancer Genomes ◽

Passenger Mutations ◽

Pan Cancer

Identifying cancer-causing mutations from sequenced cancer genomes holds much promise for targeted therapy and precision medicine. “Driver” mutations are primarily responsible for cancer progression, while “passengers” are functionally neutral. Although several computational approaches have been developed for distinguishing between driver and passenger mutations, very few have concentrated on using the raw nucleotide sequences surrounding a particular mutation as potential features for building predictive models. Using experimentally validated cancer mutation data in this study, we explored various string-based feature representation techniques to incorporate information on the neighborhood bases immediately 5′ and 3′ from each mutated position. Density estimation methods showed significant distributional differences between the neighborhood bases surrounding driver and passenger mutations. Binary classification models derived using repeated cross-validation experiments provided comparable performances across all window sizes. Integrating sequence features derived from raw nucleotide sequences with other genomic, structural, and evolutionary features resulted in the development of a pan-cancer mutation effect prediction tool, NBDriver, which was highly efficient in identifying pathogenic variants from five independent validation datasets. An ensemble predictor obtained by combining the predictions from NBDriver with three other commonly used driver prediction tools (FATHMM (cancer), CONDEL, and MutationTaster) significantly outperformed existing pan-cancer models in prioritizing a literature-curated list of driver and passenger mutations. Using the list of true positive mutation predictions derived from NBDriver, we identified a list of 138 known driver genes with functional evidence from various sources. Overall, our study underscores the efficacy of using raw nucleotide sequences as features to distinguish between driver and passenger mutations from sequenced cancer genomes.
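As an illustration of what a string-based representation of neighborhood bases might look like, the sketch below extracts a fixed window of bases 5′ and 3′ of a mutated position and counts k-mers within each flank. The helper names `neighborhood` and `kmer_counts`, the toy sequence, and the choice of k-mer counting are assumptions; the abstract does not specify which representations NBDriver actually uses.

```python
from collections import Counter


def neighborhood(sequence, pos, window):
    """Return the (5' flank, 3' flank) of `window` bases around 0-based `pos`."""
    if pos - window < 0 or pos + window >= len(sequence):
        raise ValueError("window extends past the sequence ends")
    return sequence[pos - window:pos], sequence[pos + 1:pos + 1 + window]


def kmer_counts(text, k):
    """Count every overlapping k-mer in `text`."""
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


# Hypothetical toy sequence with a mutated base at index 5.
seq = "ACGTTGCAAGCT"
up, down = neighborhood(seq, 5, 3)

# Count k-mers per flank so that no k-mer spans the mutated position itself.
features = kmer_counts(up, 2) + kmer_counts(down, 2)
```

Varying `window` corresponds to the different window sizes mentioned in the abstract; the resulting counts could then be passed to any standard classifier.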


Sequence neighborhoods enable reliable prediction of pathogenic mutations in cancer genomes

10.1101/2021.02.09.430460 ◽

2021 ◽

Author(s):

Shayantan Banerjee ◽

Karthik Raman ◽

Balaraman Ravindran

Keyword(s):

Cancer Progression ◽

Nucleotide Sequences ◽

Feature Representation ◽

Estimation Methods ◽

Driver Mutations ◽

Cancer Mutation ◽

Evolutionary Features ◽

Cancer Genomes ◽

Passenger Mutations ◽

Pan Cancer

Abstract: Identifying cancer-causing mutations from sequenced cancer genomes holds much promise for targeted therapy and precision medicine. “Driver” mutations are primarily responsible for cancer progression, while “passengers” are functionally neutral. Although several computational approaches have been developed for distinguishing between driver and passenger mutations, very few have concentrated on utilizing the raw nucleotide sequences surrounding a particular mutation as potential features for building predictive models. Using experimentally validated cancer mutation data in this study, we explored various string-based feature representation techniques to incorporate information on the neighborhood bases immediately 5’ and 3’ from each mutated position. Density estimation methods showed significant distributional differences between the neighborhood bases surrounding driver and passenger mutations. Binary classification models derived using repeated cross-validation experiments gave comparable performances across all window sizes. Integrating sequence features derived from raw nucleotide sequences with other genomic, structural and evolutionary features resulted in the development of a pan-cancer mutation effect prediction tool, NBDriver, which was highly efficient in identifying pathogenic variants from five independent validation datasets. An ensemble predictor obtained by combining the predictions from NBDriver with two other commonly used driver prediction tools (CONDEL and Mutation Taster) outperformed existing pan-cancer models in prioritizing a literature-curated list of driver and passenger mutations. Using the list of true positive mutation predictions derived from NBDriver, we identified a list of 138 known driver genes with functional evidence from various sources. Overall, our study underscores the efficacy of utilizing raw nucleotide sequences as features to distinguish between driver and passenger mutations from sequenced cancer genomes.


Review of applications of CRISPR-Cas9 gene-editing technology in cancer research

Biological Procedures Online ◽

10.1186/s12575-021-00151-x ◽

2021 ◽

Vol 23 (1) ◽

Author(s):

Ziyi Zhao ◽

Chenxi Li ◽

Fei Tong ◽

Jingkuang Deng ◽

Guofu Huang ◽

...

Keyword(s):

Cancer Research ◽

Cancer Progression ◽

Tumor Suppressors ◽

Genetic Screening ◽

Gene Editing ◽

Causes Of Death ◽

Cancer Models ◽

Cancer Genomes ◽

Passenger Mutations ◽

Cancer Studies

Abstract: Characterized by multiple complex mutations, including activation by oncogenes and inhibition by tumor suppressors, cancer is one of the leading causes of death. Application of CRISPR-Cas9 gene-editing technology in cancer research has aroused great interest, promoting the exploration of the molecular mechanism of cancer progression and development of precise therapy. CRISPR-Cas9 gene-editing technology provides a solid basis for identifying driver and passenger mutations in cancer genomes, which is of great value in genetic screening and for developing cancer models and treatments. This article reviews the current applications of CRISPR-Cas9 gene-editing technology in various cancer studies, the challenges faced, and the existing solutions, highlighting the potential of this technology for cancer treatment.


The landscape and driver potential of site-specific hotspots across cancer genomes

npj Genomic Medicine ◽

10.1038/s41525-021-00197-6 ◽

2021 ◽

Vol 6 (1) ◽

Author(s):

Randi Istrup Juul ◽

Morten Muhlig Nielsen ◽

Malene Juul ◽

Lars Feuerbach ◽

Jakob Skou Pedersen

Keyword(s):

Transcription Factor ◽

Positive Selection ◽

Protein Coding ◽

Site Specific ◽

Functional Impact ◽

Factor Binding ◽

Coding Regions ◽

Genome Wide ◽

Cancer Genomes ◽

Gene Regulatory

Abstract: Large sets of whole cancer genomes make it possible to study mutation hotspots genome-wide. Here we detect, categorize, and characterize site-specific hotspots using 2279 whole cancer genomes from the Pan-Cancer Analysis of Whole Genomes project and provide a resource of annotated hotspots genome-wide. We investigate the excess of hotspots in both protein-coding and gene regulatory regions and develop measures of positive selection and functional impact for individual hotspots. Using cancer allele fractions, expression aberrations, mutational signatures, and a variety of genomic features, such as potential gain or loss of transcription factor binding sites, we annotate and prioritize all highly mutated hotspots. Genome-wide we find more high-frequency SNV and indel hotspots than expected given mutational background models. Protein-coding regions are generally enriched for SNV hotspots compared to other regions. Gene regulatory hotspots show enrichment of potential same-patient second-hit missense mutations, consistent with enrichment of hotspot driver mutations compared to singletons. For protein-coding regions, splice-sites, promoters, and enhancers, we see an excess of hotspots associated with cancer genes. Interestingly, missense hotspot mutations in tumor suppressors are associated with elevated expression, suggesting localized amino-acid changes with functional impact. For individual non-coding hotspots, only a small number show clear signs of positive selection, including known sites in the TERT promoter and the 5’ UTR of TP53. Most of the new candidates have few mutations and limited driver evidence. However, a hotspot in an enhancer of the oncogene POU2AF1, which may create a transcription factor binding site, presents multiple lines of driver-consistent evidence.
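The simplest notion of a site-specific hotspot is a genomic position mutated in several distinct patients. The sketch below counts patients per site and keeps those at or above a threshold; the function name `site_hotspots`, the `min_patients` parameter, and the toy records are assumptions for illustration. It is only a recurrence count and does not reproduce the background mutation models or the selection measures described in the abstract.

```python
from collections import defaultdict


def site_hotspots(mutations, min_patients=2):
    """Map (chrom, pos) to the number of distinct patients mutated there.

    `mutations` is an iterable of (patient_id, chrom, pos) records. Only sites
    seen in at least `min_patients` distinct patients are returned.
    """
    patients_by_site = defaultdict(set)
    for patient, chrom, pos in mutations:
        # A set ensures repeated records from one patient are counted once.
        patients_by_site[(chrom, pos)].add(patient)
    return {
        site: len(patients)
        for site, patients in patients_by_site.items()
        if len(patients) >= min_patients
    }


# Hypothetical toy records, not real data.
toy = [
    ("P1", "chr1", 100),
    ("P2", "chr1", 100),
    ("P2", "chr1", 100),  # duplicate record for the same patient
    ("P3", "chr2", 50),
    ("P3", "chr1", 100),
]
hotspots = site_hotspots(toy)
```

Counting distinct patients rather than raw records avoids inflating a site's recurrence when the same patient's mutation is listed more than once.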


Mutational likeliness and entropy help to identify driver mutations and their functional role in cancer

10.1101/354324 ◽

2018 ◽

Author(s):

Giorgio Mattiuz ◽

Salvatore Di Giorgio ◽

Lorenzo Tofani ◽

Antonio Frandi ◽

Francesco Donati ◽

...

Keyword(s):

Cancer Progression ◽

Somatic Mutations ◽

Driver Mutations ◽

Cancer Evolution ◽

Loss Of Function ◽

Driver Genes ◽

Cancer Driver ◽

Cancer Genomes ◽

Passenger Mutations ◽

Mutational Processes

Abstract: Alterations in cancer genomes originate from mutational processes taking place throughout oncogenesis and cancer progression. We show that likeliness and entropy are two properties of somatic mutations crucial in cancer evolution, as cancer-driver mutations stand out, with respect to both of these properties, as being distinct from the bulk of passenger mutations. Our analysis can identify novel cancer driver genes and differentiate between gain and loss of function mutations.


ActiveDriverDB: Interpreting Genetic Variation in Human and Cancer Genomes Using Post-translational Modification Sites and Signaling Networks (2021 Update)

Frontiers in Cell and Developmental Biology ◽

10.3389/fcell.2021.626821 ◽

2021 ◽

Vol 9 ◽

Author(s):

Michal Krassowski ◽

Diogo Pellegrina ◽

Miles W. Mee ◽

Amelie Fradet-Turcotte ◽

Mamatha Bhat ◽

...

Keyword(s):

Genetic Variation ◽

Amino Acid ◽

Human Population ◽

Signaling Networks ◽

Amino Acid Substitutions ◽

Inherited Disease ◽

Post Translational Modification ◽

Functional Impact ◽

Disease Mutations ◽

Cancer Genomes

Deciphering the functional impact of genetic variation is required to understand phenotypic diversity and the molecular mechanisms of inherited disease and cancer. While millions of genetic variants are now mapped in genome sequencing projects, distinguishing functional variants remains a major challenge. Protein-coding variation can be interpreted using post-translational modification (PTM) sites that are core components of cellular signaling networks controlling molecular processes and pathways. ActiveDriverDB is an interactive proteo-genomics database that uses more than 260,000 experimentally detected PTM sites to predict the functional impact of genetic variation in disease, cancer and the human population. Using machine learning tools, we prioritize proteins and pathways with enriched PTM-specific amino acid substitutions that potentially rewire signaling networks via induced or disrupted short linear motifs of kinase binding. We then map these effects to site-specific protein interaction networks and drug targets. In the 2021 update, we increased the PTM datasets by nearly 50%, included glycosylation, sumoylation and succinylation as new types of PTMs, and updated the workflows to interpret inherited disease mutations. We added a recent phosphoproteomics dataset reflecting the cellular response to SARS-CoV-2 to predict the impact of human genetic variation on COVID-19 infection and disease course. Overall, we estimate that 16-21% of known amino acid substitutions affect PTM sites among pathogenic disease mutations, somatic mutations in cancer genomes and germline variants in the human population. These data underline the potential of interpreting genetic variation through the lens of PTMs and signaling networks. The open-source database is freely available at www.ActiveDriverDB.org.
