Somatically Mutated Genes Under Positive and Negative Selection Found by Transcriptome Sequence Analysis Include Oncogene and Tumor Suppressor Candidates
Abstract

Introduction
Oncogenic somatic mutations confer a proliferative advantage and undergo positive clonal selection. We developed software and applied new analytical approaches to identify: (1) somatic mutations in diverse tissues, (2) somatically mutated genes under positive and negative selection, (3) post-transcriptional modifications in the mitochondrial transcriptome, and (4) inherited germline alleles that predispose people to a higher somatic mutation burden or higher levels of post-transcriptional modification.

Methods
We used transcriptome sequence data from the Genotype-Tissue Expression (GTEx) project for 7051 tissue samples from 549 postmortem donors, representing 44 tissue types. Germline mutations were inferred from whole-exome DNA sequencing and SNP arrays. Somatic DNA mutations were inferred from variant allele frequencies (VAF) in RNA-seq data. Post-transcriptional modifications were inferred from Polymorphism Information Content (PIC) at the p9 sites of mitochondrial tRNA sequences. Positive and negative clonal selection was evaluated using a nonsynonymous/synonymous mutation rate (dN/dS) model. Genome-wide association studies (GWAS) were performed using mitochondrial PIC as the phenotype for post-transcriptional modification level, and the total number of somatic mutations observed per donor as the phenotype for somatic mutation burden.

Results
Our dN/dS model identified 78 genes under negative selection for somatic mutations (dN/dS < 1, padj < 0.05) and 14 under positive selection (dN/dS > 1, padj < 0.05). Our GWAS identified two sites associated with post-transcriptional modification (one with p < 5 × 10^−8 and one approaching significance at p = 5.99 × 10^−8) and approximately 20 sites associated with somatic mutation burden (p < 5 × 10^−8).

Conclusions
To our knowledge, these are the first genome-wide association studies of somatic mutation burden in normal tissue. These studies aim to improve understanding of the somatic mutation process.
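The two per-site quantities named in the Methods, VAF and PIC, can be illustrated with a minimal Python sketch. It assumes that VAF is the fraction of reads at a site carrying the alternate allele and that PIC follows the standard Botstein et al. (1980) definition; the abstract does not state which exact formulas, read filters, or thresholds the authors' software applies, so the function names and inputs here are hypothetical.

```python
from itertools import combinations


def variant_allele_frequency(alt_reads, total_reads):
    """Fraction of reads at a site that carry the alternate allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def polymorphism_information_content(allele_counts):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 (Botstein et al., 1980).

    allele_counts maps each observed base (e.g. "A", "G") to its read count
    at one site. This standard definition is an assumption; the paper's
    software may compute PIC differently.
    """
    total = sum(allele_counts.values())
    if total <= 0:
        raise ValueError("at least one read is required")
    freqs = [count / total for count in allele_counts.values() if count > 0]
    sum_of_squares = sum(p * p for p in freqs)
    pairwise_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - sum_of_squares - pairwise_term


if __name__ == "__main__":
    # Hypothetical read counts at one site, for illustration only.
    print(variant_allele_frequency(alt_reads=5, total_reads=100))
    print(polymorphism_information_content({"A": 50, "G": 50}))
```

Under this definition a site where every read shows the same base has a PIC of 0, and the value rises as reads are spread more evenly across bases, which is why a mixture of bases at a tRNA p9 site can serve as a proxy for modification level.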
Our work identified somatic mutations across the whole organism that may promote cell proliferation in a tissue-specific manner. By identifying tissue-specific mutations in actively expressed genes that appear before a cancer phenotype is detected, this work also identifies candidate genes that might initiate tumorigenesis.