Estimation of Hominoid Ancestral Population Sizes under Bayesian Coalescent Models Incorporating Mutation Rate Variation and Sequencing Errors

Are sites with multiple single nucleotide variants in cancer genomes a consequence of drivers, hypermutable sites or sequencing errors?

10.7287/peerj.preprints.2089v1 ◽

2016 ◽

Author(s):

Thomas C A Smith ◽

Antony M Carr ◽

Adam C Eyre-Walker

Keyword(s):

Mutation Rate ◽

Cancer Genome ◽

Rate Variation ◽

Single Nucleotide Variants ◽

Genome Sequences ◽

Single Nucleotide ◽

Sequencing Errors ◽

Cancer Genomes ◽

Mutation Rate Variation

Across independent cancer genomes it has been observed that some sites have been recurrently hit by single nucleotide variants (SNVs). Such recurrently hit sites might be either (i) drivers of cancer that are positively selected during oncogenesis, (ii) due to mutation rate variation, or (iii) due to sequencing and assembly errors. We have investigated the cause of recurrently hit sites in a dataset of >3 million SNVs from 507 complete cancer genome sequences. We find evidence that many sites have been hit significantly more often than one would expect by chance, even taking into account the effect of the adjacent nucleotides on the rate of mutation. We find that the density of these recurrently hit sites is higher in non-coding than coding DNA and hence conclude that most of them are unlikely to be drivers. We also find that most of them are found in parts of the genome that are not uniquely mappable and hence are likely to be due to mapping errors. In support of the error hypothesis, we find that recurrently hit sites are not randomly distributed across sequences from different laboratories. We fit a model to the data in which the rate of mutation is constant across sites but the rate of error varies. This model suggests that ~4% of all SNVs are errors in this dataset, but that the rate of error varies by thousands-of-fold.

Are sites with multiple single nucleotide variants in cancer genomes a consequence of drivers, hypermutable sites or sequencing errors?

PeerJ ◽

10.7717/peerj.2391 ◽

2016 ◽

Vol 4 ◽

pp. e2391 ◽

Cited By ~ 2

Author(s):

Thomas C.A. Smith ◽

Antony M. Carr ◽

Adam C. Eyre-Walker

Keyword(s):

Mutation Rate ◽

Cancer Genome ◽

Rate Variation ◽

Single Nucleotide Variants ◽

Genome Sequences ◽

Single Nucleotide ◽

Sequencing Errors ◽

Cancer Genomes ◽

Mutation Rate Variation

Across independent cancer genomes it has been observed that some sites have been recurrently hit by single nucleotide variants (SNVs). Such recurrently hit sites might be either (i) drivers of cancer that are positively selected during oncogenesis, (ii) due to mutation rate variation, or (iii) due to sequencing and assembly errors. We have investigated the cause of recurrently hit sites in a dataset of >3 million SNVs from 507 complete cancer genome sequences. We find evidence that many sites have been hit significantly more often than one would expect by chance, even taking into account the effect of the adjacent nucleotides on the rate of mutation. We find that the density of these recurrently hit sites is higher in non-coding than coding DNA and hence conclude that most of them are unlikely to be drivers. We also find that most of them are found in parts of the genome that are not uniquely mappable and hence are likely to be due to mapping errors. In support of the error hypothesis, we find that recurrently hit sites are not randomly distributed across sequences from different laboratories. We fit a model to the data in which the rate of mutation is constant across sites but the rate of error varies. This model suggests that ∼4% of all SNVs are errors in this dataset, but that the rate of error varies by thousands-of-fold between sites.
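
The phrase "more often than one would expect by chance" can be made concrete with a rough null model. The sketch below is not the authors' analysis: it assumes every site mutates at the same rate (ignoring the adjacent-nucleotide effect the abstract mentions), treats the number of hits per site as Poisson, and uses a hypothetical genome size of 3 × 10^9 sites. The function names are illustrative only.

```python
import math


def poisson_tail(lam: float, k: int, terms: int = 200) -> float:
    """P(X >= k) for X ~ Poisson(lam), summed directly over the upper tail.

    Summing the tail avoids the cancellation in 1 - P(X < k) when lam is tiny.
    """
    return sum(
        math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))
        for j in range(k, k + terms)
    )


def expected_recurrent_sites(n_snvs: float, n_sites: float, k: int) -> float:
    """Expected number of sites hit at least k times under a uniform rate."""
    lam = n_snvs / n_sites  # mean hits per site (assumed equal for all sites)
    return n_sites * poisson_tail(lam, k)


# 3 million SNVs (from the abstract) spread over an assumed 3e9 sites.
for k in (2, 3, 4):
    print(k, expected_recurrent_sites(3e6, 3e9, k))
```

Under these assumptions roughly 1,500 sites would be hit twice or more by chance, but fewer than one site would be hit three or more times, so an excess of heavily hit sites calls for one of the three explanations listed in the abstract.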

Nucleosome positioning stability is a modulator of germline mutation rate variation across the human genome

Nature Communications ◽

10.1038/s41467-020-15185-0 ◽

2020 ◽

Vol 11 (1) ◽

Author(s):

Cai Li ◽

Nicholas M. Luscombe

Keyword(s):

Human Genome ◽

Mutation Rate ◽

Germline Mutation ◽

Nucleosome Positioning ◽

Rate Variation ◽

Mutation Rate Variation

On the estimation of ancestral population sizes of modern humans

Genetics Research ◽

10.1017/s001667239700270x ◽

1997 ◽

Vol 69 (2) ◽

pp. 111-116 ◽

Cited By ~ 40

Author(s):

Ziheng Yang

Keyword(s):

Population Size ◽

Sequence Data ◽

Divergence Time ◽

Ancestral Population ◽

Rate Variation ◽

Effective Population ◽

Modern Humans ◽

Extant Species ◽

Population Sizes ◽

Highly Correlated

The theory developed by Takahata and colleagues for estimating the effective population size of ancestral species using homologous sequences from closely related extant species was extended to take account of variation of evolutionary rates among loci. Nuclear sequence data related to the evolution of modern humans were reanalysed and computer simulations were performed to examine the effect of rate variation on estimation of ancestral population sizes. It is found that the among-locus rate variation does not have a significant effect on estimation of the current population size when sequences from multiple loci are sampled from the same species, but does have a significant effect on estimation of the ancestral population size using sequences from different species. The effects of ancestral population size, species divergence time and among-locus rate variation are found to be highly correlated, and to achieve reliable estimates of the ancestral population size, effects of the other two factors should be estimated independently.
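
One way to see why among-locus rate variation matters for the ancestral estimate is a method-of-moments sketch. This is not the analysis used in the paper; it assumes one sequence per species, loci of equal length, an infinite-sites mutation model, and the illustrative symbols theta = 4Nμ (ancestral diversity) and gamma = 2μt (divergence since the split). Under those assumptions the number of differences at a locus has mean L(gamma + theta) and variance equal to the mean plus (L·theta)^2, so theta is read off the excess variance across loci.

```python
import math
import statistics


def moment_estimates(diffs, n_sites):
    """Moment estimates of (theta, gamma) per site.

    diffs   : between-species differences at each locus (equal-length loci)
    n_sites : number of sites per locus (L)
    """
    mean_k = statistics.mean(diffs)
    var_k = statistics.variance(diffs)  # sample variance across loci
    # Variance beyond the Poisson expectation is attributed to the ancestor.
    excess = max(var_k - mean_k, 0.0)
    theta = math.sqrt(excess) / n_sites
    gamma = mean_k / n_sites - theta
    return theta, gamma


# Toy data: two loci of 1000 sites with 6 and 12 differences.
theta_hat, gamma_hat = moment_estimates([6, 12], 1000)
print(theta_hat, gamma_hat)
```

Because any extra variance across loci is assigned to theta, unmodelled rate variation among loci inflates the ancestral population size estimate and deflates the divergence estimate, which is consistent with the strong correlation among the three factors described in the abstract.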

Minisatellite mutation rate variation associated with a flanking DNA sequence polymorphism

Nature Genetics ◽

10.1038/ng1094-162 ◽

1994 ◽

Vol 8 (2) ◽

pp. 162-170 ◽

Cited By ~ 62

Author(s):

Darren G. Monckton ◽

Rita Neumann ◽

Tara Guram ◽

Neale Fretwell ◽

Keiji Tamaki ◽

...

Keyword(s):

DNA Sequence ◽

Mutation Rate ◽

Sequence Polymorphism ◽

Rate Variation ◽

DNA Sequence Polymorphism ◽

Mutation Rate Variation

Mutation Biases and Mutation Rate Variation Around Very Short Human Microsatellites Revealed by Human–Chimpanzee–Orangutan Genomic Sequence Alignments

Journal of Molecular Evolution ◽

10.1007/s00239-010-9377-4 ◽

2010 ◽

Vol 71 (3) ◽

pp. 192-201 ◽

Cited By ~ 14

Author(s):

William Amos

Keyword(s):

Mutation Rate ◽

Genomic Sequence ◽

Rate Variation ◽

Sequence Alignments ◽

Mutation Rate Variation

Mutation rate variation in the hypervariable VNTR g3 (D7S22) is associated with a flanking DNA sequence polymorphism near the repeat array

16th Congress of the International Society for Forensic Haemogenetics (Internationale Gesellschaft für forensische Hämogenetik e.V.), Santiago de Compostela, 12–16 September 1995 - Advances in Forensic Haemogenetics ◽

10.1007/978-3-642-80029-0_42 ◽

1996 ◽

pp. 160-162

Author(s):

Rune Andreassen ◽

Bjørnar Olaisen

Keyword(s):

DNA Sequence ◽

Mutation Rate ◽

Sequence Polymorphism ◽

Rate Variation ◽

Repeat Array ◽

DNA Sequence Polymorphism ◽

Mutation Rate Variation

Reply to: Mutation rate variation in eukaryotes: evolutionary implications of site-specific mechanisms

Nature Reviews Genetics ◽

10.1038/nrg2158-c2 ◽

2007 ◽

Vol 8 (11) ◽

pp. 902

Author(s):

Charles F. Baer ◽

Michael M. Miyamoto ◽

Dee R. Denver

Keyword(s):

Mutation Rate ◽

Rate Variation ◽

Site Specific ◽

Mutation Rate Variation

Robustness of Coalescent Estimators to Between-Lineage Mutation Rate Variation

Molecular Biology and Evolution ◽

10.1093/molbev/msl106 ◽

2006 ◽

Vol 23 (12) ◽

pp. 2355-2360 ◽

Cited By ~ 6

Author(s):

M. K. Kuhner

Keyword(s):

Mutation Rate ◽

Rate Variation ◽

Mutation Rate Variation
