Inference of distribution of fitness effects and proportion of adaptive substitutions from polymorphism data

2016 ◽  
Author(s):  
Paula Tataru ◽  
Maéva Mollion ◽  
Sylvain Glemin ◽  
Thomas Bataillon

ABSTRACT The distribution of fitness effects (DFE) encompasses deleterious, neutral and beneficial mutations. It conditions the evolutionary trajectory of populations, as well as the rate of adaptive molecular evolution (α). Inference of DFE and α from patterns of polymorphism, summarized by the site frequency spectrum (SFS), and from divergence data has been a longstanding goal of evolutionary genetics. A widespread assumption shared by numerous methods developed so far to infer DFE and α from such data is that beneficial mutations contribute only negligibly to the polymorphism data. Hence, a DFE comprising only deleterious mutations tends to be estimated from SFS data, and α is only predicted by contrasting the SFS with divergence data from an outgroup. Here, we develop a hierarchical probabilistic framework that extends previous methods and can also infer DFE and α from polymorphism data alone. We use extensive simulations to examine the performance of our method. We show that both a full DFE, comprising deleterious and beneficial mutations, and α can be inferred without resorting to divergence data. We demonstrate that inference of DFE from polymorphism data alone can in fact provide more reliable estimates, as it does not rely on strong assumptions about a shared DFE between the outgroup and ingroup species used to obtain the SFS and divergence data. We also show that not accounting for the contribution of beneficial mutations to polymorphism data leads to substantially biased estimates of the DFE and α. We illustrate these points using our newly developed framework, while also comparing to one of the most widely used inference methods available.

2021 ◽  
Author(s):  
Grace Avecilla ◽  
Julie Chuong ◽  
Fangfei Li ◽  
Gavin J Sherlock ◽  
David Gresham ◽  
...  

The rate of adaptive evolution depends on the rate at which beneficial mutations are introduced into a population and the fitness effects of those mutations. The rate of beneficial mutations and their expected fitness effects is often difficult to empirically quantify. As these two parameters determine the pace of evolutionary change in a population, the dynamics of adaptive evolution may enable inference of their values. Copy number variants (CNVs) are a pervasive source of heritable variation that can facilitate rapid adaptive evolution. Previously, we developed a locus-specific fluorescent CNV reporter to quantify CNV dynamics in evolving populations maintained in nutrient-limiting conditions using chemostats. Here, we use the observed CNV adaptation dynamics to estimate the rate at which beneficial CNVs are introduced through de novo mutation and their fitness effects using simulation-based Bayesian likelihood-free inference approaches. We tested the suitability of two evolutionary models: a standard Wright-Fisher model and a chemostat growth model. We evaluated two likelihood-free inference algorithms: the well-established Approximate Bayesian Computation with Sequential Monte Carlo (ABC-SMC) algorithm, and the recently developed Neural Posterior Estimation (NPE) algorithm, which applies an artificial neural network to directly estimate the posterior distribution. By systematically evaluating the suitability of different inference methods and models we show that NPE has several advantages over ABC-SMC and that a Wright-Fisher evolutionary model suffices in most cases. Using our validated inference framework, we estimate the CNV formation rate at the GAP1 locus in yeast as 10^-4.7 to 10^-4 per cell division, and a selection coefficient of 0.04 to 0.1 per generation for GAP1 CNVs in glutamine-limited chemostats. We experimentally validated our estimates using barcode lineage tracking and pairwise fitness assays.
Our results are consistent with a high beneficial CNV supply rate that is 10-fold greater than the estimated rates of beneficial single-nucleotide mutations, explaining their outsized importance in rapid adaptive evolution. More generally, our study demonstrates the utility of novel simulation-based likelihood-free inference methods for inferring the rates and effects of evolutionary processes from empirical data.
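The Wright-Fisher model named in this abstract can be illustrated with a minimal sketch. This is not the authors' simulator: it assumes a haploid population of constant size with a single beneficial CNV class, and the function name `wright_fisher_cnv` and its parameters (`pop_size`, `mu`, `s`, `p0`, `seed`) are hypothetical names chosen for illustration.

```python
import random


def wright_fisher_cnv(pop_size, mu, s, generations, p0=0.0, seed=0):
    """Simulate the frequency of a beneficial CNV class over time.

    Assumptions (not from the source): haploid population, constant size,
    one CNV class with relative fitness 1 + s, and one-way mutation from
    the ancestral state to the CNV state at rate mu per generation.
    """
    rng = random.Random(seed)
    freq = p0
    trajectory = [freq]
    for _ in range(generations):
        # Selection: CNV carriers have relative fitness 1 + s.
        after_selection = freq * (1 + s) / (1 + freq * s)
        # Mutation: each non-carrier gains a CNV with probability mu.
        expected = after_selection + (1 - after_selection) * mu
        # Drift: draw pop_size offspring, each a carrier with prob. expected.
        carriers = sum(rng.random() < expected for _ in range(pop_size))
        freq = carriers / pop_size
        trajectory.append(freq)
    return trajectory


if __name__ == "__main__":
    # Illustrative values only; the population size is kept small for speed.
    traj = wright_fisher_cnv(pop_size=1000, mu=1e-4, s=0.1, generations=200)
    print(traj[-1])
```

In a likelihood-free inference setting, a simulator of this kind would be run many times with `mu` and `s` drawn from a prior, and the simulated trajectories compared with the observed ones.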


2020 ◽  
Vol 10 (7) ◽  
pp. 2317-2326 ◽  
Author(s):  
Tom R. Booker

Characterizing the distribution of fitness effects (DFE) for new mutations is central in evolutionary genetics. Analysis of molecular data under the McDonald-Kreitman test has suggested that adaptive substitutions make a substantial contribution to between-species divergence. Methods have been proposed to estimate the parameters of the distribution of fitness effects for positively selected mutations from the unfolded site frequency spectrum (uSFS). Such methods perform well when beneficial mutations are mildly selected and frequent. However, when beneficial mutations are strongly selected and rare, they may make little contribution to standing variation and will thus be difficult to detect from the uSFS. In this study, I analyze uSFS data from simulated populations subject to advantageous mutations with effects on fitness ranging from mildly to strongly beneficial. As expected, frequent, mildly beneficial mutations contribute substantially to standing genetic variation and parameters are accurately recovered from the uSFS. However, when advantageous mutations are strongly selected and rare, there are very few segregating in populations at any one time. Fitting the uSFS in such cases leads to underestimates of the strength of positive selection and may lead researchers to false conclusions regarding the relative contribution adaptive mutations make to molecular evolution. Fortunately, the parameters for the distribution of fitness effects for harmful mutations are estimated with high accuracy and precision. The results from this study suggest that the parameters of positively selected mutations obtained by analysis of the uSFS should be treated with caution and that variability at linked sites should be used in conjunction with standing variability to estimate parameters of the distribution of fitness effects in the future.
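As a point of reference for the McDonald-Kreitman framework mentioned above, the proportion of adaptive substitutions is commonly estimated as α = 1 − (Ds·Pn)/(Dn·Ps), where Dn and Ds are nonsynonymous and synonymous divergence counts and Pn and Ps are the corresponding polymorphism counts. The sketch below uses this standard estimator; the function name `mk_alpha` and the counts in the example are hypothetical and are not taken from the study.

```python
def mk_alpha(dn, ds, pn, ps):
    """McDonald-Kreitman-style estimate of the proportion of adaptive
    substitutions: alpha = 1 - (Ds * Pn) / (Dn * Ps).

    dn, ds: nonsynonymous / synonymous fixed differences (divergence).
    pn, ps: nonsynonymous / synonymous polymorphisms.
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero for alpha to be defined")
    return 1 - (ds * pn) / (dn * ps)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(mk_alpha(dn=10, ds=10, pn=5, ps=10))  # 0.5
```

This simple estimator ignores segregating deleterious and beneficial mutations, which is one reason DFE-based methods such as those discussed in the abstract were developed.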


2019 ◽  
Author(s):  
Tom R. Booker

Abstract Characterising the distribution of fitness effects (DFE) for new mutations is central in evolutionary genetics. Analysis of molecular data under the McDonald-Kreitman test has suggested that adaptive substitutions make a substantial contribution to between-species divergence. Methods have been proposed to estimate the parameters of the distribution of fitness effects for positively selected mutations from the unfolded site frequency spectrum (uSFS). However, when beneficial mutations are strongly selected and rare, they may make little contribution to standing variation and will thus be difficult to detect from the uSFS. In this study, I analyse uSFS data from simulated populations subject to advantageous mutations with effects on fitness ranging from mildly to strongly beneficial. When advantageous mutations are strongly selected and rare, there are very few segregating in populations at any one time. Fitting the uSFS in such cases leads to underestimates of the strength of positive selection and may lead researchers to false conclusions regarding the relative contribution adaptive mutations make to molecular evolution. Fortunately, the parameters for the distribution of fitness effects for harmful mutations are estimated with high accuracy and precision. The results from this study suggest that the parameters of positively selected mutations obtained by analysis of the uSFS should be treated with caution and that variability at linked sites should be used in conjunction with standing variability to estimate parameters of the distribution of fitness effects in the future.


2016 ◽  
Author(s):  
Sophie Pénisson ◽  
Tanya Singh ◽  
Paul Sniegowski ◽  
Philip Gerrish

ABSTRACT Beneficial mutations drive adaptive evolution, yet their selective advantage does not ensure their fixation. Haldane’s application of single-type branching process theory showed that genetic drift alone could cause the extinction of newly-arising beneficial mutations with high probability. With linkage, deleterious mutations will affect the dynamics of beneficial mutations and might further increase their extinction probability. Here, we model the lineage dynamics of a newly-arising beneficial mutation as a multitype branching process; this approach allows us to account for the combined effects of drift and the stochastic accumulation of linked deleterious mutations, which we call lineage contamination. We first study the lineage contamination phenomenon in isolation, deriving extinction times and probabilities of beneficial lineages. We then put the lineage contamination phenomenon into the context of an evolving population by incorporating the effects of background selection. We find that the survival probability of beneficial mutations is simply Haldane’s classical formula multiplied by the correction factor , where U is deleterious mutation rate, is mean selective advantage of beneficial mutations, κ ∈ (1, ε], and ε = 2 − e^(−1). We also find there exists a genomic deleterious mutation rate, , that maximizes the rate of production of surviving beneficial mutations, and that . Both of these results, and others, are curiously independent of the fitness effects of deleterious mutations. We derive critical mutation rates above which: 1) lineage contamination alleviates competition among beneficial mutations, and 2) the adaptive substitution process all but shuts down.
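The single-type branching-process result referred to above can be checked numerically. The sketch below assumes a Poisson offspring distribution with mean 1 + s (an assumption made here for illustration, not stated in the abstract) and computes the survival probability of a new beneficial lineage as one minus the smallest root of q = exp((1 + s)(q − 1)). For small s this is close to the classical approximation of about 2s. The function name `survival_probability` is hypothetical, and the sketch does not include the lineage-contamination correction factor described in the abstract.

```python
import math


def survival_probability(s, tol=1e-14, max_iter=1_000_000):
    """Survival probability of a single beneficial lineage in a
    single-type branching process with Poisson(1 + s) offspring.

    The extinction probability q is the smallest fixed point of the
    offspring generating function, q = exp((1 + s) * (q - 1)), found
    here by fixed-point iteration starting from q = 0.
    """
    if s <= 0:
        # Critical and subcritical processes go extinct with certainty.
        return 0.0
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp((1 + s) * (q - 1))
        if abs(q_next - q) < tol:
            q = q_next
            break
        q = q_next
    return 1 - q


if __name__ == "__main__":
    for s in (0.01, 0.05, 0.1):
        print(s, survival_probability(s), 2 * s)
```

The iteration converges slowly when s is small because the slope of the generating function at the fixed point approaches one, which is why the iteration cap is generous.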


2006 ◽  
Vol 2 (3) ◽  
pp. 426-430 ◽  
Author(s):  
Laurence Loewe ◽  
Brian Charlesworth

The properties of the distribution of deleterious mutational effects on fitness (DDME) are of fundamental importance for evolutionary genetics. Since it is extremely difficult to determine the nature of this distribution, several methods using various assumptions about the DDME have been developed for the purpose of parameter estimation. We apply a newly developed method to DNA sequence polymorphism data from two Drosophila species and compare estimates of the parameters of the distribution of the heterozygous fitness effects of amino acid mutations for several different distribution functions. The results exclude normal and gamma distributions, since these predict too few effectively lethal mutations, and power-law distributions, since these predict too many lethals. Only the lognormal distribution appears to fit both the diversity data and the frequency of lethals. This DDME arises naturally in complex systems when independent factors contribute multiplicatively to an increase in fitness-reducing damage. Several important parameters, such as the fraction of effectively neutral non-synonymous mutations and the harmonic mean of non-neutral selection coefficients, are robust to the form of the DDME. Our results suggest that the majority of non-synonymous mutations in Drosophila are under effective purifying selection.


Genetics ◽  
1998 ◽  
Vol 149 (4) ◽  
pp. 2089-2097 ◽  
Author(s):  
Jody Hey

Abstract If multiple linked polymorphisms are under natural selection, then conflicts arise and the efficiency of natural selection is hindered relative to the case of no linkage. This simple interaction between linkage and natural selection creates an opportunity for mutations that raise the level of recombination to increase in frequency and have an enhanced chance of fixation. This important finding by S. Otto and N. Barton means that mutations that raise the recombination rate, but are otherwise neutral, will be selectively favored under fairly general circumstances of multilocus selection and linkage. The effect described by Otto and Barton, which was limited to neutral modifiers, can also be extended to include all modifiers of recombination, both beneficial and deleterious. Computer simulations show that beneficial mutations that also increase recombination have an increased chance of fixation. Similarly, deleterious mutations that also decrease recombination have an increased chance of fixation. The results suggest that a simple model of recombination modifiers, including both neutral and pleiotropic modifiers, is a necessary explanation for the evolutionary origin of recombination.


Genetics ◽  
2003 ◽  
Vol 164 (3) ◽  
pp. 1099-1118 ◽  
Author(s):  
Sarah P Otto

Abstract In diploids, sexual reproduction promotes both the segregation of alleles at the same locus and the recombination of alleles at different loci. This article is the first to investigate the possibility that sex might have evolved and been maintained to promote segregation, using a model that incorporates both a general selection regime and modifier alleles that alter an individual’s allocation to sexual vs. asexual reproduction. The fate of different modifier alleles was found to depend strongly on the strength of selection at fitness loci and on the presence of inbreeding among individuals undergoing sexual reproduction. When selection is weak and mating occurs randomly among sexually produced gametes, reductions in the occurrence of sex are favored, but the genome-wide strength of selection is extremely small. In contrast, when selection is weak and some inbreeding occurs among gametes, increased allocation to sexual reproduction is expected as long as deleterious mutations are partially recessive and/or beneficial mutations are partially dominant. Under strong selection, the conditions under which increased allocation to sex evolves are reversed. Because deleterious mutations are typically considered to be partially recessive and weakly selected and because most populations exhibit some degree of inbreeding, this model predicts that higher frequencies of sex would evolve and be maintained as a consequence of the effects of segregation. Even with low levels of inbreeding, selection is stronger on a modifier that promotes segregation than on a modifier that promotes recombination, suggesting that the benefits of segregation are more likely than the benefits of recombination to have driven the evolution of sexual reproduction in diploids.


Genetics ◽  
2000 ◽  
Vol 154 (3) ◽  
pp. 1403-1417 ◽  
Author(s):  
David J Cutler

Abstract Rates of molecular evolution at some protein-encoding loci are more irregular than expected under a simple neutral model of molecular evolution. This pattern of excessive irregularity in protein substitutions is often called the “overdispersed molecular clock” and is characterized by an index of dispersion, R(T) > 1. Assuming an infinite-sites, no-recombination model of the gene, R(T) is given for a general stationary model of molecular evolution. R(T) is shown to be affected by only three things: fluctuations that occur on a very slow time scale, advantageous or deleterious mutations, and interactions between mutations. In the absence of interactions, advantageous mutations are shown to lower R(T); deleterious mutations are shown to raise it. Previously described models for the overdispersed molecular clock are analyzed in terms of this work as are a few very simple new models. A model of deleterious mutations is shown to be sufficient to explain the observed values of R(T). Our current best estimates of R(T) suggest that either most mutations are deleterious or some key population parameter changes on a very slow time scale. No other interpretations seem plausible. Finally, a comment is made on how R(T) might be used to distinguish selective sweeps from background selection.
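The index of dispersion discussed above is conventionally the ratio of the variance to the mean of the number of substitutions accumulated over a time interval; a Poisson (simple neutral) clock gives a value of 1. The sketch below computes an empirical version of this ratio from substitution counts across lineages. The function name `index_of_dispersion` and the counts are hypothetical, and the use of the sample variance is a choice made here, not one taken from the article.

```python
from statistics import mean, variance


def index_of_dispersion(substitution_counts):
    """Empirical index of dispersion: sample variance / mean of counts.

    Values near 1 are what a Poisson clock would give; values well above
    1 indicate an overdispersed clock.
    """
    if len(substitution_counts) < 2:
        raise ValueError("need at least two lineages to estimate a variance")
    avg = mean(substitution_counts)
    if avg == 0:
        raise ValueError("mean count is zero; the ratio is undefined")
    return variance(substitution_counts) / avg


if __name__ == "__main__":
    # Hypothetical substitution counts along three lineages.
    print(index_of_dispersion([2, 4, 6]))  # 1.0
```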


Science ◽  
2019 ◽  
Vol 366 (6464) ◽  
pp. 490-493 ◽  
Author(s):  
Milo S. Johnson ◽  
Alena Martsul ◽  
Sergey Kryazhimskiy ◽  
Michael M. Desai

Natural selection drives populations toward higher fitness, but second-order selection for adaptability and mutational robustness can also influence evolution. In many microbial systems, diminishing-returns epistasis contributes to a tendency for more-fit genotypes to be less adaptable, but no analogous patterns for robustness are known. To understand how robustness varies across genotypes, we measure the fitness effects of hundreds of individual insertion mutations in a panel of yeast strains. We find that more-fit strains are less robust: They have distributions of fitness effects with lower mean and higher variance. These differences arise because many mutations have more strongly deleterious effects in faster-growing strains. This negative correlation between fitness and robustness implies that second-order selection for robustness will tend to conflict with first-order selection for fitness.

