Inference of the distribution of selection coefficients for new nonsynonymous mutations using large samples

2016 ◽  
Author(s):  
Bernard Y. Kim ◽  
Christian D. Huber ◽  
Kirk E. Lohmueller

ABSTRACT: The distribution of fitness effects (DFE) has considerable importance in population genetics. To date, estimates of the DFE come from studies using a small number of individuals. Thus, estimates of the proportion of moderately to strongly deleterious new mutations may be unreliable because such variants are unlikely to be segregating in the data. Additionally, the true functional form of the DFE is unknown, and estimates of the DFE differ significantly between studies. Here we present a flexible and computationally tractable method, called Fit∂a∂i, to estimate the DFE using the site frequency spectrum from a large number of individuals. We apply our approach to the frequency spectrum of 1300 Europeans from the Exome Sequencing Project ESP6400 dataset, 1298 Danes from the LuCamp dataset, and 432 Europeans from the 1000 Genomes Project to estimate the DFE of deleterious nonsynonymous mutations. We infer significantly fewer (0.38-0.84x) strongly deleterious mutations with selection coefficient |s| > 0.01 and more (1.24-1.43x) weakly deleterious mutations with selection coefficient |s| < 0.001 compared to previous estimates. Furthermore, a DFE that is a mixture of a point mass at neutrality plus a gamma distribution best fits two of the three datasets. Our results suggest that nearly neutral forces play a larger role in human evolution than previously thought.
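The mixture form named in the abstract above (a point mass at neutrality plus a gamma distribution on |s|) can be illustrated with a minimal sketch. It is not the Fit∂a∂i implementation: the function names `gamma_cdf` and `dfe_bins` and the parameter values in the usage line are assumptions chosen for illustration, and only the bin edges |s| = 0.001 and |s| = 0.01 come from the abstract. The sketch computes the fraction of new mutations that falls in each selection-coefficient bin.

```python
import math


def gamma_cdf(x, shape, scale):
    """Gamma CDF via the power series for the regularized lower incomplete gamma.

    Suitable for modest x / scale; very large ratios would need a
    continued-fraction form instead.
    """
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    for n in range(1, 10000):
        term *= z / (shape + n)
        total += term
        if term < total * 1e-15:
            break
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))


def dfe_bins(p_neutral, shape, scale, edges=(1e-3, 1e-2)):
    """Fractions of new mutations with |s| in [0, e1), [e1, e2), [e2, inf).

    p_neutral is the weight of the point mass at |s| = 0; the remaining
    weight follows a gamma distribution on |s| (hypothetical parameters).
    """
    low, high = (gamma_cdf(e, shape, scale) for e in edges)
    gamma_weight = 1.0 - p_neutral
    weak = p_neutral + gamma_weight * low
    moderate = gamma_weight * (high - low)
    strong = gamma_weight * (1.0 - high)
    return weak, moderate, strong


# Hypothetical parameter values, not estimates from any dataset.
print(dfe_bins(p_neutral=0.1, shape=0.2, scale=0.05))
```

The neutral point mass is counted in the weakest bin, which is why increasing `p_neutral` moves weight toward |s| < 0.001 without changing the shape of the gamma component.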

2017 ◽  
Author(s):  
Christian D. Huber ◽  
Arun Durvasula ◽  
Angela M. Hancock ◽  
Kirk E. Lohmueller

Abstract: Dominance is a fundamental concept in molecular genetics and has implications for understanding patterns of genetic variation, evolution, and complex traits. However, despite its importance, the degree of dominance has yet to be quantified in natural populations. Here, we leverage multiple mating systems in natural populations of Arabidopsis to co-estimate the distribution of fitness effects and dominance coefficients of new amino acid changing mutations. We find that more deleterious mutations are more likely to be recessive than less deleterious mutations. Further, this pattern holds across gene categories, but varies with the connectivity and expression patterns of genes. Our work argues that dominance arose as the inevitable consequence of the functional importance of genes and their optimal expression levels.

One sentence summary: We use population genomic data to characterize the degree of dominance for new mutations and develop a new theory for its evolution.


2017 ◽  
Author(s):  
Bernard Y. Kim ◽  
Christian D. Huber ◽  
Kirk E. Lohmueller

Abstract: While it is appreciated that population size changes can impact patterns of deleterious variation in natural populations, less attention has been paid to how population admixture affects the dynamics of deleterious variation. Here we use population genetic simulations to examine how admixture impacts deleterious variation under a variety of demographic scenarios, dominance coefficients, and recombination rates. Our results show that gene flow between populations can temporarily reduce the genetic load of smaller populations, especially if deleterious mutations are recessive. Additionally, when new mutations are recessive, between-population differences in the sites at which deleterious variants exist create heterosis in hybrid individuals. This can lead to an increase in introgressed ancestry, particularly when recombination rates are low. Under certain scenarios, introgressed ancestry can increase from an initial frequency of 5% to 30-75% and fix at many loci, even in the absence of beneficial mutations. Further, deleterious variation and admixture can generate correlations between the frequency of introgressed ancestry and recombination rate or exon density, even in the absence of other types of selection. The direction of these correlations is determined by the specific demography and whether mutations are additive or recessive. Therefore, it is essential that null models include both demography and deleterious variation before invoking reproductive incompatibilities or adaptive introgression to explain unusual patterns of genetic variation.


2020 ◽  
Author(s):  
Kimberly J. Gilbert ◽  
Stefan Zdraljevic ◽  
Daniel E. Cook ◽  
Asher D. Cutter ◽  
Erik C. Andersen ◽  
...  

ABSTRACT: The distribution of fitness effects for new mutations is one of the most theoretically important but difficult-to-estimate properties in population genetics. A crucial challenge to inferring the distribution of fitness effects (DFE) from natural genetic variation is the sensitivity of the site frequency spectrum to factors like population size change, population substructure, and non-random mating. Although inference methods aim to control for population size changes, the influence of non-random mating remains incompletely understood, despite being a common feature of many species. We report the distribution of fitness effects estimated from 326 genomes of Caenorhabditis elegans, a nematode roundworm with a high rate of self-fertilization. We evaluate the robustness of DFE inferences using simulated data that mimic the genomic structure and reproductive life history of C. elegans. Our observations demonstrate how the combined influence of self-fertilization, genome structure, and natural selection can conspire to compromise estimates of the DFE from extant polymorphisms. These factors together tend to bias inferences towards weakly deleterious mutations, making it challenging to have full confidence in the inferred DFE of new mutations as deduced from standing genetic variation in species like C. elegans. Improved methods for inferring the distribution of fitness effects are needed to appropriately handle strong linked selection and selfing. These results highlight the importance of understanding the combined effects of processes that can bias our interpretations of evolution in natural populations.


2016 ◽  
Author(s):  
Paula Tataru ◽  
Maéva Mollion ◽  
Sylvain Glemin ◽  
Thomas Bataillon

ABSTRACT: The distribution of fitness effects (DFE) encompasses deleterious, neutral and beneficial mutations. It conditions the evolutionary trajectory of populations, as well as the rate of adaptive molecular evolution (α). Inference of DFE and α from patterns of polymorphism (the site frequency spectrum, SFS) and divergence data has been a longstanding goal of evolutionary genetics. A widespread assumption shared by numerous methods developed so far to infer DFE and α from such data is that beneficial mutations contribute only negligibly to the polymorphism data. Hence, a DFE comprising only deleterious mutations tends to be estimated from SFS data, and α is only predicted by contrasting the SFS with divergence data from an outgroup. Here, we develop a hierarchical probabilistic framework that extends previous methods and can also infer DFE and α from polymorphism data alone. We use extensive simulations to examine the performance of our method. We show that both a full DFE, comprising both deleterious and beneficial mutations, and α can be inferred without resorting to divergence data. We demonstrate that inference of DFE from polymorphism data alone can in fact provide more reliable estimates, as it does not rely on strong assumptions about a shared DFE between the outgroup and ingroup species used to obtain the SFS and divergence data. We also show that not accounting for the contribution of beneficial mutations to polymorphism data leads to substantially biased estimates of the DFE and α. We illustrate these points using our newly developed framework, while also comparing to one of the most widely used inference methods available.


2018 ◽  
Author(s):  
Christelle Fraïsse ◽  
Camille Roux ◽  
Pierre-Alexandre Gagnaire ◽  
Jonathan Romiguier ◽  
Nicolas Faivre ◽  
...  

Abstract: Genome-scale diversity data are increasingly available in a variety of biological systems, and can be used to reconstruct the past evolutionary history of species divergence. However, extracting the full demographic information from these data is not trivial, and requires inferential methods that account for the diversity of coalescent histories throughout the genome. Here, we evaluate the potential and limitations of one such approach. We reexamine a well-known system of mussel sister species, using the joint site frequency spectrum (jSFS) of synonymous mutations computed either from exome capture or RNA-seq, in an Approximate Bayesian Computation (ABC) framework. We first assess the best sampling strategy (number of individuals, loci, and bins in the jSFS), and show that model selection is robust to variation in the number of individuals and loci. In contrast, different binning choices when summarizing the joint site frequency spectrum strongly affect the results: including classes of low and high frequency shared polymorphisms can more effectively reveal recent migration events. We then take advantage of the flexibility of ABC to compare more realistic models of speciation, including variation in migration rates through time (i.e. periodic connectivity) and across genes (i.e. genome-wide heterogeneity in migration rates). We show that these models were consistently selected as the most probable, suggesting that mussels have experienced a complex history of gene flow during divergence and that the species boundary is semi-permeable. Our work provides a comprehensive evaluation of ABC demographic inference in mussels based on the coding site frequency spectrum, and supplies guidelines for employing different sequencing techniques and sampling strategies. We emphasize, perhaps surprisingly, that inferences are less limited by the volume of data than by the way in which they are analyzed.
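For readers unfamiliar with the joint site frequency spectrum mentioned above, the following minimal sketch shows how one could be tabulated from per-site derived-allele counts in two populations. The function name `joint_sfs` and the toy counts are assumptions for illustration; this is not the authors' pipeline, and it does not reproduce their binning scheme.

```python
def joint_sfs(counts_a, counts_b, n_a, n_b):
    """Tabulate a joint SFS for two populations.

    counts_a[k] and counts_b[k] are the derived-allele counts at site k in
    samples of n_a and n_b chromosomes. Entry [i][j] of the result is the
    number of sites with i derived copies in A and j derived copies in B.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both populations must cover the same sites")
    sfs = [[0] * (n_b + 1) for _ in range(n_a + 1)]
    for i, j in zip(counts_a, counts_b):
        if not (0 <= i <= n_a and 0 <= j <= n_b):
            raise ValueError("derived count outside sample size")
        sfs[i][j] += 1
    return sfs


# Toy data: four sites, two chromosomes sampled per population.
for row in joint_sfs([0, 1, 2, 1], [1, 1, 0, 1], n_a=2, n_b=2):
    print(row)
```

Coarser summaries, such as pooling all shared polymorphisms into a few frequency classes, can be obtained by summing blocks of this matrix, which is the kind of binning choice the abstract reports as influential.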


Genetics ◽  
2001 ◽  
Vol 158 (2) ◽  
pp. 657-665 ◽  
Author(s):  
Peter Andolfatto ◽  
Molly Przeworski

Abstract: A correlation between diversity levels and rates of recombination is predicted both by models of positive selection, such as hitchhiking associated with the rapid fixation of advantageous mutations, and by models of purifying selection against strongly deleterious mutations (commonly referred to as “background selection”). With parameter values appropriate for Drosophila populations, only the first class of models predicts a marked skew in the frequency spectrum of linked neutral variants, relative to a neutral model. Here, we consider 29 loci scattered throughout the Drosophila melanogaster genome. We show that, in African populations, a summary of the frequency spectrum of polymorphic mutations is positively correlated with the meiotic rate of crossing over. This pattern is demonstrated to be unlikely under a model of background selection. Models of weakly deleterious selection are not expected to produce both the observed correlation and the extent to which nucleotide diversity is reduced in regions of low (but nonzero) recombination. Thus, of existing models, hitchhiking due to the recurrent fixation of advantageous variants is the most plausible explanation for the data.


Science ◽  
2019 ◽  
Vol 366 (6464) ◽  
pp. 490-493 ◽  
Author(s):  
Milo S. Johnson ◽  
Alena Martsul ◽  
Sergey Kryazhimskiy ◽  
Michael M. Desai

Natural selection drives populations toward higher fitness, but second-order selection for adaptability and mutational robustness can also influence evolution. In many microbial systems, diminishing-returns epistasis contributes to a tendency for more-fit genotypes to be less adaptable, but no analogous patterns for robustness are known. To understand how robustness varies across genotypes, we measure the fitness effects of hundreds of individual insertion mutations in a panel of yeast strains. We find that more-fit strains are less robust: They have distributions of fitness effects with lower mean and higher variance. These differences arise because many mutations have more strongly deleterious effects in faster-growing strains. This negative correlation between fitness and robustness implies that second-order selection for robustness will tend to conflict with first-order selection for fitness.


2007 ◽  
Vol 8 (8) ◽  
pp. 610-618 ◽  
Author(s):  
Adam Eyre-Walker ◽  
Peter D. Keightley

2010 ◽  
Vol 7 (1) ◽  
pp. 98-100 ◽  
Author(s):  
Michael J. McDonald ◽  
Tim F. Cooper ◽  
Hubertus J. E. Beaumont ◽  
Paul B. Rainey

Theoretical studies of adaptation emphasize the importance of understanding the distribution of fitness effects (DFE) of new mutations. We report the isolation of 100 adaptive mutants—without the biasing influence of natural selection—from an ancestral genotype whose fitness in the niche occupied by the derived type is extremely low. The fitness of each derived genotype was determined relative to a single reference type and the fitness effects found to conform to a normal distribution. When fitness was measured in a different environment, the rank order changed, but not the shape of the distribution. We argue that, even with detailed knowledge of the genetic architecture underpinning the adaptive types (as is the case here), the DFEs remain unpredictable, and we discuss the possibility that general explanations for the shape of the DFE might not be possible in the absence of organism-specific biological details.

