infinite alleles model
Recently Published Documents


2021 ◽  
Author(s):  
Marc Manceau

The SARS-CoV-2 outbreak started in late 2019 in the Hubei province of China, and the first viral sequence was made available to the scientific community in early January 2020. Since then, viral genomes from all over the world have followed at an outstanding rate, already exceeding 10^5 by early May 2020 and 10^6 by early March 2021. Phylodynamic methods have been designed in recent years to process such datasets and infer past population dynamics and sampling intensities. However, the unprecedented scale of the SARS-CoV-2 dataset now calls for new methodological developments, relying e.g. on simplifying assumptions about the mutation process. In this article, I build on the infinite alleles model from the field of population genetics to develop a new Bayesian statistical method allowing the joint reconstruction of the outbreak's effective population sizes and sampling intensities through time. This relies on prior conjugacy properties that prove useful both to develop a Gibbs sampler and to gain intuition on the way different parameters of the model are linked and inferred. I finally illustrate the use of this method on SARS-CoV-2 genomes sequenced during the first wave of the outbreak in four distinct European countries, thus offering a new perspective on the evolution of the sampling intensity through time in these countries from genetic data only.


2015 ◽  
Author(s):  
Benjamin D Redelings ◽  
Seiji Kumagai ◽  
Liuyang Wang ◽  
Andrey Tatarenkov ◽  
Ann K. Sakai ◽  
...  

We present a Bayesian method for characterizing the mating system of populations reproducing through a mixture of self-fertilization and random outcrossing. Our method uses patterns of genetic variation across the genome as a basis for inference about pure hermaphroditism, androdioecy, and gynodioecy. We extend the standard coalescence model to accommodate these mating systems, accounting explicitly for multilocus identity disequilibrium, inbreeding depression, and variation in fertility among mating types. We incorporate the Ewens Sampling Formula (ESF) under the infinite-alleles model of mutation to obtain a novel expression for the likelihood of mating system parameters. Our Markov chain Monte Carlo (MCMC) algorithm assigns locus-specific mutation rates, drawn from a common mutation rate distribution that is itself estimated from the data using a Dirichlet Process Prior model. Among the parameters jointly inferred are the population-wide rate of self-fertilization, locus-specific mutation rates, and the number of generations since the most recent outcrossing event for each sampled individual.
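
As a rough illustration of the Ewens Sampling Formula mentioned in this abstract, the sketch below evaluates the standard single-locus ESF probability of an allele-count configuration under the infinite-alleles model. It is not the authors' likelihood, which additionally accounts for selfing and other mating-system parameters; the function names `rising_factorial` and `ewens_probability` and the parameter name `theta` (scaled mutation rate) are assumptions made for this example.

```python
from math import factorial


def rising_factorial(theta, n):
    """Return theta * (theta + 1) * ... * (theta + n - 1)."""
    result = 1.0
    for i in range(n):
        result *= theta + i
    return result


def ewens_probability(counts, theta):
    """Standard Ewens Sampling Formula.

    counts[j - 1] is a_j, the number of allele types seen exactly j times
    in the sample; theta is the scaled mutation rate (assumed > 0).
    """
    # Sample size implied by the configuration: n = sum_j j * a_j.
    n = sum(j * a for j, a in enumerate(counts, start=1))
    term = 1.0
    for j, a in enumerate(counts, start=1):
        term *= theta ** a / (j ** a * factorial(a))
    return factorial(n) * term / rising_factorial(theta, n)


if __name__ == "__main__":
    theta = 2.5
    # The three possible configurations for a sample of size 3.
    configs = [[3, 0, 0], [1, 1, 0], [0, 0, 1]]
    for c in configs:
        print(c, ewens_probability(c, theta))
```

For a sample of two genes, the configuration `[0, 1]` (both genes share one allele) has probability 1 / (1 + theta), which gives a quick sanity check on the implementation.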


2015 ◽  
Author(s):  
Jeremy J Berg ◽  
Graham Coop

The use of genetic polymorphism data to understand the dynamics of adaptation and identify the loci that are involved has become a major pursuit of modern evolutionary genetics. In addition to the classical "hard sweep" hitchhiking model, recent research has drawn attention to the fact that the dynamics of adaptation can play out in a variety of different ways, and that the specific signatures left behind in population genetic data may depend somewhat strongly on these dynamics. One particular model for which a large number of empirical examples are already known is that in which a single derived mutation arises and drifts to some low frequency before an environmental change causes the allele to become beneficial and sweep to fixation. Here, we pursue an analytical investigation of this model, bolstered and extended via simulation study. We use coalescent theory to develop an analytical approximation for the effect of a sweep from standing variation on the genealogy at the locus of the selected allele and sites tightly linked to it. We show that the distribution of haplotypes that the selected allele is present on at the time of the environmental change can be approximated by considering recombinant haplotypes as alleles in the infinite alleles model. We show that this approximation can be leveraged to make accurate predictions regarding patterns of genetic polymorphism following such a sweep. We then use simulations to highlight which sources of haplotypic information are likely to be most useful in distinguishing this model from neutrality, as well as from other sweep models, such as the classic hard sweep and multiple-mutation soft sweeps. We find that in general, adaptation from a uniquely derived standing variant will be difficult to detect on the basis of genetic polymorphism data alone, and when it can be detected, it will be difficult to distinguish from other varieties of selective sweeps.


2012 ◽  
Vol 44 (2) ◽  
pp. 408-428 ◽  
Author(s):  
Anand Bhaskar ◽  
John A. Kamm ◽  
Yun S. Song

Many applications in genetic analyses utilize sampling distributions, which describe the probability of observing a sample of DNA sequences randomly drawn from a population. In the one-locus case with special models of mutation, such as the infinite-alleles model or the finite-alleles parent-independent mutation model, closed-form sampling distributions under the coalescent have been known for many decades. However, no exact formula is currently known for more general models of mutation that are of biological interest. In this paper, models with finitely-many alleles are considered, and an urn construction related to the coalescent is used to derive approximate closed-form sampling formulae for an arbitrary irreducible recurrent mutation model or for a reversible recurrent mutation model, depending on whether the number of distinct observed allele types is at most three or four, respectively. It is demonstrated empirically that the formulae derived here are highly accurate when the per-base mutation rate is low, which holds for many biological organisms.
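
For readers unfamiliar with urn constructions tied to the coalescent, the sketch below simulates the classical Hoppe urn, which generates samples whose allele counts follow the infinite-alleles sampling distribution. This is not the construction used in the paper, which addresses finite-alleles mutation models; it is shown only as the simplest known example of the idea. The function name `hoppe_urn_sample` and its parameters are assumptions made for this example.

```python
import random


def hoppe_urn_sample(n, theta, rng):
    """Simulate allele counts for n genes with the classical Hoppe urn.

    theta is the scaled mutation rate (assumed > 0). Returns a list in
    which entry k is the number of sampled genes carrying the k-th
    allele type, in order of first appearance.
    """
    counts = []
    for i in range(n):
        # With i genes already drawn, the next gene is a new allele type
        # with probability theta / (theta + i); the first draw (i = 0)
        # therefore always creates a new type.
        if rng.random() < theta / (theta + i):
            counts.append(1)
        else:
            # Otherwise copy the type of a uniformly chosen earlier gene,
            # i.e. pick an existing type with weight equal to its count.
            k = rng.choices(range(len(counts)), weights=counts)[0]
            counts[k] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    sample = hoppe_urn_sample(50, 1.5, rng)
    print(len(sample), "allele types; counts:", sample)
```

Larger values of `theta` tend to produce more distinct allele types for the same sample size, since new types are created more often.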


Genetics ◽  
2003 ◽  
Vol 165 (3) ◽  
pp. 1475-1488
Author(s):  
V Vaughan Symonds ◽  
Alan M Lloyd

Microsatellite loci are among the most commonly used molecular markers. These loci typically exhibit variation for allele frequency distribution within a species. However, the factors contributing to this variation are not well understood. To expand on the current knowledge of microsatellite evolution, 20 microsatellite loci were examined for 126 accessions of the flowering plant, Arabidopsis thaliana. Substantial variability in mutation pattern among loci was found, most of which cannot be explained by the assumptions of the traditional stepwise mutation model or infinite alleles model. Here it is shown that the degree of locus diversity is strongly correlated with the number of contiguous repeats, more so than with the total number of repeats. These findings support a strong role for repeat disruptions in stabilizing microsatellite loci by reducing the substrate for polymerase slippage and recombination. Results of cluster analyses are also presented, demonstrating the potential of microsatellite loci for resolving relationships among accessions of A. thaliana.


2003 ◽  
Vol 35 (3) ◽  
pp. 665-690
Author(s):  
Hilde M. Wilkinson-Herbots

The structured coalescent is a continuous-time Markov chain which describes the genealogy of a sample of homologous genes from a subdivided population. Assuming this model, some results are proved relating to the genealogy of a pair of genes and the extent of subpopulation differentiation, which are valid under certain graph-theoretic symmetry and regularity conditions on the structure of the population. We first review and extend earlier results stating conditions under which the mean time since the most recent common ancestor of a pair of genes from any single subpopulation is independent of the migration rate and equal to that of two genes from an unstructured population of the same total size. Assuming the infinite alleles model of neutral mutation with a small mutation rate, we then prove a simple relationship between the migration rate and the value of Wright's coefficient F_ST for a pair of neighbouring subpopulations, which does not depend on the precise structure of the population provided that this is sufficiently symmetric.


2003 ◽  
Vol 13 (1) ◽  
pp. 181-212 ◽  
Author(s):  
Paul Joyce ◽  
Stephen M. Krone ◽  
Thomas G. Kurtz
