A phylogenetic estimator of effective population size or mutation rate.

Genetics ◽  
1994 ◽  
Vol 136 (2) ◽  
pp. 685-692 ◽  
Author(s):  
Y X Fu

Abstract A new estimator of the essential parameter theta = 4Ne mu from DNA polymorphism data is developed under the neutral Wright-Fisher model without recombination or population subdivision, where Ne is the effective population size and mu is the mutation rate per locus per generation. The variance of the new estimator is only slightly larger than the minimum variance of all possible unbiased estimators of the parameter and substantially smaller than that of any existing estimator. The high efficiency of the new estimator is achieved by making full use of phylogenetic information in a sample of DNA sequences from a population. An example of estimating theta by the new method is presented using the mitochondrial sequences from an American Indian population.

Genetics ◽  
1996 ◽  
Vol 144 (3) ◽  
pp. 1271-1281
Author(s):  
Hong-Wen Deng ◽  
Yun-Xin Fu

Abstract Multiple hits at some sites of human mitochondrial DNA sequences suggest that the commonly assumed infinite-sites model can be violated. Under the neutral Wright-Fisher model without recombination or population subdivision, we investigated, by computer simulations, the effect of multiple hits on the estimation of the essential parameter θ = 4Neμ by Fu's UPBLUE procedure. We found that with moderate mutation rate heterogeneity, UPBLUE performs very well in terms of unbiasedness and efficiency. Under extreme mutation rate heterogeneity, UPBLUE is still very satisfactory if the sample size is reasonably large (e.g., >60); for smaller samples, we developed a new correction equation. Given knowledge of the degree of mutation rate heterogeneity, UPBLUE with the new correction equation performed fairly satisfactorily in tests: there is almost no bias, and the sampling variance is only slightly higher than the theoretical minimum variance. Thus, with an appropriate correction, UPBLUE is relatively robust to multiple hits. In genealogies reconstructed by UPGMA, we found that the total length of branches directly linked to the tips is underestimated and that of branches far from the tips tends to be overestimated, while the total length of all branches is not biased.


Genetics ◽  
1997 ◽  
Vol 146 (4) ◽  
pp. 1489-1499 ◽  
Author(s):  
Yun-Xin Fu

A coalescent theory for a sample of DNA sequences from a partially selfing diploid population and an algorithm for simulating such samples are developed in this article. Approximate formulas are given for the expectation and the variance of the number of segregating sites in a sample of k sequences from n individuals. Several new estimators of the important parameters θ = 4Nμ and the selfing rate s, where N and μ are, respectively, the effective population size and the mutation rate per sequence per generation, are proposed and their sampling properties are studied.


Genetics ◽  
1997 ◽  
Vol 145 (3) ◽  
pp. 833-846 ◽  
Author(s):  
Jody Hey ◽  
John Wakeley

Population genetic models often use a population recombination parameter 4Nc, where N is the effective population size and c is the recombination rate per generation. In many ways 4Nc is comparable to 4Nu, the population mutation rate. Both combine genome-level and population-level processes, and together they describe the rate of production of genetic variation in a population. However, 4Nc is more difficult to estimate. For a population sample of DNA sequences, historical recombination can only be detected if polymorphisms exist, and even then most recombination events are not detectable. This paper describes an estimator of 4Nc, hereafter designated γ (gamma), that was developed using a coalescent model for a sample of four DNA sequences with recombination. The reliability of γ was assessed using multiple coalescent simulations. In general γ has low to moderate bias, and its reliability is comparable to, though lower than, that of a widely used estimator of 4Nu. If there exists an independent estimate of the recombination rate (per generation, per base pair), γ can be used to estimate the effective population size or the neutral mutation rate.
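The last sentence of the abstract above can be illustrated with a small sketch. It is not the authors' estimator of γ. It only shows the arithmetic of turning an existing estimate of 4Nc into N, assuming γ is reported for a whole locus and the independent recombination rate is per base pair. The function names and the example numbers are hypothetical. The mutation-rate step additionally assumes that an estimate of 4Nu for the same locus is available, so that the ratio of the two estimates cancels N.

```python
def effective_size_from_gamma(gamma_locus: float, c_per_bp: float, length_bp: int) -> float:
    """Solve gamma = 4 * N * c for N, where c = c_per_bp * length_bp (assumed per-locus rate)."""
    if c_per_bp <= 0 or length_bp <= 0:
        raise ValueError("recombination rate and length must be positive")
    return gamma_locus / (4.0 * c_per_bp * length_bp)


def mutation_rate_from_ratio(theta_locus: float, gamma_locus: float, c_per_bp: float) -> float:
    """Per-bp mutation rate from theta / gamma = u / c (N cancels); assumes both
    estimates refer to the same locus."""
    if gamma_locus <= 0:
        raise ValueError("gamma must be positive")
    return c_per_bp * theta_locus / gamma_locus


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
n_hat = effective_size_from_gamma(gamma_locus=4.0, c_per_bp=1e-8, length_bp=10_000)
u_hat = mutation_rate_from_ratio(theta_locus=8.0, gamma_locus=4.0, c_per_bp=1e-8)
print(n_hat, u_hat)
```

With these made-up inputs the implied effective size is about 10,000 and the implied per-bp mutation rate is twice the recombination rate, because the assumed 4Nu estimate is twice γ.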


2010 ◽  
Vol 107 (5) ◽  
pp. 2147-2152 ◽  
Author(s):  
Chad D. Huff ◽  
Jinchuan Xing ◽  
Alan R. Rogers ◽  
David Witherspoon ◽  
Lynn B. Jorde

The genealogies of different genetic loci vary in depth. The deeper the genealogy, the greater the chance that it will include a rare event, such as the insertion of a mobile element. Therefore, the genealogy of a region that contains a mobile element is on average older than that of the rest of the genome. In a simple demographic model, the expected time to most recent common ancestor (TMRCA) is doubled if a rare insertion is present. We test this expectation by examining single nucleotide polymorphisms around polymorphic Alu insertions from two completely sequenced human genomes. The estimated TMRCA for regions containing a polymorphic insertion is two times larger than the genomic average (P ≪ 10^−30), as predicted. Because genealogies that contain polymorphic mobile elements are old, they are shaped largely by the forces of ancient population history and are insensitive to recent demographic events, such as bottlenecks and expansions. Remarkably, the information in just two human DNA sequences provides substantial information about ancient human population size. By comparing the likelihood of various demographic models, we estimate that the effective population size of human ancestors living before 1.2 million years ago was 18,500, and we can reject all models where the ancient effective population size was larger than 26,000. This result implies an unusually small population for a species spread across the entire Old World, particularly in light of the effective population sizes of chimpanzees (21,000) and gorillas (25,000), which each inhabit only one part of a single continent.


Author(s):  
Bruce Walsh ◽  
Michael Lynch

Analyses of the effects of genetic drift usually assume an idealized population of constant size. This chapter shows how the population size for such an idealized population can be replaced with an effective population size for populations with age structure, unequal sex ratios, a history of expansion or contraction, inbreeding, and population subdivision. These demographic features impact the entire genome more or less equally. A relatively recent understanding is that selection at a site can dramatically reduce the local effective population size experienced by nearby linked sites (the Hill-Robertson effect). This can arise from background selection to remove deleterious new mutations or from selective sweeps wherein favorable new mutations are driven toward fixation. The Hill-Robertson effect is a general way to describe the fact that selection at a site makes selection at other linked sites less efficient, and, therefore, more neutral. This chapter discusses the implications of this finding for genome structure.


Genetics ◽  
1977 ◽  
Vol 85 (2) ◽  
pp. 331-337
Author(s):  
Wen-Hsiung Li

ABSTRACT Watterson's (1975) formula for the steady-state distribution of the number of nucleotide differences between two randomly chosen cistrons in a finite population has been extended to transient states. The rate for the mean of this distribution to approach its equilibrium value is 1/(2N), where N denotes the effective population size, and is independent of mutation rate, but that for the variance is dependent on mutation rate. Numerical computations show that if the heterozygosity (i.e., the probability that two cistrons are different) is low, say of the order of 0.1 or less, the probability that two cistrons differ at two or more nucleotide sites is less than 10 percent of the heterozygosity, whereas this probability may be as high as 50 percent of the heterozygosity if the heterozygosity is 0.5. A simple estimate for the mean number (d) of site differences between cistrons is d = h/(1 - h), where h is the heterozygosity. At equilibrium, the probability that two cistrons differ by more than one site is equal to h^2, the square of the heterozygosity.
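The two closed-form results quoted at the end of this abstract are easy to check numerically. The following is a minimal sketch of that arithmetic only, with hypothetical function names. It does not reproduce the transient-state computations of the paper.

```python
def mean_site_differences(h: float) -> float:
    """Simple estimate d = h / (1 - h) of the mean number of site differences."""
    if not 0.0 <= h < 1.0:
        raise ValueError("heterozygosity h must satisfy 0 <= h < 1")
    return h / (1.0 - h)


def prob_more_than_one_difference(h: float) -> float:
    """Equilibrium probability that two cistrons differ at more than one site: h squared."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("heterozygosity h must satisfy 0 <= h <= 1")
    return h * h


for h in (0.1, 0.5):
    d = mean_site_differences(h)
    p2 = prob_more_than_one_difference(h)
    # p2 / h is the share of heterozygosity due to pairs differing at 2+ sites.
    print(f"h={h}: d={d:.4f}, P(>1 site)={p2:.4f}, share of h={p2 / h:.2f}")
```

At equilibrium the share of the heterozygosity due to multiple differences is h^2 / h = h, which agrees with the 10 percent and 50 percent figures quoted for h = 0.1 and h = 0.5.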


Genetics ◽  
1994 ◽  
Vol 138 (4) ◽  
pp. 1375-1386 ◽  
Author(s):  
Y X Fu

Abstract Mutations resulting in segregating sites of a sample of DNA sequences can be classified by size and type and the frequencies of mutations of different sizes and types can be inferred from the sample. A framework for estimating the essential parameter theta = 4N mu utilizing the frequencies of mutations of various sizes and types is developed in this paper, where N is the effective size of a population and mu is the mutation rate per sequence per generation. The framework is a combination of coalescent theory, general linear model and Monte-Carlo integration, which leads to two new estimators theta xi and theta eta as well as a general Watterson's estimator theta K and a general Tajima's estimator theta tau. The greatest strength of the framework is that it can be used under a variety of population models. The properties of the framework and the four estimators theta K, theta tau, theta xi and theta eta are investigated under three important population models: the neutral Wright-Fisher model, the neutral model with recombination and the neutral Wright's finite-islands model. Under all these models, it is shown that theta xi is the best estimator among the four even when recombination rate or migration rate has to be estimated. Under the neutral Wright-Fisher model, it is shown that the new estimator theta xi has a variance close to a lower bound of variances of all unbiased estimators of theta, which suggests that theta xi is a very efficient estimator.
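For orientation, the classical forms of the two older estimators named in this abstract can be sketched in a few lines. This is not the generalized framework of the paper and not theta xi or theta eta. It is the textbook Watterson estimator (segregating sites divided by a harmonic sum) and the textbook Tajima estimator (mean pairwise differences), both per sequence and assuming aligned sequences of equal length under the infinite-sites model. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def segregating_sites(seqs: list[str]) -> int:
    """Count alignment columns that show more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def watterson_theta(seqs: list[str]) -> float:
    """Classical Watterson estimator: S / a_n with a_n = sum_{i=1}^{n-1} 1/i."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a_n


def tajima_theta(seqs: list[str]) -> float:
    """Classical Tajima estimator: mean number of pairwise differences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


# Toy alignment (made up): 4 sequences, 3 segregating sites.
sample = ["AAAA", "AAAT", "AATT", "ATTT"]
print(watterson_theta(sample))  # 3 / (1 + 1/2 + 1/3) = 18/11
print(tajima_theta(sample))     # 10 differences over 6 pairs = 5/3
```

Both functions use only summary counts of the sample, which is the kind of information the abstract's framework extends by also using the sizes and types of mutations.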

