Inferring population size history from large samples of genome wide molecular data - an approximate Bayesian computation approach

2016 ◽  
Author(s):  
Simon Boitard ◽  
Willy Rodríguez ◽  
Flora Jay ◽  
Stefano Mona ◽  
Frédéric Austerlitz

Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.
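As a hedged illustration of one of the two summary-statistic classes named in the abstract above, the following minimal sketch computes a folded allele frequency spectrum from unphased, unpolarized diploid genotypes. The function name `folded_afs` and the 0/1/2 genotype encoding are assumptions made for this example; it is not the PopSizeABC implementation.

```python
from collections import Counter
from typing import List, Sequence


def folded_afs(genotypes: Sequence[Sequence[int]]) -> List[int]:
    """Folded allele frequency spectrum for a sample of diploid individuals.

    genotypes[s][i] is the number of copies (0, 1 or 2) of an arbitrarily
    chosen allele at SNP s in individual i (hypothetical encoding).
    Returns a list whose entry k-1 is the number of SNPs at which the
    minor allele is carried by exactly k of the 2n sampled chromosomes.
    """
    if not genotypes:
        return []
    n_ind = len(genotypes[0])
    n_chrom = 2 * n_ind
    counts: Counter = Counter()
    for snp in genotypes:
        allele_count = sum(snp)
        # Folding: without knowledge of the ancestral state, only the
        # count of the less frequent allele is informative.
        minor_count = min(allele_count, n_chrom - allele_count)
        if minor_count > 0:  # skip monomorphic sites
            counts[minor_count] += 1
    return [counts[k] for k in range(1, n_ind + 1)]
```

Because the spectrum is folded, the result does not depend on which allele was coded as the counted one, which is why phasing and polarization are not needed for this statistic.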

Author(s):  
Théophile Sanchez ◽  
Jean Cury ◽  
Guillaume Charpiat ◽  
Flora Jay

Abstract: For the past decades, simulation-based likelihood-free inference methods have enabled researchers to address numerous population genetics problems. As the richness and amount of simulated and real genetic data keep increasing, the field has a strong opportunity to tackle tasks that current methods hardly solve. However, high data dimensionality forces most methods to summarize large genomic datasets into a relatively small number of handcrafted features (summary statistics). Here we propose an alternative to summary statistics, based on the automatic extraction of relevant information using deep learning techniques. Specifically, we design artificial neural networks (ANNs) that take as input single nucleotide polymorphic sites (SNPs) found in individuals sampled from a single population and infer the past effective population size history. First, we provide guidelines to construct artificial neural networks that comply with the intrinsic properties of SNP data such as invariance to permutation of haplotypes, long scale interactions between SNPs and variable genomic length. Thanks to a Bayesian hyperparameter optimization procedure, we evaluate the performance of multiple networks and compare them to well-established methods like Approximate Bayesian Computation (ABC). Even without the expert knowledge of summary statistics, our approach compares fairly well to an ABC based on handcrafted features. Furthermore, we show that combining deep learning and ABC can improve performance while taking advantage of both frameworks. Finally, we apply our approach to reconstruct the effective population size history of cattle breed populations.
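The permutation-invariance property mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' architecture: the same small feature map is applied to every haplotype row, and the rows are then averaged, so reordering the haplotypes cannot change the output. The names `row_embedding` and `invariant_summary`, and the weights used with them, are hypothetical.

```python
from typing import List, Sequence


def row_embedding(
    haplotype: Sequence[int],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Apply one shared linear layer followed by ReLU to a single haplotype."""
    return [
        max(0.0, sum(w * x for w, x in zip(w_row, haplotype)) + b)
        for w_row, b in zip(weights, biases)
    ]


def invariant_summary(
    haplotypes: Sequence[Sequence[int]],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Average the shared embedding over haplotypes (rows).

    Mean pooling is a symmetric function of its inputs, so the result is
    identical for any ordering of the haplotypes.
    """
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    embeddings = [row_embedding(h, weights, biases) for h in haplotypes]
    n = len(embeddings)
    return [sum(e[k] for e in embeddings) / n for k in range(len(biases))]
```

Any symmetric pooling (sum, max) would give the same invariance; the mean is used here only because it also keeps the output scale independent of the sample size.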


1984 ◽  
Vol 44 (3) ◽  
pp. 321-341 ◽  
Author(s):  
P. J. Avery

Summary: From the available electrophoretic data, it is clear that haplodiploid insects have a much lower level of genetic variability than diploid insects, a difference that is only partially explained by the social structure of some haplodiploid species. The data comparing X-linked genes and autosomal genes in the same species is much more sparse and little can be inferred from it. This data is compared with theoretical analyses of X-linked genes and genes in haplodiploids. (The theoretical population genetics of X-linked genes and genes in haplodiploids are identical.) X-linked genes under directional selection will be lost or fixed more quickly than autosomal genes as selection acts more directly on X-linked genes and the effective population size is smaller. However, deleterious disease genes, maintained by mutation pressure, will give higher disease incidences at X-linked loci and hence rare mutants are easier to detect at X-linked loci. Considering the forces which can maintain balanced polymorphisms, there are much stronger restrictions on the fitness parameters at X-linked loci than at autosomal loci if genetic variability is to be maintained, and thus fewer polymorphic loci are to be expected on the X-chromosome and in haplodiploids. However, the mutation-random drift hypothesis also leads to the expectation of lower heterozygosity due to the decrease in effective population size. Thus the theoretical results fit in with the data but it is still subject to argument whether selection or mutation-random drift are maintaining most of the genetic variability at X-linked genes and genes in haplodiploids.


2010 ◽  
Vol 107 (5) ◽  
pp. 2147-2152 ◽  
Author(s):  
Chad D. Huff ◽  
Jinchuan Xing ◽  
Alan R. Rogers ◽  
David Witherspoon ◽  
Lynn B. Jorde

The genealogies of different genetic loci vary in depth. The deeper the genealogy, the greater the chance that it will include a rare event, such as the insertion of a mobile element. Therefore, the genealogy of a region that contains a mobile element is on average older than that of the rest of the genome. In a simple demographic model, the expected time to most recent common ancestor (TMRCA) is doubled if a rare insertion is present. We test this expectation by examining single nucleotide polymorphisms around polymorphic Alu insertions from two completely sequenced human genomes. The estimated TMRCA for regions containing a polymorphic insertion is two times larger than the genomic average (P ≪ 10^−30), as predicted. Because genealogies that contain polymorphic mobile elements are old, they are shaped largely by the forces of ancient population history and are insensitive to recent demographic events, such as bottlenecks and expansions. Remarkably, the information in just two human DNA sequences provides substantial information about ancient human population size. By comparing the likelihood of various demographic models, we estimate that the effective population size of human ancestors living before 1.2 million years ago was 18,500, and we can reject all models where the ancient effective population size was larger than 26,000. This result implies an unusually small population for a species spread across the entire Old World, particularly in light of the effective population sizes of chimpanzees (21,000) and gorillas (25,000), which each inhabit only one part of a single continent.
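The doubling of the expected TMRCA stated in the abstract above can be reproduced with a short worked calculation. The setting below is an assumption chosen for illustration and may differ from the model used in the paper: two sequences, a constant diploid effective size N, a coalescence time T that is exponential with mean 2N generations, and an insertion rate small enough that the probability of an insertion on the genealogy is proportional to its depth.

```latex
% Assumptions (not taken from the source abstract): sample of two sequences,
% constant diploid effective size N, T ~ Exponential with mean 2N generations,
% and P(insertion | T = t) proportional to t for a rare insertion.
\begin{align}
f(t) &= \frac{1}{2N}\, e^{-t/(2N)}, \qquad
\mathbb{E}[T] = 2N, \qquad
\mathbb{E}[T^{2}] = 2\,(2N)^{2} = 8N^{2}, \\
f(t \mid \text{insertion}) &= \frac{t\, f(t)}{\mathbb{E}[T]}, \\
\mathbb{E}[T \mid \text{insertion}] &= \frac{\mathbb{E}[T^{2}]}{\mathbb{E}[T]}
  = \frac{8N^{2}}{2N} = 4N = 2\,\mathbb{E}[T].
\end{align}
```

Conditioning on the insertion weights each genealogy by its depth, and for an exponential distribution this size-biased mean is exactly twice the unconditional mean.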


2018 ◽  
Author(s):  
Ariella L. Gladstein ◽  
Michael F. Hammer

The Ashkenazi Jews (AJ) are a population isolate that has resided in Central Europe since at least the 10th century and shares ancestry with both European and Middle Eastern populations. Between the 11th and 16th centuries, AJ expanded eastward, leading to two culturally distinct communities, one in central Europe and one in eastern Europe. Our aim was to determine if there are genetically distinct AJ subpopulations that reflect the cultural groups, and if so, what demographic events contributed to the population differentiation. We used Approximate Bayesian Computation (ABC) to choose among models of AJ history and infer demographic parameter values, including divergence times, effective population size, and gene flow. For the ABC analysis we used allele frequency spectrum and identity-by-descent-based statistics to capture information on a wide timescale. We also mitigated the effects of ascertainment bias when performing ABC on SNP array data by jointly modeling and inferring the SNP discovery. We found that the most likely model was population differentiation between the Eastern and Western AJ ~400 years ago. The differentiation between the Eastern and Western AJ could be attributed to more extreme population growth in the Eastern AJ (0.25 per generation) than the Western AJ (0.069 per generation).


2021 ◽  
Author(s):  
Dominik Deffner ◽  
Anne Kandler ◽  
Laurel Fogarty

Abstract: Population size has long been considered an important driver of cultural diversity and complexity. Results from population genetics, however, demonstrate that in populations with complex demographic structure or mode of inheritance, it is not the census population size, N, but the effective size of a population, Ne, that determines important evolutionary parameters. Here, we examine the concept of effective population size for traits that evolve culturally, through processes of innovation and social learning. We use mathematical and computational modeling approaches to investigate how cultural Ne and levels of diversity depend on (1) the way traits are learned, (2) population connectedness, and (3) social network structure. We show that one-to-many and frequency-dependent transmission can temporarily or permanently lower effective population size compared to census numbers. We caution that migration and cultural exchange can have counter-intuitive effects on Ne. Network density in random networks leaves Ne unchanged, scale-free networks tend to decrease and small-world networks tend to increase Ne compared to census numbers. For one-to-many transmission and different network structures, effective size and cultural diversity are closely associated. For connectedness, however, even small amounts of migration and cultural exchange result in high diversity independently of Ne. Our results highlight the importance of carefully defining effective population size for cultural systems and show that inferring Ne requires detailed knowledge about underlying cultural and demographic processes.

Author summary: Human populations show immense cultural diversity and researchers have regarded population size as an important driver of cultural variation and complexity. Our approach is based on cultural evolutionary theory which applies ideas about evolution to understand how cultural traits change over time. We employ insights from population genetics about the “effective” size of a population (i.e. the size that matters for important evolutionary outcomes) to understand how and when larger populations can be expected to be more culturally diverse. Specifically, we provide a formal derivation for cultural effective population size and use mathematical and computational models to study how effective size and cultural diversity depend on (1) the way culture is transmitted, (2) levels of migration and cultural exchange, as well as (3) social network structure. Our results highlight the importance of effective sizes for cultural evolution and provide heuristics for empirical researchers to decide when census numbers could be used as proxies for the theoretically relevant effective numbers and when they should not.


Author(s):  
Hsuan Jung ◽  
Paul Marjoram

In this paper, we develop a Genetic Algorithm that can address the fundamental problem of how one should weight the summary statistics included in an approximate Bayesian computation analysis built around an accept/reject algorithm, and how one might choose the tolerance for that analysis. We then demonstrate that using weighted statistics, and a well-chosen tolerance, in such an approximate Bayesian computation approach can result in improved performance, when compared to unweighted analyses, using one example drawn purely from statistics and two drawn from the estimation of population genetics parameters.
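To make the roles of the weights and the tolerance concrete, here is a minimal sketch of an accept/reject ABC step with a weighted Euclidean distance between summary statistics. It does not include the Genetic Algorithm described above, which is what would choose the weights and tolerance; here they are simply passed in. All names (`weighted_distance`, `abc_reject`) and the toy Gaussian model in the demo are assumptions made for this example.

```python
import random
import statistics
from typing import Callable, List, Sequence

Summary = Sequence[float]


def weighted_distance(s_sim: Summary, s_obs: Summary, weights: Sequence[float]) -> float:
    """Weighted Euclidean distance between simulated and observed summaries."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, s_sim, s_obs)) ** 0.5


def abc_reject(
    s_obs: Summary,
    prior_sample: Callable[[random.Random], float],
    simulate: Callable[[float, random.Random], Summary],
    weights: Sequence[float],
    tolerance: float,
    n_draws: int,
    rng: random.Random,
) -> List[float]:
    """Keep the prior draws whose simulated summaries fall within the tolerance."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        s_sim = simulate(theta, rng)
        if weighted_distance(s_sim, s_obs, weights) <= tolerance:
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    # Toy model (hypothetical): infer the mean of a unit-variance Gaussian
    # from two summaries, the sample mean and the sample median.
    def simulate(theta: float, rng: random.Random) -> Summary:
        data = [rng.gauss(theta, 1.0) for _ in range(20)]
        return (statistics.mean(data), statistics.median(data))

    posterior = abc_reject(
        s_obs=(1.0, 1.0),
        prior_sample=lambda rng: rng.uniform(-5.0, 5.0),
        simulate=simulate,
        weights=(1.0, 0.5),
        tolerance=0.3,
        n_draws=5000,
        rng=random.Random(1),
    )
    print(len(posterior), "draws accepted")
```

Setting a weight to zero removes the corresponding statistic from the comparison, and shrinking the tolerance trades a lower acceptance rate for a closer match to the observed summaries.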


PLoS Genetics ◽  
2016 ◽  
Vol 12 (3) ◽  
pp. e1005877 ◽  
Author(s):  
Simon Boitard ◽  
Willy Rodríguez ◽  
Flora Jay ◽  
Stefano Mona ◽  
Frédéric Austerlitz

PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3530 ◽  
Author(s):  
Miguel Navascués ◽  
Raphaël Leblois ◽  
Concetta Burgarella

The skyline plot is a graphical representation of historical effective population sizes as a function of time. Past population sizes for these plots are estimated from genetic data, without a priori assumptions on the mathematical function defining the shape of the demographic trajectory. Because of this flexibility in shape, skyline plots can, in principle, provide realistic descriptions of the complex demographic scenarios that occur in natural populations. Currently, the demographic estimates needed for skyline plots are obtained using coalescent samplers or a composite likelihood approach. Here, we provide a way to estimate historical effective population sizes using an Approximate Bayesian Computation (ABC) framework. We assess its performance using simulated and actual microsatellite datasets. Our method correctly retrieves the signal of contracting, constant and expanding populations, although the graphical shape of the plot is not always an accurate representation of the true demographic trajectory, particularly for recent changes in size and contracting populations. Because of the flexibility of ABC, similar approaches can be extended to other types of data, to multiple populations, or to other parameters that can change through time, such as the migration rate.
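A skyline-style trajectory can be represented as a step function of time. The sketch below is one simple way to encode and evaluate such a function, assuming piecewise-constant sizes between change points; the name `skyline_ne` and the parameterization (times in generations before present) are illustrative choices, not the method of the paper.

```python
from bisect import bisect_right
from typing import Sequence


def skyline_ne(t: float, change_times: Sequence[float], sizes: Sequence[float]) -> float:
    """Effective population size at time t (generations before present).

    change_times must be sorted in increasing order. sizes[0] applies from
    the present up to change_times[0], sizes[k] applies from change_times[k-1]
    (inclusive) up to change_times[k], and sizes[-1] applies to all older times.
    """
    if len(sizes) != len(change_times) + 1:
        raise ValueError("sizes must have exactly one more entry than change_times")
    if t < 0:
        raise ValueError("t must be non-negative")
    # bisect_right counts the change points at or before t, which is the
    # index of the interval that contains t.
    return sizes[bisect_right(change_times, t)]
```

For example, `skyline_ne(t, [100, 1000], [500, 2000, 10000])` describes a population that was large in the distant past and declined in two steps toward the present.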

