Going down the rabbit hole: a review on methods characterizing selection and demography in natural populations

2016 ◽  
Author(s):  
Yann Bourgeois ◽  
Khaled Michel Hazzouri ◽  
Ben Warren

1. Characterizing species history and identifying loci underlying local adaptation is crucial in functional ecology, evolutionary biology, conservation and agronomy. The ongoing and constant improvement of next-generation sequencing (NGS) techniques has facilitated the production of an ever-increasing number of genetic markers across genomes of non-model species. 2. The study of variation in these markers across natural populations has deepened the understanding of how population history and selection act on genomes. Population genomics now provides tools to better integrate selection into a historical framework, and take into account selection when reconstructing demographic history. However, this improvement has come with a burst of analytical tools that can confuse users. 3. Such confusion can limit the amount of information effectively retrieved from complex genomic datasets. In addition, the lack of a unified analytical pipeline impairs the diffusion of the most recent analytical tools into fields like conservation biology. 4. To address this need, we describe possible analytical protocols and link these with more than 70 methods dealing with genome-scale datasets. We summarise the strategies they use to infer demographic history and selection, and discuss some of their limitations. A website listing these methods is available at www.methodspopgen.com.

2018 ◽  
Author(s):  
Bhavin S. Khatri ◽  
Austin Burt

Estimating recent effective population size is of great importance in characterising and predicting the evolution of natural populations. Methods based on nucleotide diversity may underestimate current day effective population sizes due to historical bottlenecks, whilst methods that reconstruct demographic history typically only detect long-term variations. However, soft selective sweeps, which leave a fingerprint of mutational history by recurrent mutations on independent haplotype backgrounds, hold promise of an estimate more representative of recent population history. Here we present a simple and robust method of estimation based only on knowledge of the number of independent recurrent origins and the current frequency of the beneficial allele in a population sample, independent of the strength of selection and age of the mutation. Using a forward time theoretical framework, we show the mean number of origins is a function of θ = 2Nμ and current allele frequency, through a simple equation, and the distribution is approximately Poisson. This estimate is robust to whether mutants pre-existed before selection arose, and is equally accurate for diploid populations with incomplete dominance. For fast (e.g., seasonal) demographic changes compared to the time scale for fixation of the mutant allele, and for moderate peak-to-trough ratios, we show our constant population size estimate can be used to bound the maximum and minimum population size. Applied to the Vgsc gene of Anopheles gambiae, we estimate an effective population size of roughly 6 × 10⁷, and including seasonal demographic oscillations, a minimum effective population size greater than 6 × 10⁶ and a maximum less than 3 × 10⁹.


2019 ◽  
Vol 36 (9) ◽  
pp. 2040-2052 ◽  
Author(s):  
Bhavin S Khatri ◽  
Austin Burt

Abstract Estimating recent effective population size is of great importance in characterizing and predicting the evolution of natural populations. Methods based on nucleotide diversity may underestimate current day effective population sizes due to historical bottlenecks, whereas methods that reconstruct demographic history typically only detect long-term variations. However, soft selective sweeps, which leave a fingerprint of mutational history by recurrent mutations on independent haplotype backgrounds, hold promise of an estimate more representative of recent population history. Here, we present a simple and robust method of estimation based only on knowledge of the number of independent recurrent origins and the current frequency of the beneficial allele in a population sample, independent of the strength of selection and age of the mutation. Using a forward-time theoretical framework, we show the mean number of origins is a function of θ = 2Nμ and current allele frequency, through a simple equation, and the distribution is approximately Poisson. This estimate is robust to whether mutants preexisted before selection arose and is equally accurate for diploid populations with incomplete dominance. For fast (e.g., seasonal) demographic changes compared with the time scale for fixation of the mutant allele, and for moderate peak-to-trough ratios, we show our constant population size estimate can be used to bound the maximum and minimum population size. Applied to the Vgsc gene of Anopheles gambiae, we estimate an effective population size of roughly 6 × 10⁷, and including seasonal demographic oscillations, a minimum effective population size >3 × 10⁷, and a maximum <6 × 10⁹, suggesting a mean ∼10⁹.


2018 ◽  
Author(s):  
John A. Kamm ◽  
Jonathan Terhorst ◽  
Richard Durbin ◽  
Yun S. Song

Abstract The sample frequency spectrum (SFS), or histogram of allele counts, is an important summary statistic in evolutionary biology, and is often used to infer the history of population size changes, migrations, and other demographic events affecting a set of populations. The expected multipopulation SFS under a given demographic model can be efficiently computed when the populations in the model are related by a tree, scaling to hundreds of populations. Admixture, back-migration, and introgression are common natural processes that violate the assumption of a tree-like population history, however, and until now the expected SFS could be computed for only a handful of populations when the demographic history is not a tree. In this article, we present a new method for efficiently computing the expected SFS and linear functionals of it, for demographies described by general directed acyclic graphs. This method can scale to more populations than previously possible for complex demographic histories including admixture. We apply our method to an 8-population SFS to estimate the timing and strength of a proposed “basal Eurasian” admixture event in human history. We implement and release our method in a new open-source software package momi2.
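The abstract above defines the SFS as a histogram of allele counts. A minimal sketch of that definition for a single population follows, assuming phased haploid samples coded 0 (ancestral) and 1 (derived). The function name and the toy data are illustrative only and are not taken from momi2, which computes the expected SFS under a demographic model rather than the observed one.

```python
from collections import Counter

def sample_frequency_spectrum(sites):
    """Observed single-population SFS.

    sites: list of sites, each a list of 0/1 alleles (1 = derived)
           across the same n haploid samples (an assumed encoding).
    Returns a list of length n + 1 whose entry k is the number of
    sites at which exactly k samples carry the derived allele.
    """
    n = len(sites[0])
    counts = Counter(sum(site) for site in sites)
    return [counts.get(k, 0) for k in range(n + 1)]

# Toy data: 4 sites scored in 4 haploid samples.
toy_sites = [
    [0, 0, 1, 0],  # derived allele count 1
    [1, 1, 0, 0],  # derived allele count 2
    [0, 1, 0, 0],  # derived allele count 1
    [1, 1, 1, 0],  # derived allele count 3
]
toy_sfs = sample_frequency_spectrum(toy_sites)
print(toy_sfs)
```

Entries 0 and n count sites that are monomorphic in the sample; many analyses drop them and keep only the segregating classes 1 to n - 1.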


2018 ◽  
Author(s):  
Aaron P. Ragsdale ◽  
Simon Gravel

Abstract We learn about population history and underlying evolutionary biology through patterns of genetic polymorphism. Many approaches to reconstruct evolutionary histories focus on a limited number of informative statistics describing distributions of allele frequencies or patterns of linkage disequilibrium. We show that many commonly used statistics are part of a broad family of two-locus moments whose expectation can be computed jointly and rapidly under a wide range of scenarios, including complex multi-population demographies with continuous migration and admixture events. A full inspection of these statistics reveals that widely used models of human history fail to predict simple patterns of linkage disequilibrium. To jointly capture the information contained in classical and novel statistics, we implemented a tractable likelihood-based inference framework for demographic history. Using this approach, we show that human evolutionary models that include archaic admixture in Africa, Asia, and Europe provide a much better description of patterns of genetic diversity across the human genome. We estimate that an unidentified, deeply diverged population admixed with modern humans within Africa both before and after the split of African and Eurasian populations, contributing 4-8% genetic ancestry to individuals in world-wide populations.
Author Summary Throughout human history, populations have expanded and contracted, split and merged, and exchanged migrants. Because these events affected genetic diversity, we can learn about human history by comparing predictions from evolutionary models to genetic data. Here, we show how to rapidly compute such predictions for a wide range of diversity measures within and across populations under complex demographic scenarios. While widely used models of human history accurately predict common measures of diversity, we show that they strongly underestimate the co-occurrence of low frequency mutations within human populations in Asia, Europe, and Africa. Models allowing for archaic admixture, the relatively recent mixing of human populations with deeply diverged human lineages, resolve this discrepancy. We use such models to infer demographic models that include both recent and ancient features of human history. We recover the well-characterized admixture of Neanderthals in Eurasian populations, as well as admixture from an as-yet unknown diverged human population within Africa, further suggesting that admixture with deeply diverged lineages occurred multiple times in human history. By simultaneously testing model predictions for a broad range of diversity statistics, we can assess the robustness of common evolutionary models, identify missing historical events, and build more informed models of human demography.


mBio ◽  
2018 ◽  
Vol 9 (3) ◽  
pp. e00381-18 ◽  
Author(s):  
Ousmane H. Cissé ◽  
Liang Ma ◽  
Da Wei Huang ◽  
Pavel P. Khil ◽  
John P. Dekker ◽  
...  

ABSTRACT Pneumocystis species are opportunistic mammalian pathogens that cause severe pneumonia in immunocompromised individuals. These fungi are highly host specific and uncultivable in vitro. Human Pneumocystis infections present major challenges because of a limited therapeutic arsenal and the rise of drug resistance. To investigate the diversity and demographic history of natural populations of Pneumocystis infecting humans, rats, and mice, we performed whole-genome and large-scale multilocus sequencing of infected tissues collected in various geographic locations. Here, we detected reduced levels of recombination and variations in historical demography, which shape the global population structures. We report estimates of evolutionary rates, levels of genetic diversity, and population sizes. Molecular clock estimates indicate that Pneumocystis species diverged before their hosts, while the asynchronous timing of population declines suggests host shifts. Our results have uncovered complex patterns of genetic variation influenced by multiple factors that shaped the adaptation of Pneumocystis populations during their spread across mammals.
IMPORTANCE Understanding how natural pathogen populations evolve and identifying the determinants of genetic variation are central issues in evolutionary biology. Pneumocystis, a fungal pathogen which infects mammals exclusively, provides opportunities to explore these issues. In humans, Pneumocystis can cause a life-threatening pneumonia in immunosuppressed individuals. In analysis of different Pneumocystis species infecting humans, rats, and mice, we found that there are high infection rates and that natural populations maintain a high level of genetic variation despite low levels of recombination. We found no evidence of population structuring by geography. Our comparisons of the times of divergence of these species to their respective hosts suggest that Pneumocystis may have undergone recent host shifts. The results demonstrate that Pneumocystis strains are widely disseminated geographically and provide a new understanding of the evolution of these pathogens.


Forests ◽  
2021 ◽  
Vol 12 (9) ◽  
pp. 1164
Author(s):  
Xiao-Dan Chen ◽  
Xiao Zhang ◽  
Hao Zhang ◽  
Tao Zhou ◽  
Yue-Mei Zhao ◽  
...  

Knowledge of interspecific divergence and population expansions/contractions of dominant forest trees in response to geological events and climatic oscillations is of major importance to understand their evolution and demography. However, the interspecific patterns of genetic differentiation and spatiotemporal population dynamics of three deciduous Cerris oak species (Q. acutissima, Q. variabilis and Q. chenii) that are widely distributed in China remain poorly understood. In this study, we genotyped 16 nuclear loci in 759 individuals sampled from 44 natural populations of these three sibling species to evaluate the plausible demographical scenarios of the closely related species. We also tested the hypothesis that macro- and microevolutionary processes of the three species had been triggered and molded by Miocene–Pliocene geological events and Quaternary climatic change. The Bayesian cluster analysis showed that Q. acutissima and Q. chenii were clustered in the same group, whereas Q. variabilis formed a different genetic cluster. Approximate Bayesian computation (ABC) analyses suggested that Q. variabilis and Q. acutissima diverged from their most common ancestor around 19.84 Ma, and subsequently Q. chenii diverged from Q. acutissima at about 9.6 Ma, which was significantly associated with the episodes of the Qinghai–Tibetan Plateau (QTP). In addition, ecological niche modeling and population history analysis showed that these three Cerris oak species repeatedly underwent considerable ‘expansion–contraction’ during the interglacial and glacial periods of the Pleistocene, although they have varying degrees of tolerance for the climatic change. Overall, these findings indicated geological and climatic changes during the Miocene–Pliocene and Pleistocene as causes of species divergence and range shifts of dominant tree species in the subtropical and warm temperate areas in China.


2021 ◽  
Vol 9 ◽  
Author(s):  
Xun Xu ◽  
Bao-Sheng Wang ◽  
Hui Yu

Understanding how intraspecies divergence results in speciation has great importance for our knowledge of evolutionary biology. Here we applied population genomics approaches to a fig wasp species (Valisia javana complex sp 1) to reveal its intraspecies differentiation and the underlying evolutionary dynamics. With re-sequencing data, we show that the Hainan Island population (DA) of sp1 differs genetically from the continental ones, and then reveal its distinct divergence pattern. DA has reduced SNP diversity but a higher proportion of population-specific structural variations (SVs), implying a restricted gene exchange. Based on SNPs, 32 differentiated islands containing 204 genes were detected, along with 1,532 population-specific SVs of DA overlapping 4,141 genes. The gene ontology (GO) enrichment analysis performed on differentiated islands linked them to three significant GO terms on a basic metabolism process, with most of the genes failing to enrich. In contrast, population-specific SVs contributed more to the adaptation than the SNPs by linking to 59 terms that are crucial for wasp speciation, such as host reorganization and development regulation. In addition, the generalized dissimilarity modeling confirms the importance of environmental difference for the genetic divergence within sp1. Hence, we assume that the genetic divergence between DA and the continent is due not only to the strait as a geographic barrier, but also to adaptation. We reconstruct the demographic history within sp1. DA shares a similar population history with the nearby continental population, suggesting an incomplete divergence. In summary, our results reveal how geographic barriers and adaptation both influence genetic divergence at the population level, thereby increasing our knowledge of the potential speciation of non-model organisms.


Genetics ◽  
2000 ◽  
Vol 155 (3) ◽  
pp. 1429-1437
Author(s):  
Oliver G Pybus ◽  
Andrew Rambaut ◽  
Paul H Harvey

Abstract We describe a unified set of methods for the inference of demographic history using genealogies reconstructed from gene sequence data. We introduce the skyline plot, a graphical, nonparametric estimate of demographic history. We discuss both maximum-likelihood parameter estimation and demographic hypothesis testing. Simulations are carried out to investigate the statistical properties of maximum-likelihood estimates of demographic parameters. The simulations reveal that (i) the performance of exponential growth model estimates is determined by a simple function of the true parameter values and (ii) under some conditions, estimates from reconstructed trees perform as well as estimates from perfect trees. We apply our methods to HIV-1 sequence data and find strong evidence that subtypes A and B have different demographic histories. We also provide the first (albeit tentative) genetic evidence for a recent decrease in the growth rate of subtype B.
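The skyline plot introduced above can be illustrated with the classic per-interval estimator. Under the standard coalescent, the expected waiting time while k lineages remain is N / (k(k-1)/2), so an interval of observed length w suggests the estimate w · k(k-1)/2. The sketch below assumes the inter-coalescent interval lengths have already been read off a genealogy; the function name and the input values are hypothetical, and the estimate is in the same units as the interval lengths (for a tree scaled in substitutions per site it is a population size scaled by the mutation rate).

```python
def classic_skyline(intervals):
    """Classic skyline estimate for each inter-coalescent interval.

    intervals: list of (k, w) pairs, where k is the number of lineages
               present during the interval and w is its length
               (assumed to be extracted from a genealogy beforehand).
    Returns a list of (k, estimate) pairs with
    estimate = w * k * (k - 1) / 2.
    """
    estimates = []
    for k, w in intervals:
        if k < 2:
            raise ValueError("each interval needs at least 2 lineages")
        estimates.append((k, w * k * (k - 1) / 2))
    return estimates

# Hypothetical interval lengths for a 4-tip genealogy, listed from
# the tips (4 lineages) back to the root (2 lineages).
toy_intervals = [(4, 0.5), (3, 1.0), (2, 4.0)]
toy_skyline = classic_skyline(toy_intervals)
print(toy_skyline)
```

Plotting the estimates as a step function against cumulative time gives the skyline shape; each step rests on a single interval, so individual steps are noisy.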


Genes ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 258
Author(s):  
Karim Karimi ◽  
Duy Ngoc Do ◽  
Mehdi Sargolzaei ◽  
Younes Miar

Characterizing the genetic structure and population history can facilitate the development of genomic breeding strategies for the American mink. In this study, we used the whole genome sequences of 100 mink from the Canadian Centre for Fur Animal Research (CCFAR) at the Dalhousie Faculty of Agriculture (Truro, NS, Canada) and Millbank Fur Farm (Rockwood, ON, Canada) to investigate their population structure, genetic diversity and linkage disequilibrium (LD) patterns. Analysis of molecular variance (AMOVA) indicated that the variation among color-types was significant (p < 0.001) and accounted for 18% of the total variation. The admixture analysis revealed that assuming three ancestral populations (K = 3) provided the lowest cross-validation error (0.49). The effective population size (Ne) at five generations ago was estimated to be 99 and 50 for CCFAR and Millbank Fur Farm, respectively. The LD patterns revealed that the average r² reduced to <0.2 at genomic distances of >20 kb and >100 kb in CCFAR and Millbank Fur Farm, respectively, suggesting that densities of 120,000 and 24,000 single nucleotide polymorphisms (SNPs) would provide adequate accuracy of genomic evaluation in these two populations. These results indicated that accounting for admixture is critical for designing the SNP panels for genotype-phenotype association studies of American mink.
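The LD measure r² reported above can be sketched for one pair of biallelic SNPs. This example assumes phased haplotypes coded 0/1, which may differ from how the study computed r² from its sequence data; the function name and the toy haplotypes are illustrative. It uses the standard definitions D = p_AB − p_A·p_B and r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)).

```python
def r_squared(hap_a, hap_b):
    """Pairwise LD (r^2) between two biallelic loci.

    hap_a, hap_b: equal-length lists of 0/1 alleles, one entry per
                  phased haplotype (an assumed input format).
    """
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                 # frequency of allele 1 at locus A
    p_b = sum(hap_b) / n                 # frequency of allele 1 at locus B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n  # 1-1 haplotype frequency
    d = p_ab - p_a * p_b                 # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom

linked = r_squared([0, 0, 1, 1], [0, 0, 1, 1])       # identical loci
independent = r_squared([0, 0, 1, 1], [0, 1, 0, 1])  # no association
print(linked, independent)
```

Averaging such pairwise values within bins of physical distance gives an LD decay curve of the kind summarized in the abstract.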


2010 ◽  
Vol 60 (4) ◽  
pp. 449-465
Author(s):  
Wen Longying ◽  
Zhang Lixun ◽  
An Bei ◽  
Luo Huaxing ◽  
Liu Naifa ◽  
...  

Abstract We have used phylogeographic methods to investigate the genetic structure and population history of the endangered Himalayan snowcock (Tetraogallus himalayensis) in northwestern China. The mitochondrial cytochrome b gene was sequenced in 102 individuals sampled throughout the distribution range. In total, we found 26 different haplotypes defined by 28 polymorphic sites. Phylogenetic analyses indicated that the samples were divided into two major haplogroups corresponding to one western and one eastern clade. The divergence time between these major clades was estimated to be approximately one million years. An analysis of molecular variance showed that 40% of the total genetic variability was found within local populations, 12% among populations within regional groups and 48% among groups. An analysis of the demographic history of the populations suggested that major expansions have occurred in the Himalayan snowcock populations and these correlate mainly with the first and the second largest glaciations during the Pleistocene. In addition, the data indicate that there was a population expansion of the Tianshan population during the uplift of the Qinghai-Tibet Plateau, approximately 2 million years ago.

