A Spatial Framework for Understanding Population Structure and Admixture.

10.1101/013474 ◽

2015 ◽

Cited By ~ 2

Author(s):

Gideon Bradburd ◽

Peter L. Ralph ◽

Graham Coop

Keyword(s):

Population Structure ◽

Gene Flow ◽

Human Populations ◽

Geographic Proximity ◽

Genetic Covariance ◽

Geographic Patterns ◽

Genome Wide ◽

Limited Dispersal ◽

Phylloscopus Trochiloides ◽

Ring Species

Geographic patterns of genetic variation within modern populations, produced by complex histories of migration, can be difficult to infer and visually summarize. A general consequence of geographically limited dispersal is that samples from nearby locations tend to be more closely related than samples from distant locations, and so genetic covariance often recapitulates geographic proximity. We use genome-wide polymorphism data to build "geogenetic maps," which, when applied to stationary populations, produce a map of the geographic positions of the populations, but with distances distorted to reflect historical rates of gene flow. In the underlying model, allele frequency covariance is a decreasing function of geogenetic distance, and nonlocal gene flow such as admixture can be identified as anomalously strong covariance over long distances. This admixture is explicitly co-estimated and depicted as arrows, from the source of admixture to the recipient, on the geogenetic map. We demonstrate the utility of this method on a circum-Tibetan sampling of the greenish warbler (Phylloscopus trochiloides), in which we find evidence for gene flow between the adjacent, terminal populations of the ring species. We also analyze a global sampling of human populations, for which we largely recover the geography of the sampling, with support for significant histories of admixture in many samples. This new tool for understanding and visualizing patterns of population structure is implemented in a Bayesian framework in the program SpaceMix.
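The idea that covariance is a decreasing function of geogenetic distance can be illustrated with a minimal sketch. The powered-exponential form and the parameter names `a0`, `a1`, `a2` below are assumptions chosen for illustration; the abstract does not state the functional form used by SpaceMix, and this is not its implementation.

```python
import math

def decay_covariance(coords, a0=1.0, a1=1.0, a2=1.0):
    """Covariance matrix that decays with pairwise distance.

    coords: list of (x, y) positions on a hypothetical geogenetic map.
    a0, a1, a2: illustrative parameters (assumed, not taken from the paper):
        cov(i, j) = exp(-(a1 * d_ij) ** a2) / a0
    """
    n = len(coords)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Euclidean distance between the two map positions
            d = math.hypot(coords[i][0] - coords[j][0],
                           coords[i][1] - coords[j][1])
            cov[i][j] = math.exp(-((a1 * d) ** a2)) / a0
    return cov

# Three populations on a line: two close together, one far away.
cov = decay_covariance([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)])
```

Under such a model, a pair of distant samples whose observed covariance is much higher than `cov[i][j]` predicts is the kind of anomaly that the abstract attributes to admixture.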

A differential expression of pyrethroid resistance genes in the malaria vector Anopheles funestus across Uganda is associated with patterns of gene flow

PLoS ONE ◽

10.1371/journal.pone.0240743 ◽

2020 ◽

Vol 15 (11) ◽

pp. e0240743

Author(s):

Maurice Marcel Sandeu ◽

Charles Mulamba ◽

Gareth D. Weedall ◽

Charles S. Wondji

Keyword(s):

Population Structure ◽

Gene Flow ◽

Genetic Structure ◽

Genetic Differentiation ◽

Insecticide Resistance ◽

Pyrethroid Resistance ◽

Anopheles Funestus ◽

Metabolic Resistance ◽

Transcription Analysis ◽

Genome Wide

Background: Insecticide resistance is challenging the effectiveness of insecticide-based control interventions to reduce malaria burden in Africa. Understanding the molecular basis of insecticide resistance and patterns of gene flow in major malaria vectors such as Anopheles funestus is an important step in designing effective resistance management strategies. Here, we investigated the association between patterns of genetic structure and expression profiles of genes involved in pyrethroid resistance in An. funestus across Uganda and neighboring Kenya. Methods: Blood-fed An. funestus mosquitoes were collected across four localities in Uganda and in neighboring Kenya. A microarray-based genome-wide transcription analysis was performed to identify the set of genes associated with permethrin resistance. Seventeen microsatellite markers were genotyped and used to establish patterns of genetic differentiation. Results: Microarray-based genome-wide transcription profiling of pyrethroid resistance in four locations across Uganda (Arua, Bulambuli, Lira, and Tororo) and Kenya (Kisumu) revealed that resistance was mainly driven by metabolic resistance. The most commonly up-regulated genes in pyrethroid-resistant mosquitoes include cytochrome P450s (CYP9K1, CYP6M7, CYP4H18, CYP4H17, CYP4C36). However, expression levels of key genes vary geographically, such as the P450 CYP6M7 [fold change (FC) = 115.8 (Arua) vs 24.05 (Tororo) and 16.9 (Kisumu)]. In addition, several genes from other families were also over-expressed, including glutathione S-transferases (GSTs), carboxylesterases, trypsin, glycogenin, and nucleotide binding protein, which probably contribute to insecticide resistance across Uganda and Kenya. Genotyping of 17 microsatellite loci in the five locations provided evidence that a geographical shift in the resistance mechanisms could be associated with patterns of population structure throughout East Africa. Genetic and population structure analyses indicated significant genetic differentiation between Arua and other localities (FST > 0.03) and revealed a barrier to gene flow between Arua and other areas, possibly associated with the Rift Valley. Conclusion: The correlation between patterns of genetic structure and variation in gene expression could be used to inform future interventions, especially as new insecticides are gradually introduced.

Signals of polygenic adaptation on height have been overestimated due to uncorrected population structure in genome-wide association studies

10.1101/355057 ◽

2018 ◽

Cited By ~ 19

Author(s):

Mashaal Sohail ◽

Robert M. Maier ◽

Andrea Ganna ◽

Alex Bloemendal ◽

Alicia R. Martin ◽

...

Keyword(s):

Population Structure ◽

Association Studies ◽

Meta Analysis ◽

Human Populations ◽

Genome Wide Association Studies ◽

Multiple Traits ◽

Large Numbers ◽

Genome Wide ◽

Polygenic Adaptation ◽

The UK

Abstract Genetic predictions of height differ among human populations and these differences are too large to be explained by genetic drift. This observation has been interpreted as evidence of polygenic adaptation. Differences across populations were detected using SNPs genome-wide significantly associated with height, and many studies also found that the signals grew stronger when large numbers of subsignificant SNPs were analyzed. This has led to excitement about the prospect of analyzing large fractions of the genome to detect subtle signals of selection and claims of polygenic adaptation for multiple traits. Polygenic adaptation studies of height have been based on SNP effect size measurements in the GIANT Consortium meta-analysis. Here we repeat the height analyses in the UK Biobank, a much more homogeneously designed study. Our results show that polygenic adaptation signals based on large numbers of SNPs below genome-wide significance are extremely sensitive to biases due to uncorrected population structure.

Tracking human population structure through time from whole genome sequences

10.1101/585265 ◽

2019 ◽

Cited By ~ 4

Author(s):

Ke Wang ◽

Iain Mathieson ◽

Jared O’Connell ◽

Stephan Schiffels

Keyword(s):

Population Structure ◽

Gene Flow ◽

Demographic History ◽

Time Dependent ◽

Human Populations ◽

Whole Genome ◽

Genome Sequences ◽

Migration Model ◽

Novel Approach ◽

Different Populations

Abstract The genetic diversity of humans, like that of many species, has been shaped by a complex pattern of population separations followed by isolation and subsequent admixture. This pattern, reaching at least as far back as the appearance of our species in the paleontological record, has left its traces in our genomes. Reconstructing a population’s history from these traces is a challenging problem. Here we present a novel approach based on the Multiple Sequentially Markovian Coalescent (MSMC) to analyse the population separation history. Our approach, called MSMC-IM, uses an improved implementation of the MSMC (MSMC2) to estimate coalescence rates within and across pairs of populations, and then fits a continuous Isolation-Migration model to these rates to obtain a time-dependent estimate of gene flow. We show, using simulations, that our method can identify complex demographic scenarios involving post-split admixture or archaic introgression. We apply MSMC-IM to whole genome sequences from 15 worldwide populations, tracking the process of human genetic diversification. We detect traces of extremely deep ancestry between some African populations, with around 1% of ancestry dating to divergences older than a million years ago.

Author Summary Human demographic history is reflected in specific patterns of shared mutations between the genomes from different populations. Here we aim to unravel this pattern to infer population structure through time with a new approach, called MSMC-IM. Based on estimates of coalescence rates within and across populations, MSMC-IM fits a time-dependent migration model to the pairwise rate of coalescences. We implemented this approach as an extension to existing software (MSMC2), and tested it with simulations exhibiting different histories of admixture and gene flow. We then applied it to the genomes from 15 worldwide populations to reveal their pairwise separation history ranging from a few thousand up to several million years ago. Among other results, we find evidence for remarkably deep population structure in some African population pairs, suggesting that deep ancestry dating to one million years ago and older is still present in human populations in small amounts today.

Gene Flow and Natural Selection in Oceanic Human Populations Inferred from Genome-Wide SNP Typing

Molecular Biology and Evolution ◽

10.1093/molbev/msn128 ◽

2008 ◽

Vol 25 (8) ◽

pp. 1750-1761 ◽

Cited By ~ 35

Author(s):

R. Kimura ◽

J. Ohashi ◽

Y. Matsumura ◽

M. Nakazawa ◽

T. Inaoka ◽

...

Keyword(s):

Gene Flow ◽

Natural Selection ◽

Human Populations ◽

SNP Typing ◽

Genome Wide

Analysis of population structure and genetic diversity reveals gene flow and geographic patterns in cultivated rice (O. sativa and O. glaberrima) in West Africa

Euphytica ◽

10.1007/s10681-018-2285-1 ◽

2018 ◽

Vol 214 (11) ◽

Cited By ~ 4

Author(s):

Octaviano Igor Yelome ◽

Kris Audenaert ◽

Sofie Landschoot ◽

Alexandre Dansi ◽

Wouter Vanhove ◽

...

Keyword(s):

Genetic Diversity ◽

Population Structure ◽

Gene Flow ◽

West Africa ◽

Cultivated Rice ◽

Geographic Patterns

The counteracting effects of demography on functional genomic variation: the Roma paradigm

Molecular Biology and Evolution ◽

10.1093/molbev/msab070 ◽

2021 ◽

Author(s):

Neus Font-Porterias ◽

Rocio Caro-Consuegra ◽

Marcel Lucas-Sánchez ◽

Marie Lopez ◽

Aaron Giménez ◽

...

Keyword(s):

Gene Flow ◽

Rare Variants ◽

Demographic History ◽

Genomic Variation ◽

Human Populations ◽

History Plays ◽

High Coverage ◽

Roma Population ◽

Genome Wide ◽

Whole Exome

Abstract Demographic history plays a major role in shaping the distribution of genomic variation. Yet the interaction between different demographic forces and their effects in the genomes is not fully resolved in human populations. Here we focus on the Roma population, the largest transnational ethnic minority in Europe. They have a South Asian origin and their demographic history is characterized by recent dispersals, multiple founder events and extensive gene flow from non-Roma groups. Through the analyses of new high-coverage whole exome sequences and genome-wide array data for 89 Iberian Roma individuals together with forward simulations, we show that founder effects have reduced their genetic diversity and proportion of rare variants, gene flow has counteracted the increase in mutational load, runs of homozygosity show ancestry-specific patterns of accumulation of deleterious homozygotes, and selection signals primarily derive from pre-admixture adaptation in the Roma population sources. The present study shows how two demographic forces, bottlenecks and admixture, act in opposite directions and have long-term balancing effects on the Roma genomes. Understanding how demography and gene flow shape the genome of an admixed population provides an opportunity to elucidate how genomic variation is modelled in human populations.

False discovery rate control in genome-wide association studies with population structure

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2105841118 ◽

2021 ◽

Vol 118 (40) ◽

pp. e2105841118

Author(s):

Matteo Sesia ◽

Stephen Bates ◽

Emmanuel Candès ◽

Jonathan Marchini ◽

Chiara Sabatti

Keyword(s):

Population Structure ◽

False Discovery Rate ◽

Association Studies ◽

Genome Wide Association ◽

Human Populations ◽

Genome Wide Association Studies ◽

False Discovery ◽

Genome Wide ◽

The UK ◽

Negative Controls

We present a comprehensive statistical framework to analyze data from genome-wide association studies of polygenic traits, producing interpretable findings while controlling the false discovery rate. In contrast with standard approaches, our method can leverage sophisticated multivariate algorithms but makes no parametric assumptions about the unknown relation between genotypes and phenotype. Instead, we recognize that genotypes can be considered as a random sample from an appropriate model, encapsulating our knowledge of genetic inheritance and human populations. This allows the generation of imperfect copies (knockoffs) of these variables that serve as ideal negative controls, correcting for linkage disequilibrium and accounting for unknown population structure, which may be due to diverse ancestries or familial relatedness. The validity and effectiveness of our method are demonstrated by extensive simulations and by applications to the UK Biobank data. These analyses confirm our method is powerful relative to state-of-the-art alternatives, while comparisons with other studies validate most of our discoveries. Finally, fast software is made available for researchers to analyze Biobank-scale datasets.
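To make the role of knockoffs as negative controls concrete, here is a minimal sketch of the generic knockoff+ selection step, which turns per-variable statistics into a selection at a target false discovery rate. It assumes each statistic `w[j]` is positive when the real variable looks more important than its knockoff copy and negative otherwise. The function name and the example values are hypothetical, and this is not the software released with the paper.

```python
def knockoff_plus_select(w, q=0.1):
    """Select variables with the knockoff+ threshold.

    w: list of knockoff statistics, one per variable (assumed input).
    q: target false discovery rate.
    Returns the indices j with w[j] >= T, where T is the smallest t with
        (1 + #{j: w[j] <= -t}) / max(1, #{j: w[j] >= t}) <= q.
    """
    # Candidate thresholds are the distinct nonzero magnitudes of w.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)  # knockoffs "winning"
        positives = sum(1 for x in w if x >= t)   # real variables "winning"
        if (1 + negatives) / max(1, positives) <= q:
            return [j for j, x in enumerate(w) if x >= t]
    return []  # no threshold meets the target rate

# Hypothetical statistics for ten variables.
selected = knockoff_plus_select(
    [5, 4, 3, 2.5, 2, 1.5, -1, 0.5, -0.2, 3.5], q=0.3)
```

The count of large negative statistics estimates how many large positive ones are false, which is why the knockoff copies can stand in for negative controls.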

Genome-wide assessment of genetic diversity and population structure in Magnolia odoratissima based on SLAF-Seq

10.21203/rs.3.rs-747821/v1 ◽

2021 ◽

Author(s):

Tao Zhang ◽

Xue Li ◽

Shuilian He

Keyword(s):

Genetic Diversity ◽

Population Structure ◽

Gene Flow ◽

Natural Populations ◽

Environmental Parameters ◽

Candidate SNPs ◽

Small Populations ◽

Nucleotide Polymorphisms ◽

High Genetic Diversity ◽

Genome Wide

Abstract Magnolia odoratissima is a highly threatened species with small populations and scattered distribution due to habitat fragmentation and human activity. The species is recognized as a Plant Species with Extremely Small Populations (PSESP) and is endemic to China. In the current study, the population structure and levels of genetic diversity of M. odoratissima in the five remaining natural populations and three cultivated populations were evaluated using single nucleotide polymorphisms (SNPs) derived from Specific-Locus Amplified Fragment Sequencing (SLAF-seq). A total of 180,650 SNP loci were found in seventy M. odoratissima individuals. The genome-wide Nei’s and Shannon’s nucleotide diversity indexes of the total M. odoratissima population were 0.3035 and 0.4695, respectively. The observed heterozygosity (Ho) and expected heterozygosity (He) were 0.1122 and 0.3011, respectively. Our results suggest that M. odoratissima has relatively high genetic diversity at the genomic level. FST and AMOVA indicated that high genetic differentiation existed among populations. A phylogenetic neighbor-joining tree, Bayesian model–based clustering and principal components analysis (PCA) all divided the studied M. odoratissima individuals into three distinct clusters. The Treemix analysis showed that there was low gene flow among the natural populations and some gene flow from the wild populations to the cultivated populations (LS to KIB, and GN to JD). In addition, a total of 36 unique SNPs were detected as being significantly associated with environmental parameters (altitude, temperature and precipitation). These candidate SNPs were found to be involved in multiple pathways, including several molecular functions and biological processes, suggesting they may play key roles in environmental adaptation. Our results suggest that three distinct evolutionarily significant units (ESUs) should be set up to conserve this critically endangered species.
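For readers unfamiliar with the Ho and He statistics reported above, the following sketch shows how they are commonly computed at a single biallelic SNP. The genotype encoding (0, 1 or 2 copies of the alternate allele per diploid individual) and the function name are assumptions for illustration; the abstract does not describe the study's own pipeline, and no small-sample correction is applied here.

```python
def heterozygosity(genotypes):
    """Observed and expected heterozygosity at one biallelic locus.

    genotypes: list of alternate-allele counts (0, 1 or 2), one per
               diploid individual (assumed encoding).
    Returns (Ho, He) where
        Ho = fraction of heterozygous individuals,
        He = 2 * p * (1 - p) under Hardy-Weinberg proportions,
    and p is the alternate-allele frequency.
    """
    n = len(genotypes)
    ho = sum(1 for g in genotypes if g == 1) / n
    p = sum(genotypes) / (2 * n)
    he = 2 * p * (1 - p)
    return ho, he

# Same allele frequency, different genotype makeup:
mixed = heterozygosity([0, 1, 1, 2])      # heterozygotes present
inbred = heterozygosity([0, 0, 2, 2])     # only homozygotes
```

Genome-wide values are typically averages of such per-locus quantities. An Ho well below He, as in the figures reported above, indicates fewer heterozygotes than random mating would predict.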

Whole genome sequencing of 56 Mimulus individuals illustrates population structure and local selection

10.1101/031575 ◽

2015 ◽

Cited By ~ 1

Author(s):

Joshua Robert Puzey ◽

John H Willis ◽

John K Kelly

Keyword(s):

Gene Flow ◽

Salt Spray ◽

Whole Genome ◽

Structural Polymorphism ◽

Data Types ◽

Multiple Data ◽

Genome Wide ◽

A Genome ◽

Local Selection ◽

Limited Dispersal

Across western North America, Mimulus guttatus exists as many local populations adapted to site-specific challenges including salt spray, temperature, water availability, and soil chemistry. Gene flow between locally adapted populations will affect genetic diversity both in local demes and across the larger meta-population. A single population of annual M. guttatus from Iron Mountain, Oregon (IM) has been extensively studied, and here we build on this research by analyzing whole genome sequences from 34 inbred lines from IM in conjunction with sequences from 22 Mimulus individuals from across the geographic range. Three striking features of these data address hypotheses about migration and selection in a locally adapted population. First, we find very high intra-population polymorphism (synonymous π = 0.033). Variation outside genes may be even higher, but is difficult to estimate because excessive divergence affects read mapping. Second, IM exhibits a significantly positive genome-wide average for Tajima's D. This indicates allele frequencies are typically more intermediate than expected from neutrality, opposite the pattern observed in other species. Third, IM exhibits a distinctive haplotype structure. There is a genome-wide excess of positive associations between minor alleles, consistent with an important effect of gene flow from nearby Mimulus populations. The combination of multiple data types, including a novel, tree-based analytic method and estimates for structural polymorphism (inversions) from previous genetic mapping studies, illustrates how the balance of strong local selection, limited dispersal, and meta-population dynamics manifests across the genome.
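The link between a positive Tajima's D and intermediate allele frequencies can be seen in a small worked sketch. It uses the standard Tajima (1989) formula on haplotypes encoded as equal-length strings of '0' and '1'; that encoding, the function name and the toy data are assumptions for illustration, not the authors' pipeline.

```python
import math
from itertools import combinations

def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length '0'/'1' haplotype strings."""
    n = len(haplotypes)
    if n < 4:
        # For n <= 3 the variance term of the statistic is zero.
        raise ValueError("need at least 4 haplotypes")
    length = len(haplotypes[0])
    # S: number of segregating (polymorphic) sites
    s = sum(1 for k in range(length)
            if len({h[k] for h in haplotypes}) > 1)
    if s == 0:
        raise ValueError("no segregating sites")
    # pi: mean number of pairwise differences
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))

# Toy data: variants at 50% frequency versus singleton variants.
d_intermediate = tajimas_d(["11", "11", "00", "00"])
d_singletons = tajimas_d(["1000", "0100", "0010", "0001"])
```

In this toy data the intermediate-frequency sample gives a positive D and the singleton-only sample gives a negative D, matching the interpretation given in the abstract.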

Denisovan Ancestry in East Eurasian and Native American Populations.

10.1101/017475 ◽

2015 ◽

Cited By ~ 1

Author(s):

Pengfei Qin ◽

Mark Stoneking

Keyword(s):

Native American ◽

Gene Flow ◽

New Guinea ◽

Modern Human ◽

Common Source ◽

Human Populations ◽

SNP Data ◽

Genome Wide ◽

Or Gene ◽

The Relationship

Although initial studies suggested that Denisovan ancestry was found only in modern human populations from island Southeast Asia and Oceania, more recent studies have suggested that Denisovan ancestry may be more widespread. However, the geographic extent of Denisovan ancestry has not been determined, and moreover the relationship between the Denisovan ancestry in Oceania and that elsewhere has not been studied. Here we analyze genome-wide SNP data from 2493 individuals from 221 worldwide populations, and show that there is a widespread signal of a very low level of Denisovan ancestry across Eastern Eurasian and Native American (EE/NA) populations. We also verify a higher level of Denisovan ancestry in Oceania than that in EE/NA; the Denisovan ancestry in Oceania is correlated with the amount of New Guinea ancestry, but not the amount of Australian ancestry, indicating that recent gene flow from New Guinea likely accounts for signals of Denisovan ancestry across Oceania. However, Denisovan ancestry in EE/NA populations is equally correlated with their New Guinea or their Australian ancestry, suggesting a common source for the Denisovan ancestry in EE/NA and Oceanian populations. Our results suggest that Denisovan ancestry in EE/NA is derived either from common ancestry with, or gene flow from, the common ancestor of New Guineans and Australians, indicating a more complex history involving East Eurasians and Oceanians than previously suspected.
