Fine-mapping the Favored Mutation in a Positive Selective Sweep

2017 ◽  
Author(s):  
Ali Akbari ◽  
Joseph J. Vitti ◽  
Arya Iranmehr ◽  
Mehrdad Bakhtiari ◽  
Pardis C. Sabeti ◽  
...  

Abstract Methods to identify signatures of selective sweeps in population genomics data have been actively developed, but most do not identify the specific mutation favored by the selective sweep. We present a method, iSAFE, that uses a statistic derived solely from population genetics signals to pinpoint the favored mutation even when the signature of selection extends to 5 Mbp. iSAFE was tested extensively on simulated data and in human populations from the 1000 Genomes Project, at 22 loci with previously characterized selective sweeps. For 14 of the 22 loci, iSAFE ranked the previously characterized candidate mutation among the 13 highest scoring (out of ∼21,000 variants). Three loci did not show a strong signal. For the remaining loci, iSAFE identified previously unreported mutations as being favored. In these regions, all of which involve pigmentation-related genes, iSAFE identified identical selected mutations in multiple non-African populations, suggesting an out-of-Africa onset of selection. The iSAFE software can be downloaded from https://github.com/alek0991/iSAFE.

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Mary Elizabeth Mathyer ◽  
Erin A. Brettmann ◽  
Alina D. Schmidt ◽  
Zane A. Goodwin ◽  
Inez Y. Oh ◽  
...  

Abstract The genetic modules that contribute to human evolution are poorly understood. Here we investigate positive selection in the Epidermal Differentiation Complex locus for skin barrier adaptation in diverse HapMap human populations (CEU, JPT/CHB, and YRI). Using Composite of Multiple Signals and iSAFE, we identify selective sweeps for LCE1A-SMCP and involucrin (IVL) haplotypes associated with human migration out-of-Africa, reaching near fixation in European populations. CEU-IVL is associated with increased IVL expression and a known epidermis-specific enhancer. CRISPR/Cas9 deletion of the orthologous mouse enhancer in vivo reveals a functional requirement for the enhancer to regulate Ivl expression in cis. Reporter assays confirm increased regulatory and additive enhancer effects of CEU-specific polymorphisms identified at predicted IRF1 and NFIC binding sites in the IVL enhancer (rs4845327) and its promoter (rs1854779). Together, our results identify a selective sweep for a cis-regulatory module for CEU-IVL, highlighting human skin barrier evolution for increased IVL expression out-of-Africa.


2019 ◽  
Author(s):  
Anna Tigano ◽  
Jocelyn P. Colella ◽  
Matthew D. MacManes

Abstract Organisms that live in deserts offer the opportunity to investigate how species adapt to environmental conditions that are lethal to most plants and animals. In the hot deserts of North America, high temperatures and lack of water are conspicuous challenges for organisms living there. The cactus mouse (Peromyscus eremicus) displays several adaptations to these conditions, including low metabolic rate, heat tolerance, and the ability to maintain homeostasis under extreme dehydration. To investigate the genomic basis of desert adaptation in cactus mice, we built a chromosome-level genome assembly and resequenced 26 additional cactus mouse genomes from two locations in southern California (USA). Using these data, we integrated comparative, population, and functional genomic approaches. We identified 16 gene families exhibiting significant contractions or expansions in the cactus mouse compared to 17 other Myodontine rodent genomes, and found 232 sites across the genome associated with selective sweeps. Functional annotations of candidate gene families and selective sweeps revealed a pervasive signature of selection at genes involved in the synthesis and degradation of proteins, consistent with the evolution of cellular mechanisms to cope with protein denaturation caused by thermal and hyperosmotic stress. Other strong candidate genes included receptors for bitter taste, suggesting a dietary shift towards chemically defended desert plants and insects, and a growth factor involved in lipid metabolism, potentially involved in prevention of dehydration. Understanding how species adapted to the recent emergence of deserts in North America will provide an important foundation for predicting future evolutionary responses to increasing temperatures, droughts and desertification in the cactus mouse and other species.


2020 ◽  
Vol 37 (10) ◽  
pp. 3023-3046
Author(s):  
Alexandre M Harris ◽  
Michael DeGiorgio

Abstract Selective sweeps are frequent and varied signatures in the genomes of natural populations, and detecting them is consequently important in understanding mechanisms of adaptation by natural selection. Following a selective sweep, haplotypic diversity surrounding the site under selection decreases, and this deviation from the background pattern of variation can be applied to identify sweeps. Multiple methods exist to locate selective sweeps in the genome from haplotype data, but none leverages the power of a model-based approach to make their inference. Here, we propose a likelihood ratio test statistic T to probe whole-genome polymorphism data sets for selective sweep signatures. Our framework uses a simple but powerful model of haplotype frequency spectrum distortion to find sweeps and additionally make an inference on the number of presently sweeping haplotypes in a population. We found that the T statistic is suitable for detecting both hard and soft sweeps across a variety of demographic models, selection strengths, and ages of the beneficial allele. Accordingly, we applied the T statistic to variant calls from European and sub-Saharan African human populations, yielding primarily literature-supported candidates, including LCT, RSPH3, and ZNF211 in CEU, SYT1, RGS18, and NNT in YRI, and HLA genes in both populations. We also searched for sweep signatures in Drosophila melanogaster, finding expected candidates at Ace, Uhg1, and Pimet. Finally, we provide open-source software to compute the T statistic and the inferred number of presently sweeping haplotypes from whole-genome data.


2013 ◽  
Vol 45 (15) ◽  
pp. 667-683 ◽  
Author(s):  
Jessica H. Geahlen ◽  
Carlo Lapid ◽  
Kaisa Thorell ◽  
Igor Nikolskiy ◽  
Won Jae Huh ◽  
...  

In a screen for genes expressed specifically in gastric mucous neck cells, we identified GKN3, the recently discovered third member of the gastrokine family. We present confirmatory mouse data and novel porcine data showing that mouse GKN3 expression is confined to mucous cells of the corpus neck and antrum base and is prominently expressed in metaplastic lesions. GKN3 was proposed originally to be expressed in some human populations and a pseudogene in others. To investigate that hypothesis, we studied human GKN3 evolution in the context of its paralogous genomic neighbors, GKN1 and GKN2. Haplotype analysis revealed that GKN3 mimics GKN2 in patterns of exonic SNP allocation, whereas GKN1 appeared to be more stringently selected. GKN3 showed signatures of both directional selection and population-based selective sweeps in humans. One such selective sweep includes SNP rs10187256, originally identified as an ancestral tryptophan to premature STOP codon mutation. The derived (nonancestral) allele went to fixation in Asia. We show that another SNP, rs75578132, identified 5 bp downstream of rs10187256, exhibits a second selective sweep in almost all Europeans, some Latinos, and some Africans, possibly resulting from a reintroduction of European genes during African colonization. Finally, we identify a mutation that would destroy the splice donor site in the putative exon3-intron3 boundary, which occurs in all human genomes examined to date. Our results highlight a stomach-specific human genetic locus, which has undergone various selective sweeps across European, Asian, and African populations and thus reflects geographic and ethnic patterns in genome evolution.


Genetics ◽  
2002 ◽  
Vol 160 (2) ◽  
pp. 753-763 ◽  
Author(s):  
Christian Schlötterer

Abstract With the availability of completely sequenced genomes, multilocus scans of natural variability have become a feasible approach for the identification of genomic regions subjected to natural and artificial selection. Here, I introduce a new multilocus test statistic, ln RV, which is based on the ratio of observed variances in repeat number at a set of microsatellite loci in two groups of populations. The distribution of ln RV values captures demographic history of the populations as well as variation in microsatellite mutation among loci. Given that microsatellite loci associated with a recent selective sweep differ from the remainder of the genome, they are expected to fall outside of the distribution of neutral ln RV values. The ln RV test statistic is applied to a data set of 94 loci typed in eight non-African and two African human populations.
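
The ln RV statistic described in this abstract can be illustrated with a small sketch. It assumes that the variance in repeat number is computed per locus within each group of populations and that the resulting ln RV values are then standardized across loci. The function names and the toy repeat counts are hypothetical and are not taken from the paper.

```python
import math
import statistics


def ln_rv(repeats_group1, repeats_group2):
    """Natural log of the ratio of repeat-number variances at one locus."""
    v1 = statistics.variance(repeats_group1)
    v2 = statistics.variance(repeats_group2)
    if v1 <= 0 or v2 <= 0:
        raise ValueError("both groups need non-zero variance in repeat number")
    return math.log(v1 / v2)


def standardize(values):
    """Z-scores of the values relative to their own mean and standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical repeat counts per locus: (group 1 sample, group 2 sample).
toy_loci = {
    "locus_A": ([10, 11, 12, 13, 14], [9, 11, 12, 13, 15]),
    "locus_B": ([20, 22, 21, 23, 24], [20, 21, 22, 23, 25]),
    "locus_C": ([15, 15, 16, 15, 15], [12, 14, 16, 18, 20]),  # low variance in group 1
    "locus_D": ([30, 31, 33, 32, 34], [29, 31, 32, 33, 34]),
}

names = list(toy_loci)
raw = [ln_rv(g1, g2) for g1, g2 in toy_loci.values()]
for name, value, z in zip(names, raw, standardize(raw)):
    print(f"{name}: ln RV = {value:.3f}, z = {z:.3f}")
```

In a real scan, the cutoff for calling a locus an outlier would be chosen against the distribution of neutral ln RV values; the sketch only prints standardized values and makes no significance call.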


2019 ◽  
Author(s):  
Clement Goubert ◽  
Jainy Thomas ◽  
Lindsay M. Payer ◽  
Jeffrey M. Kidd ◽  
Julie Feusier ◽  
...  

Abstract Alu retrotransposons account for more than 10% of the human genome, and insertions of these elements create structural variants segregating in human populations. Such polymorphic Alus are powerful markers to understand population structure, and they represent variants that can greatly impact genome function, including gene expression. Accurate genotyping of Alus and other mobile elements has been challenging. Indeed, we found that Alu genotypes previously called for the 1000 Genomes Project are sometimes erroneous, which poses significant problems for phasing these insertions with other variants that comprise the haplotype. To ameliorate this issue, we introduce a new pipeline, TypeTE, which genotypes Alu insertions from whole-genome sequencing data. Starting from a list of polymorphic Alus, TypeTE identifies the hallmarks (poly-A tail and target site duplication) and orientation of Alu insertions using local re-assembly to reconstruct presence and absence alleles. Genotype likelihoods are then computed after re-mapping sequencing reads to the reconstructed alleles. Using a ‘gold standard’ set of PCR-based genotyping of >200 loci, we show that TypeTE improves genotype accuracy from 83% to 92% in the 1000 Genomes dataset. TypeTE can be readily adapted to other retrotransposon families and brings a valuable toolbox addition for population genomics.


Genetics ◽  
2003 ◽  
Vol 165 (3) ◽  
pp. 1137-1148
Author(s):  
M O Kauer ◽  
D Dieringer ◽  
C Schlötterer

Abstract We report a “hitchhiking mapping” study in D. melanogaster, which searches for genomic regions with reduced variability. The study's aim was to identify selective sweeps associated with the “out of Africa” habitat expansion. We scanned 103 microsatellites on chromosome 3 and 102 microsatellites on the X chromosome for reduced variability in non-African populations. When the chromosomes were analyzed separately, the number of loci with a significant reduction in variability only slightly exceeded the expectation under neutrality—six loci on the third chromosome and four loci on the X chromosome. However, non-African populations also have a more pronounced average loss in variability on the X chromosomes as compared to the third chromosome, which suggests the action of selection. Therefore, comparing the X chromosome to the autosome yields a higher number of significantly reduced loci. However, a more pronounced loss of variability on the X chromosome may be caused by demographic events rather than by natural selection. We therefore explored a range of demographic scenarios and found that some of these captured most, but not all aspects of our data. More theoretical work is needed to evaluate how demographic events might differentially affect X chromosomes and autosomes and to estimate the most likely scenario associated with the out of Africa expansion of D. melanogaster.


2019 ◽  
Author(s):  
William Amos

Abstract The idea that humans interbred with other hominins, most notably Neanderthals, is now accepted as fact. The finding of hybrid skeletons shows that fertile matings did occur. However, inferences about the size of the resulting legacy assume that back-mutations are rare enough to be ignored and that mutation rate does not vary. In reality, back-mutations are common, mutation rate does vary between populations and there is mounting evidence that heterozygosity and mutation rate covary. If so, the large loss of heterozygosity that occurred when humans migrated out of Africa would have reduced the mutation rate, leaving Africans to diverge faster from our common ancestor and from related lineages like Neanderthals. To test whether this idea impacts estimates of introgressed fraction, I calculated D, a measure of relative base-sharing with Neanderthals, and heterozygosity difference between all pairwise combinations of populations in the 1000 Genomes Phase 3 data. D and heterozygosity difference are ubiquitously negatively correlated across all comparisons, between all regions and even between populations within each major region including Africa. In addition, the larger sample of populations in the Simons Genome Diversity Project reveals a pan-Eurasian correlation between Neanderthal and Denisovan fraction. These correlations challenge a simple hybridisation model but do seem consistent with a model where more heterozygous human populations tend to diverge faster from Neanderthals than populations with lower heterozygosity. Indeed, the strongest correlation between Neanderthal content and geography indicates an origin where humans likely left Africa, exactly mimicking the pattern seen for loss of heterozygosity. Such a model explains why evidence for inter-breeding is found more or less wherever archaic and human populations are compared. How much of variation in D is due to introgression and how much is due to heterozygosity-mediated variation in mutation rate remains to be determined.

Author summary The idea that humans inter-bred with related lineages such as Neanderthals, leaving an appreciable legacy in modern genomes, has rapidly progressed from shocking revelation to accepted dogma. My analysis explores an alternative model in which mutation rate slowed when diversity was lost in a population bottleneck as humans moved out of Africa to colonise the world. I find that, across Eurasia, the size of inferred legacy closely matches the pattern of diversity loss but shows no relationship to where human and Neanderthal populations likely overlapped. My results do not challenge the idea that some inter-breeding occurred, but they do indicate that some, much or even most of the signal that has been attributed entirely to archaic legacies arises from unexpected variation in mutation rate. More generally, my analysis helps explain why inter-breeding is inferred almost wherever tests are conducted even though most species avoid hybridisation.
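
The abstract describes D only as a measure of relative base-sharing with Neanderthals. As an illustration, the sketch below computes the commonly used ABBA-BABA form of D from site-pattern counts. Whether this matches the exact calculation used in the study is an assumption, and the function name and the 0/1 allele encoding are hypothetical.

```python
def d_statistic(sites):
    """ABBA-BABA D from biallelic sites.

    Each site is a tuple (p1, p2, p3, outgroup) with 0 = ancestral and
    1 = derived. Here p1 and p2 stand for two human populations and p3
    for the archaic genome (an assumed setup, not taken from the abstract).
    """
    abba = 0
    baba = 0
    for p1, p2, p3, outgroup in sites:
        if (p1, p2, p3, outgroup) == (0, 1, 1, 0):
            abba += 1  # p2 shares the derived allele with p3
        elif (p1, p2, p3, outgroup) == (1, 0, 1, 0):
            baba += 1  # p1 shares the derived allele with p3
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba)


# Toy input: three ABBA sites, one BABA site, one uninformative site.
toy_sites = [(0, 1, 1, 0)] * 3 + [(1, 0, 1, 0)] + [(1, 1, 1, 0)]
print(d_statistic(toy_sites))
```

Under this convention a positive value means the second population shares more derived alleles with the archaic genome than the first does. The abstract's argument is that such an excess need not come from introgression alone.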


2020 ◽  
Vol 48 (6) ◽  
pp. e36-e36 ◽  
Author(s):  
Clément Goubert ◽  
Jainy Thomas ◽  
Lindsay M Payer ◽  
Jeffrey M Kidd ◽  
Julie Feusier ◽  
...  

Abstract Alu retrotransposons account for more than 10% of the human genome, and insertions of these elements create structural variants segregating in human populations. Such polymorphic Alus are powerful markers to understand population structure, and they represent variants that can greatly impact genome function, including gene expression. Accurate genotyping of Alus and other mobile elements has been challenging. Indeed, we found that Alu genotypes previously called for the 1000 Genomes Project are sometimes erroneous, which poses significant problems for phasing these insertions with other variants that comprise the haplotype. To ameliorate this issue, we introduce a new pipeline – TypeTE – which genotypes Alu insertions from whole-genome sequencing data. Starting from a list of polymorphic Alus, TypeTE identifies the hallmarks (poly-A tail and target site duplication) and orientation of Alu insertions using local re-assembly to reconstruct presence and absence alleles. Genotype likelihoods are then computed after re-mapping sequencing reads to the reconstructed alleles. Using a high-quality set of PCR-based genotyping of >200 loci, we show that TypeTE improves genotype accuracy from 83% to 92% in the 1000 Genomes dataset. TypeTE can be readily adapted to other retrotransposon families and brings a valuable toolbox addition for population genomics.


2015 ◽  
Author(s):  
Roy Ronen ◽  
Glenn Tesler ◽  
Ali Akbari ◽  
Shay Zakov ◽  
Noah A Rosenberg ◽  
...  

Methods for detecting the genomic signatures of natural selection have been heavily studied, and they have been successful in identifying many selective sweeps. For most of these sweeps, the favored allele remains unknown, making it difficult to distinguish carriers of the sweep from non-carriers. In an ongoing selective sweep, carriers of the favored allele are likely to contain a future most recent common ancestor. Therefore, identifying them may prove useful in predicting the evolutionary trajectory — for example, in contexts involving drug-resistant pathogen strains or cancer subclones. The main contribution of this paper is the development and analysis of a new statistic, the Haplotype Allele Frequency (HAF) score. The HAF score, assigned to individual haplotypes in a sample, naturally captures many of the properties shared by haplotypes carrying a favored allele. We provide a theoretical framework for computing expected HAF scores under different evolutionary scenarios, and we validate the theoretical predictions with simulations. As an application of HAF score computations, we develop an algorithm (PreCIOSS: Predicting Carriers of Ongoing Selective Sweeps) to identify carriers of the favored allele in selective sweeps, and we demonstrate its power on simulations of both hard and soft sweeps, as well as on data from well-known sweeps in human populations.
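
The abstract does not spell out how the HAF score is computed. The sketch below assumes one simple reading: the score of a haplotype is the sum, over the derived alleles it carries, of how many haplotypes in the sample carry each of those alleles. The function name, the 0/1 matrix encoding, and the toy sample are hypothetical.

```python
def haf_scores(haplotypes):
    """Per-haplotype scores for a binary haplotype matrix.

    haplotypes: list of equal-length lists, 1 = derived allele, 0 = ancestral.
    Assumed definition: sum of sample-wide derived-allele counts over the
    derived alleles that the haplotype carries.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("all haplotypes must cover the same sites")
    # Number of haplotypes carrying the derived allele at each site.
    derived_counts = [sum(h[j] for h in haplotypes) for j in range(n_sites)]
    return [
        sum(count for allele, count in zip(h, derived_counts) if allele == 1)
        for h in haplotypes
    ]


# Toy sample: the first three haplotypes share a common derived allele.
toy_sample = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
]
print(haf_scores(toy_sample))  # -> [5, 5, 4, 0]
```

Haplotypes that carry many high-frequency derived alleles receive high scores, which is the kind of property the abstract says is shared by carriers of a favored allele.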

