BetaScan2: Standardized statistics to detect balancing selection utilizing substitution data

2018 ◽  
Author(s):  
Katherine M Siewert ◽  
Benjamin F Voight

The recently reported statistic β detects balanced haplotypes without the need for genomic data from an outgroup species. Here we present an extension to the method that incorporates between-species substitution data into the β statistic framework. We show that this approach outperforms existing summary statistics in simulations. We also present the variance of β with and without substitution data, allowing calculation of a standardized score. Besides providing a measure of significance, this enables a proper comparison of β values across varying underlying parameters, a feature lacking from some related methods.

Author(s):  
Vivak Soni ◽  
Michiel Vos ◽  
Adam Eyre-Walker

Abstract The role that balancing selection plays in the maintenance of genetic diversity remains unresolved. Here we introduce a new test, based on the McDonald-Kreitman test, in which the number of polymorphisms that are shared between populations is contrasted to those that are private at selected and neutral sites. We show that this simple test is robust to a variety of demographic changes, and that it can also give a direct estimate of the number of shared polymorphisms that are directly maintained by balancing selection. We apply our method to population genomic data from humans and conclude that more than a thousand non-synonymous polymorphisms are subject to balancing selection.
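The contrast described in this abstract can be laid out as a 2x2 table: site class (selected, neutral) against polymorphism class (shared, private). The sketch below is only an illustration by analogy with the classic McDonald-Kreitman test. The function names, the example counts, and in particular the "excess" estimator (observed shared selected polymorphisms minus the number expected from the neutral shared-to-private ratio) are assumptions, not the authors' published estimator.

```python
from math import comb


def excess_shared(shared_sel, private_sel, shared_neu, private_neu):
    """Shared selected polymorphisms beyond the neutral expectation.

    ASSUMPTION: an MK-style estimator chosen for illustration only;
    the abstract does not state the formula the authors use.
    """
    if private_neu == 0:
        raise ValueError("need at least one private neutral polymorphism")
    expected = private_sel * shared_neu / private_neu
    return shared_sel - expected


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact P-value for enrichment of cell a.

    Table layout: rows are site classes (selected, neutral),
    columns are polymorphism classes (shared, private).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for tables at least as extreme.
    tail = sum(comb(row1, x) * comb(row2, col1 - x) for x in range(a, upper + 1))
    return tail / denom


# Hypothetical counts, not data from the paper.
print(excess_shared(30, 100, 20, 200))  # 20.0
print(fisher_one_sided(30, 100, 20, 200))
```

A positive excess together with a small P-value would point to more shared polymorphism at selected sites than the neutral sites predict. Whether that excess reflects balancing selection depends on the robustness arguments made in the paper itself.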


2021 ◽  
Author(s):  
Cooper Alastair Grace ◽  
Sarah Forrester ◽  
Vladimir Costa Silva ◽  
Aleksander Aare ◽  
Hannah Kilford ◽  
...  

Abstract The Leishmania donovani species complex comprises the causative agents of visceral leishmaniasis, which causes 20,000-40,000 fatalities a year. Here, we conduct a screen for balancing selection in this species complex. We sequence 93 isolates of L. infantum from Brazil and use 387 publicly available L. donovani and L. infantum genomes to describe the global diversity of this species complex. We identify five genetically distinct populations that are sufficiently represented by genomic data to search for signatures of selection. We show that multiple metrics identify genes with robust signatures of balancing selection. We produce a curated set of 19 genes with robust signatures, including zeta toxin, nodulin-like and flagellum attachment proteins. Candidate genes were generally not shared between populations, consistent with divergent rather than long-term balancing selection in these species. This study highlights the extent of genetic divergence between L. donovani complex parasites and provides candidate genes for further study.


Author(s):  
Ulas Isildak ◽  
Alessandro Stella ◽  
Matteo Fumagalli

Abstract Balancing selection is an important adaptive mechanism underpinning a wide range of phenotypes. Despite its relevance, the detection of recent balancing selection from genomic data is challenging as its signatures are qualitatively similar to those left by ongoing positive selection. In this study we developed and implemented two deep neural networks and tested their performance in predicting loci under recent selection, either due to balancing selection or incomplete sweep, from population genomic data. Specifically, we generated forward-in-time simulations to train and test an artificial neural network (ANN) and a convolutional neural network (CNN). ANN received as input multiple summary statistics calculated on the locus of interest, while CNN was applied directly on the matrix of haplotypes. We found that both architectures identify loci under recent selection with high accuracy. CNN generally outperformed ANN in distinguishing between signals of balancing selection and incomplete sweep and was less affected by incorrect training data. We deployed both trained networks on neutral genomic regions in European populations and demonstrated a lower false positive rate for CNN than ANN. We finally deployed CNN within the MEFV gene region and identified several common variants predicted to be under incomplete sweep in a European population. Notably, two of these variants are functional changes and could modulate susceptibility to Familial Mediterranean Fever, possibly as a consequence of past adaptation to pathogens. In conclusion, deep neural networks were able to characterise signals of selection on intermediate-frequency variants, an analysis currently inaccessible by commonly used strategies.
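To make the two input representations concrete, the sketch below reduces a small 0/1 haplotype matrix (rows are haplotypes, columns are sites) to two standard summary statistics. The abstract does not list which statistics the ANN received, so the choice of segregating sites and nucleotide diversity, the function names, and the toy matrix are all assumptions made for illustration. A CNN, by contrast, would take the matrix itself as input.

```python
def derived_counts(haplotypes):
    """Number of derived (1) alleles at each site, column by column."""
    return [sum(column) for column in zip(*haplotypes)]


def segregating_sites(haplotypes):
    """Count sites where both alleles are present in the sample."""
    n = len(haplotypes)
    return sum(1 for k in derived_counts(haplotypes) if 0 < k < n)


def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences across the locus."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    # A site with k derived copies differs in k * (n - k) of the
    # n * (n - 1) / 2 haplotype pairs.
    pairs_differing = sum(k * (n - k) for k in derived_counts(haplotypes))
    return 2 * pairs_differing / (n * (n - 1))


# Toy matrix (hypothetical): 4 haplotypes by 3 sites.
toy = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
summary_vector = [segregating_sites(toy), nucleotide_diversity(toy)]
print(summary_vector)
```

The summary vector discards the arrangement of alleles along haplotypes, which is the information a network operating on the full matrix can still use.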


2020 ◽  
Vol 37 (11) ◽  
pp. 3267-3291 ◽  
Author(s):  
Xiaoheng Cheng ◽  
Michael DeGiorgio

Abstract Long-term balancing selection typically leaves narrow footprints of increased genetic diversity, and therefore most detection approaches only achieve optimal performances when sufficiently small genomic regions (i.e., windows) are examined. Such methods are sensitive to window sizes and suffer substantial losses in power when windows are large. Here, we employ mixture models to construct a set of five composite likelihood ratio test statistics, which we collectively term B statistics. These statistics are agnostic to window sizes and can operate on diverse forms of input data. Through simulations, we show that they exhibit comparable power to the best-performing current methods, and retain substantially high power regardless of window sizes. They also display considerable robustness to high mutation rates and uneven recombination landscapes, as well as an array of other common confounding scenarios. Moreover, we applied a specific version of the B statistics, termed B2, to a human population-genomic data set and recovered many top candidates from prior studies, including the then-uncharacterized STPG2 and CCDC169–SOHLH2, both of which are related to gamete functions. We further applied B2 on a bonobo population-genomic data set. In addition to the MHC-DQ genes, we uncovered several novel candidate genes, such as KLRD1, involved in viral defense, and SCN9A, associated with pain perception. Finally, we show that our methods can be extended to account for multiallelic balancing selection and integrated the set of statistics into open-source software named BalLeRMix for future applications by the scientific community.


2020 ◽  
Author(s):  
Theodore G. Drivas ◽  
Anastasia Lucas ◽  
Marylyn D. Ritchie

Summary Genomic studies increasingly integrate expression quantitative trait loci (eQTL) information into their analysis pipelines, but few tools exist for the visualization of colocalization between eQTL and GWAS results. To address this issue, we developed the intuitive R package eQTpLot, which takes as input GWAS and eQTL summary statistics to generate a series of plots visualizing colocalization, correlation, and enrichment between eQTL and GWAS signals for a given gene-trait pair. We believe eQTpLot will prove a useful tool for investigators seeking a convenient and customizable visualization of genomic data colocalization. Availability and Implementation: The eQTpLot R package and tutorial are available at https://github.com/RitchieLab/[email protected]


2020 ◽  
Vol 12 (2) ◽  
pp. 3873-3877 ◽  
Author(s):  
Katherine M Siewert ◽  
Benjamin F Voight

Abstract Long-term balancing selection results in a build-up of alleles at similar frequencies and a deficit of substitutions when compared with an outgroup at a locus. The previously published β(1) statistics detect balancing selection using only polymorphism data. We now propose the β(2) statistic which detects balancing selection using both polymorphism and substitution data. In addition, we derive the variance of all β statistics, allowing for their standardization and thereby reducing the influence of parameters which can confound other selection tests. The standardized β statistics outperform existing summary statistics in simulations, indicating β is a well-powered and widely applicable approach for detecting balancing selection. We apply the β(2) statistic to 1000 Genomes data and report two missense mutations with high β scores in the ACSBG2 gene. An implementation of all β statistics and their standardization is available in the BetaScan2 software package at https://github.com/ksiewert/BetaScan.
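The standardization step mentioned in this abstract can be pictured as an ordinary z-score. The sketch below assumes that form (observed value minus expected value, divided by the square root of the variance). The function name and arguments are hypothetical and are not part of the BetaScan2 interface, and the expectation and variance must come from the derivations in the paper, which are not reproduced here.

```python
from math import sqrt


def standardize(beta, expected_under_null, variance):
    """Return (beta - expected) / sqrt(variance).

    ASSUMPTION: a generic z-score form; the caller supplies the
    expectation and variance, which this sketch does not derive.
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    return (beta - expected_under_null) / sqrt(variance)


# Hypothetical values: the same raw score at two loci whose variances differ.
print(standardize(0.5, 0.0, 0.25))  # 1.0
print(standardize(0.5, 0.0, 0.04))
```

Dividing by the standard deviation is what allows scores from loci with different underlying parameters to be placed on a common scale, which is the comparison the abstract describes.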


2020 ◽  
Author(s):  
Jean Cury ◽  
Benjamin C. Haller ◽  
Guillaume Achaz ◽  
Flora Jay

Simulation of genomic data is a key tool in population genetics, yet, to date, there is no forward-in-time simulator of bacterial populations that is both computationally efficient and adaptable to a wide range of scenarios. Here we demonstrate how to simulate bacterial populations with SLiM, a forward-in-time simulator built for eukaryotes. SLiM has gained many users in recent years, due to its speed and power, and has extensive documentation showcasing various scenarios that it can simulate. This paper focuses on a simple demographic scenario, to explore unique aspects of modeling bacteria in SLiM's scripting language. In addition, we illustrate the flexibility of SLiM by simulating the growth of bacteria on a Petri dish with antibiotic. To foster the development of bacterial simulations based upon this recipe, we explain the inner workings of its code. We also validate the simulator, by extensively testing the results of simulations against existing simulators, and against theoretical expectations for some summary statistics. This protocol, with the flexibility and power of SLiM, will enable the community to simulate bacterial populations efficiently under a wide range of evolutionary scenarios.


2012 ◽  
Author(s):  
Mouna Attarha ◽  
Shaun P. Vecera ◽  
Cathleen M. Moore

2001 ◽  
Vol 88 (3) ◽  
pp. 1130
Author(s):  
RICHARD A. CHARTER
