Genomic Targets of Positive Selection in Giant Mice from Gough Island

Molecular Biology and Evolution ◽

10.1093/molbev/msaa255 ◽

2020 ◽

Author(s):

Bret A Payseur ◽

Peicheng Jing

Keyword(s):

Positive Selection ◽

Reference Population ◽

Demographic Model ◽

Missense Mutations ◽

Applied Machine Learning ◽

Gough Island ◽

Regulatory Mutations ◽

Approximate Bayesian ◽

Genomic Regions ◽

Gene Ontologies

Abstract A key challenge in understanding how organisms adapt to their environments is to identify the mutations and genes that make it possible. By comparing patterns of sequence variation to neutral predictions across genomes, the targets of positive selection can be located. We applied this logic to house mice that invaded Gough Island, an unusual population that shows phenotypic and ecological hallmarks of selection. We used massively parallel short-read sequencing to survey the genomes of 14 Gough Island mice. We computed a set of summary statistics to capture diverse aspects of variation across these genome sequences, used approximate Bayesian computation to reconstruct a null demographic model, and then applied machine learning to estimate the posterior probability of positive selection in each region of the genome. Using a conservative threshold, 1,463 5-kb windows show strong evidence for positive selection in Gough Island mice but not in a mainland reference population of German mice. Disproportionate shares of these selection windows contain genes that harbor derived nonsynonymous mutations with large frequency differences. Over-represented gene ontologies in selection windows emphasize neurological themes. Inspection of genomic regions harboring many selection windows with high posterior probabilities pointed to genes with known effects on exploratory behavior and body size as potential targets. Some genes in these regions contain candidate adaptive variants, including missense mutations and/or putative regulatory mutations. Our results provide a genomic portrait of adaptation to island conditions and position Gough Island mice as a powerful system for understanding the genetic component of natural selection.
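
To illustrate the approximate Bayesian computation step mentioned in the abstract, here is a minimal rejection-ABC sketch in Python. It is not the authors' pipeline: it fits a single parameter (the population mutation rate theta) of a constant-size neutral coalescent to one summary statistic (the number of segregating sites), whereas the study used a set of summary statistics and a demographic model. The function names, the uniform prior, the tolerance, and the observed value of 30 segregating sites are all assumptions chosen for illustration.

```python
import random


def simulate_segregating_sites(theta, n_samples, rng):
    """Simulate the number of segregating sites under a neutral,
    constant-size coalescent (toy model, hypothetical helper)."""
    total_length = 0.0
    for k in range(n_samples, 1, -1):
        # Waiting time while k lineages remain, in units of 2N generations.
        rate = k * (k - 1) / 2
        total_length += k * rng.expovariate(rate)
    # Mutations fall on the tree as a Poisson process with mean
    # theta / 2 per unit of branch length.
    mean_mutations = theta / 2 * total_length
    # Draw the Poisson count by counting unit-rate arrivals before the mean.
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean_mutations:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def abc_rejection(observed, n_samples, n_draws, tolerance, prior_max, seed=0):
    """Keep prior draws of theta whose simulated statistic lies within
    `tolerance` of the observed statistic."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0, prior_max)  # uniform prior (assumption)
        simulated = simulate_segregating_sites(theta, n_samples, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(theta)
    return accepted


# Hypothetical data: 30 segregating sites observed in 14 sampled sequences.
posterior = abc_rejection(observed=30, n_samples=14, n_draws=20000,
                          tolerance=1, prior_max=30)
print(len(posterior), sum(posterior) / len(posterior))
```

The accepted values approximate the posterior distribution of theta under this toy model. A realistic analysis would compare several statistics at once and would typically use a dedicated coalescent simulator.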


DILS: Demographic Inferences with Linked Selection by using ABC

10.1101/2020.06.15.151597 ◽

2020 ◽

Author(s):

Christelle Fraïsse ◽

Iva Popovic ◽

Clément Mazoyer ◽

Bruno Spataro ◽

Stéphane Delmotte ◽

...

Keyword(s):

Gene Flow ◽

Published Data ◽

Demographic Model ◽

Size Change ◽

Genetic Patterns ◽

And Migration ◽

Approximate Bayesian ◽

Genomic Regions ◽

Analysis Platform ◽

Linked Selection

Abstract We present DILS, a deployable statistical analysis platform for conducting demographic inferences with linked selection from population genomic data using an Approximate Bayesian Computation framework. DILS takes as input single-population or two-population datasets (multilocus fasta sequences) and performs three types of analyses in a hierarchical manner, identifying: 1) the best demographic model to study the importance of gene flow and population size change on the genetic patterns of polymorphism and divergence, 2) the best genomic model to determine whether the effective size Ne and migration rate N.m are heterogeneously distributed along the genome (implying linked selection) and 3) loci in genomic regions most associated with barriers to gene flow. Also available via a web interface, an objective of DILS is to facilitate collaborative research in speciation genomics. Here, we show the performance and limitations of DILS by using simulations, and finally apply the method to published data on a divergence continuum composed of 28 pairs of Mytilus mussel populations/species.


Population genomics insights into the recent evolution of SARS-CoV-2

10.1101/2020.04.21.054122 ◽

2020 ◽

Cited By ~ 2

Author(s):

Maria Vasilarou ◽

Nikolaos Alachiotis ◽

Joanna Garefalaki ◽

Apostolos Beloukas ◽

Pavlos Pavlidis

Keyword(s):

Positive Selection ◽

Exponential Growth ◽

Approximate Bayesian Computation ◽

Computational Analysis ◽

Population Genomics ◽

Full Genome Sequence ◽

The Past ◽

Approximate Bayesian ◽

Genomic Regions ◽

Bat Coronavirus

Abstract The current coronavirus disease 2019 (COVID-19) pandemic is caused by the SARS-CoV-2 virus and is still spreading rapidly worldwide. Full-genome-sequence computational analysis of the SARS-CoV-2 genome will allow us to understand the recent evolutionary events and adaptability mechanisms more accurately, as there is still neither an effective therapeutic nor a prophylactic strategy. In this study, we used population genetics analysis to infer the mutation rate and plausible recombination events that may have contributed to the evolution of the SARS-CoV-2 virus. Furthermore, we localized targets of recent and strong positive selection. The genomic regions that appear to be under positive selection are largely co-localized with regions in which recombination from non-human hosts appeared to have taken place in the past. Our results suggest that the pangolin coronavirus genome may have contributed to the SARS-CoV-2 genome by recombination with the bat coronavirus genome. However, we find evidence for additional recombination events that involve coronavirus genomes from other hosts, i.e., hedgehog and sparrow. Even though recombination events within human hosts cannot be directly assessed, due to the high similarity of SARS-CoV-2 genomes, we infer that recombinations may have recently occurred within human hosts using a linkage disequilibrium analysis. In addition, we employed an Approximate Bayesian Computation approach to estimate the parameters of a demographic scenario involving an exponential growth of the size of the SARS-CoV-2 populations that have infected European, Asian and Northern American cohorts, and we demonstrated that a rapid exponential growth in population size can support the observed polymorphism patterns in SARS-CoV-2 genomes.


Admixture with indigenous people helps local adaptation: admixture-enabled selection in Polynesians

BMC Ecology and Evolution ◽

10.1186/s12862-021-01900-y ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Mariko Isshiki ◽

Izumi Naka ◽

Ryosuke Kimura ◽

Nao Nishida ◽

Takuro Furusawa ◽

...

Keyword(s):

Positive Selection ◽

Local Adaptation ◽

Indigenous People ◽

Cell Function ◽

Homo Sapiens ◽

Selective Neutrality ◽

The Mean ◽

Approximate Bayesian ◽

Candidate Regions ◽

Genomic Regions

Abstract Background: Homo sapiens have experienced admixture many times in the last few thousand years. To examine how admixture affects local adaptation, we investigated genomes of modern Polynesians, who are shaped through admixture between Austronesian-speaking people from Southeast Asia (Asian-related ancestors) and indigenous people in Near Oceania (Papuan-related ancestors). Methods: In this study, local ancestry was estimated across the genome in Polynesians (23 Tongan subjects) to find the candidate regions of admixture-enabled selection contributed by Papuan-related ancestors. Results: The mean proportion of Papuan-related ancestry across the Polynesian genome was estimated as 24.6% (SD = 8.63%), and two genomic regions, the extended major histocompatibility complex (xMHC) region on chromosome 6 and the ATP-binding cassette transporter sub-family C member 11 (ABCC11) gene on chromosome 16, showed proportions of Papuan-related ancestry more than 5 SD greater than the mean (> 67.8%). The coalescent simulation under the assumption of selective neutrality suggested that such signals of Papuan-related ancestry enrichment were caused by positive selection after admixture (false discovery rate = 0.045). ABCC11 harbors a nonsynonymous SNP, rs17822931, which affects apocrine secretory cell function. The approximate Bayesian computation indicated that, in Polynesian ancestors, strong positive selection (s = 0.0217) acted on the ancestral allele of rs17822931 derived from Papuan-related ancestors. Conclusions: Our results suggest that admixture with Papuan-related ancestors contributed to the rapid local adaptation of Polynesian ancestors. Considering frequent admixture events in human evolutionary history, the acceleration of local adaptation through admixture should be a common event in humans.
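
The ancestry cutoff quoted in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the mean, SD, and multiplier reported above; the function name is a hypothetical helper introduced for illustration.

```python
def ancestry_threshold(mean_pct, sd_pct, n_sd):
    """Cutoff (in %) lying n_sd standard deviations above the mean."""
    return mean_pct + n_sd * sd_pct


# Values reported in the abstract: mean 24.6%, SD 8.63%, cutoff at 5 SD.
threshold = ancestry_threshold(24.6, 8.63, 5)
print(f"{threshold:.2f}")  # prints 67.75
```

The result, 24.6 + 5 × 8.63 = 67.75%, is consistent with the reported cutoff of more than 67.8% once rounded to one decimal place.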


Discovery and refinement of muscle weight QTLs in B6 × D2 advanced intercross mice

Physiological Genomics ◽

10.1152/physiolgenomics.00055.2014 ◽

2014 ◽

Vol 46 (16) ◽

pp. 571-582 ◽

Cited By ~ 9

Author(s):

P. Carbonetto ◽

R. Cheng ◽

J. P. Gyekis ◽

C. C. Parker ◽

D. A. Blizard ◽

...

Keyword(s):

Inbred Strains ◽

Mouse Strains ◽

Missense Mutations ◽

Muscle Weight ◽

Hindlimb Muscles ◽

Hindlimb Muscle ◽

Causal Genes ◽

Snp Panel ◽

Genomic Regions ◽

Genetic Contributions

The genes underlying variation in skeletal muscle mass are poorly understood. Although many quantitative trait loci (QTLs) have been mapped in crosses of mouse strains, the limited resolution inherent in these conventional studies has made it difficult to reliably pinpoint the causal genetic variants. The accumulated recombination events in an advanced intercross line (AIL), in which mice from two inbred strains are mated at random for several generations, can improve mapping resolution. We demonstrate these advancements in mapping QTLs for hindlimb muscle weights in an AIL (n = 832) of the C57BL/6J (B6) and DBA/2J (D2) strains, generations F8–F13. We mapped muscle weight QTLs using the high-density MegaMUGA SNP panel. The QTLs highlight the shared genetic architecture of four hindlimb muscles and suggest that the genetic contributions to muscle variation are substantially different in males and females, at least in the B6D2 lineage. Out of the 15 muscle weight QTLs identified in the AIL, nine overlapped the genomic regions discovered in an earlier B6D2 F2 intercross. Mapping resolution, however, was substantially improved in our study to a median QTL interval of 12.5 Mb. Subsequent sequence analysis of the QTL regions revealed 20 genes with nonsense or potentially damaging missense mutations. Further refinement of the muscle weight QTLs using additional functional information, such as gene expression differences between alleles, will be important for discerning the causal genes.


Faculty Opinions recommendation of Population genetic and phylogenetic evidence for positive selection on regulatory mutations at the factor VII locus in humans.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1024813.292862 ◽

2005 ◽

Author(s):

Thomas Mitchell-Olds

Keyword(s):

Positive Selection ◽

Population Genetic ◽

Factor Vii ◽

Regulatory Mutations


Sardinians Genetic Background Explained by Runs of Homozygosity and Genomic Regions under Positive Selection

PLoS ONE ◽

10.1371/journal.pone.0091237 ◽

2014 ◽

Vol 9 (3) ◽

pp. e91237 ◽

Cited By ~ 27

Author(s):

Cornelia Di Gaetano ◽

Giovanni Fiorito ◽

Maria Francesca Ortu ◽

Fabio Rosa ◽

Simonetta Guarrera ◽

...

Keyword(s):

Positive Selection ◽

Genetic Background ◽

Runs Of Homozygosity ◽

Genomic Regions


RAPID COMMUNICATION: Multi-breed validation study unraveled genomic regions associated with puberty traits segregating across tropically adapted breeds

Journal of Animal Science ◽

10.1093/jas/skz121 ◽

2019 ◽

Vol 97 (7) ◽

pp. 3027-3033 ◽

Cited By ~ 4

Author(s):

Thaise P Melo ◽

Marina R S Fortes ◽

Gerardo A Fernandes Junior ◽

Lucia G Albuquerque ◽

Roberto Carvalheiro

Keyword(s):

Candidate Genes ◽

Genome Wide Association Study ◽

Meta Analysis ◽

Reference Population ◽

High Linkage Disequilibrium ◽

P Value ◽

Early Puberty ◽

Sexual Precocity ◽

Tropical Conditions ◽

Genomic Regions

Abstract An efficient strategy to improve QTL detection power is performing across-breed validation studies. Variants segregating across breeds are expected to be in high linkage disequilibrium (LD) with causal mutations affecting economically important traits. The aim of this study was to validate, in a Tropical Composite cattle (TC) population, QTL associations identified for sexual precocity traits in a Nellore and Brahman meta-analysis genome-wide association study. In total, 2,816 TC, 8,001 Nellore, and 2,210 Brahman animals were available for the analysis. For that, genomic regions significantly associated with puberty traits in the meta-analysis study were validated for the following sexual precocity traits in TC: age at first corpus luteum (AGECL), first postpartum anestrus interval (PPAI), and scrotal circumference at 18 months of age (SC). We considered validated QTL those underpinned by significant markers from the Nellore and Brahman meta-analysis (P ≤ 10⁻⁴) that were also significant for a TC trait, i.e., presenting a P-value of ≤ 10⁻³ for AGECL, PPAI, or SC. We also considered as validated QTL those regions where significant markers in the reference population were at ±250 kb from significant markers in the validation population. Using these criteria, 49 SNP were validated for AGECL, 4 for PPAI, and 14 for SC, from which 5 were in common with AGECL, totaling 62 validated SNP for these traits and 30 candidate genes surrounding them. Considering just candidate genes closest to the top SNP of each chromosome, for AGECL 8 candidate genes were identified: COL8A1, PENK, ENSBTAG00000047425, BPNT1, ADAMTS17, CCHCR1, SUFU, and ENSBTAG00000046374. For PPAI, 3 genes emerged as candidates (PCBP3, KCNK10, and MRPS5), and for SC 8 candidate genes were identified (SNORA70, TRAC, ASS1, BPNT1, LRRK1, PKHD1, PTPRM, and ENSBTAG00000045690). Several candidate regions presented here were previously associated with puberty traits in cattle. The majority of emerging candidate genes are related to biological processes involved in reproductive events, such as maintenance of gestation, and some are known to be expressed in reproductive tissues. Our results suggested that some QTL controlling early puberty seem to be segregating across cattle breeds adapted to tropical conditions.
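
The two validation criteria described in this abstract can be sketched in Python as follows. This is one reading of the abstract under stated assumptions, not the authors' code: the function name, the dictionary layout (marker ID mapped to chromosome, position in bp, and P-value), and the example markers are all hypothetical. Only the thresholds (P ≤ 10⁻⁴ in the reference population, P ≤ 10⁻³ in the validation population, ±250 kb) come from the text.

```python
def validate_markers(reference, validation,
                     ref_p=1e-4, val_p=1e-3, window=250_000):
    """Return reference markers that pass either validation criterion.

    Both inputs map marker ID -> (chromosome, position_bp, p_value).
    """
    ref_sig = {m: v for m, v in reference.items() if v[2] <= ref_p}
    val_sig = {m: v for m, v in validation.items() if v[2] <= val_p}
    validated = set()
    for marker, (chrom, pos, _) in ref_sig.items():
        # Criterion 1: the same marker is significant in both populations.
        if marker in val_sig:
            validated.add(marker)
            continue
        # Criterion 2: a significant validation marker on the same
        # chromosome lies within +/- `window` bp of the reference marker.
        if any(c == chrom and abs(p - pos) <= window
               for c, p, _ in val_sig.values()):
            validated.add(marker)
    return validated


# Hypothetical markers, for illustration only.
reference = {
    "snp1": ("1", 1_000_000, 5e-5),
    "snp2": ("1", 5_000_000, 1e-6),
    "snp3": ("2", 2_000_000, 2e-5),
    "snp4": ("3", 100, 0.01),
}
validation = {
    "snp1": ("1", 1_000_000, 5e-4),
    "snp9": ("1", 5_200_000, 1e-4),
    "snp3": ("2", 2_000_000, 0.2),
    "snp4": ("3", 100, 1e-5),
}
print(sorted(validate_markers(reference, validation)))  # ['snp1', 'snp2']
```

Here snp1 passes the first criterion, snp2 passes the second because snp9 lies 200 kb away, and snp3 and snp4 fail because each is significant in only one population.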


ImaGene: a convolutional neural network to quantify natural selection from genomic data

BMC Bioinformatics ◽

10.1186/s12859-019-2927-x ◽

2019 ◽

Vol 20 (S9) ◽

Cited By ~ 6

Author(s):

Luis Torada ◽

Lucrezia Lorenzon ◽

Alice Beddis ◽

Ulas Isildak ◽

Linda Pattini ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Natural Selection ◽

Convolutional Neural Network ◽

Positive Selection ◽

Genomic Data ◽

Demographic Model ◽

Genomic Information ◽

Joint Inference ◽

Genetic Bases

Abstract Background: The genetic bases of many complex phenotypes are still largely unknown, mostly due to the polygenic nature of the traits and the small effect of each associated mutation. An alternative approach to classic association studies to determining such genetic bases is an evolutionary framework. As sites targeted by natural selection are likely to harbor important functionalities for the carrier, the identification of selection signatures in the genome has the potential to unveil the genetic mechanisms underpinning human phenotypes. Popular methods of detecting such signals rely on compressing genomic information into summary statistics, resulting in the loss of information. Furthermore, few methods are able to quantify the strength of selection. Here we explored the use of deep learning in evolutionary biology and implemented a program, called ImaGene, to apply convolutional neural networks on population genomic data for the detection and quantification of natural selection. Results: ImaGene enables genomic information from multiple individuals to be represented as abstract images. Each image is created by stacking aligned genomic data and encoding distinct alleles into separate colors. To detect and quantify signatures of positive selection, ImaGene implements a convolutional neural network which is trained using simulations. We show how the method implemented in ImaGene can be affected by data manipulation and learning strategies. In particular, we show how sorting images by row and column leads to accurate predictions. We also demonstrate how the misspecification of the correct demographic model for producing training data can influence the quantification of positive selection. We finally illustrate an approach to estimate the selection coefficient, a continuous variable, using multiclass classification techniques. Conclusions: While the use of deep learning in evolutionary genomics is in its infancy, here we demonstrated its potential to detect informative patterns from large-scale genomic data. We implemented methods to process genomic data for deep learning in a user-friendly program called ImaGene. The joint inference of the evolutionary history of mutations and their functional impact will facilitate mapping studies and provide novel insights into the molecular mechanisms associated with human phenotypes.
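
The idea of sorting an image by row and column can be sketched on a small binary haplotype matrix (rows are haplotypes, columns are sites, 1 marks one of two alleles). The abstract does not state the sorting criteria, so ordering rows by how often each haplotype occurs and columns by allele count is an assumption made here for illustration. The function names are hypothetical and do not reflect the program's actual interface.

```python
from collections import Counter


def sort_rows_by_frequency(matrix):
    """Order haplotypes (rows) from most to least frequent.
    Ties are broken by the row contents so the result is deterministic."""
    counts = Counter(tuple(row) for row in matrix)
    return sorted(matrix, key=lambda row: (-counts[tuple(row)], tuple(row)))


def sort_columns_by_frequency(matrix):
    """Order sites (columns) from highest to lowest count of allele 1."""
    n_cols = len(matrix[0])
    freq = [sum(row[j] for row in matrix) for j in range(n_cols)]
    order = sorted(range(n_cols), key=lambda j: -freq[j])
    return [[row[j] for j in order] for row in matrix]


# Hypothetical 4-haplotype, 3-site alignment.
image = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 1],
]
by_row = sort_rows_by_frequency(image)
print(by_row)   # [[0, 1, 0], [0, 1, 0], [1, 0, 1], [1, 1, 0]]
print(sort_columns_by_frequency(by_row))
# [[1, 0, 0], [1, 0, 0], [0, 1, 1], [1, 1, 0]]
```

Sorting removes the arbitrary ordering of sampled haplotypes, so that similar inputs produce similar images regardless of sample order.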


Genome-Wide Runs of Homozygosity, Effective Population Size, and Detection of Positive Selection Signatures in Six Chinese Goat Breeds

Genes ◽

10.3390/genes10110938 ◽

2019 ◽

Vol 10 (11) ◽

pp. 938 ◽

Cited By ~ 4

Author(s):

Islam ◽

Li ◽

Liu ◽

Berihulay ◽

Abied ◽

...

Keyword(s):

Genetic Diversity ◽

Genetic Differentiation ◽

Positive Selection ◽

Candidate Genes ◽

Runs Of Homozygosity ◽

Selection Signatures ◽

Effective Population ◽

Signature Of Selection ◽

Genomic Regions ◽

Insight Into

Detection of selection footprints provides insight into the evolution process and the underlying mechanisms controlling the phenotypic diversity of traits that have been exposed to selection. Selection focused on certain characters, mapping to certain genomic regions, often shows a loss of genetic diversity with an increased level of homozygosity. Therefore, the runs of homozygosity (ROHs), homozygosity by descent (HBD), and effective population size (Ne) are effective tools for exploring the genetic diversity, understanding the demographic history, foretelling the signature of directional selection, and improving the breeding strategies to use and conserve genetic resources. We characterized the ROH, HBD, Ne, and signature of selection of six Chinese goat populations using single nucleotide polymorphism (SNP) 50K Illumina beadchips. Our results show an inverse relationship between the length and frequency of ROH. A long ROH length, higher level of inbreeding, long HBD segment, and smaller Ne in Guangfeng (GF) goats suggested intensive selection pressure and recent inbreeding in this breed. We identified six reproduction-related genes within the genomic regions with a high ROH frequency, of which two genes overlapped with a putative selection signature. The estimated pair-wise genetic differentiation (FST) among the populations is 9.60% and the inter- and intra-population molecular variations are 9.68% and 89.6%, respectively, indicating low to moderate genetic differentiation. Our selection signatures analysis revealed 54 loci harboring 86 putative candidate genes, with a strong signature of selection. Further analysis showed that several candidate genes, including MARF1, SYCP2, TMEM200C, SF1, ADCY1, and BMP5, are involved in goat fecundity. We identified 11 candidate genes by using cross-population extended haplotype homozygosity (XP-EHH) estimates, of which MARF1 and SF1 are under strong positive selection, as they are differentiated in high and low reproduction groups according to the three approaches used. Gene ontology enrichment analysis revealed that different biological pathways could be involved in the variation of fecundity in female goats. This study provides a new insight into the ROHs patterns for maintenance of within breed diversity and suggests a role of positive selection for genetic variation influencing fecundity in Chinese goats.
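
As a rough illustration of what a run of homozygosity is, the sketch below scans one individual's genotypes along a chromosome and reports stretches of consecutive homozygous SNPs. It is a deliberately simplified toy and not the method used in the study: the function name, the 0/1/2 genotype coding, and the minimum-SNP threshold are assumptions. Production tools also consider physical length, marker density, missing calls, and a tolerated number of heterozygous sites.

```python
def find_roh(genotypes, min_snps):
    """Return (start, end) index pairs of homozygous runs that span at
    least `min_snps` consecutive SNPs.

    `genotypes` holds alternate-allele counts per SNP:
    0 or 2 = homozygous, 1 = heterozygous.
    """
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g != 1:
            if start is None:
                start = i              # a new homozygous stretch begins
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None               # a heterozygous call ends the stretch
    # Close a stretch that reaches the end of the chromosome.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs


# Hypothetical genotypes for one animal along one chromosome.
calls = [0, 0, 2, 1, 0, 2, 2, 2, 0, 1, 2]
print(find_roh(calls, min_snps=3))  # [(0, 2), (4, 8)]
```

Longer and more numerous runs in an individual point to more recent inbreeding, which is the reasoning the abstract applies to the Guangfeng breed.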
