Applicability of the Mutation–Selection Balance Model to Population Genetics of Heterozygous Protein-Truncating Variants in Humans

2019 ◽  
Vol 36 (8) ◽  
pp. 1701-1710 ◽  
Author(s):  
Donate Weghorn ◽  
Daniel J Balick ◽  
Christopher Cassa ◽  
Jack A Kosmicki ◽  
Mark J Daly ◽  
...  

Abstract The fate of alleles in the human population is believed to be highly affected by the stochastic force of genetic drift. Estimation of the strength of natural selection in humans generally necessitates a careful modeling of drift including complex effects of the population history and structure. Protein-truncating variants (PTVs) are expected to evolve under strong purifying selection and to have a relatively high per-gene mutation rate. Thus, it is appealing to model the population genetics of PTVs under a simple deterministic mutation–selection balance, as has been proposed earlier (Cassa et al. 2017). Here, we investigated the limits of this approximation using both computer simulations and data-driven approaches. Our simulations rely on a model of demographic history estimated from 33,370 individual exomes of the Non-Finnish European subset of the ExAC data set (Lek et al. 2016). Additionally, we compared the African and European subset of the ExAC study and analyzed de novo PTVs. We show that the mutation–selection balance model is applicable to the majority of human genes, but not to genes under the weakest selection.
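
The abstract does not state the balance condition itself. As a rough illustration only, the textbook deterministic result for a rare deleterious allele under heterozygous selection is an equilibrium frequency of about U / s_het. In the sketch below, the names `U` (per-gene PTV mutation rate per generation), `s_het` (selection coefficient against heterozygotes), and both function names are assumptions introduced for illustration, not quantities or code from the paper. Drift and demography are deliberately ignored, which is the approximation whose limits the study examines.

```python
def equilibrium_frequency(U, s_het):
    """Textbook deterministic balance for a rare allele: q ~= U / s_het.

    U and s_het are hypothetical parameter names used only in this sketch.
    """
    if s_het <= 0:
        raise ValueError("s_het must be positive for a balance to exist")
    return U / s_het


def iterate_frequency(U, s_het, generations, q0=0.0):
    """Iterate the rare-allele recurrence q' = q * (1 - s_het) + U.

    This linear recurrence ignores drift and terms of order q**2,
    so it is only a sketch of the deterministic approximation.
    """
    q = q0
    for _ in range(generations):
        q = q * (1.0 - s_het) + U
    return q


if __name__ == "__main__":
    U, s_het = 1e-6, 0.1  # illustrative values, not estimates from the paper
    print(equilibrium_frequency(U, s_het))
    print(iterate_frequency(U, s_het, generations=1000))
```

With strong selection the recurrence converges to U / s_het within tens of generations. With small s_het the approach takes far longer, which is one intuitive reason a deterministic balance is less suitable for weakly selected genes.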

2018 ◽  
Author(s):  
Donate Weghorn ◽  
Daniel J. Balick ◽  
Christopher Cassa ◽  
Jack Kosmicki ◽  
Mark J. Daly ◽  
...  

Abstract The fate of alleles in the human population is believed to be highly affected by the stochastic force of genetic drift. Estimation of the strength of natural selection in humans generally necessitates a careful modeling of drift including complex effects of the population history and structure. Protein truncating variants (PTVs) are expected to evolve under strong purifying selection and to have a relatively high per-gene mutation rate. Thus, it is appealing to model the population genetics of PTVs under a simple deterministic mutation-selection balance, as has been proposed earlier [1]. Here, we investigated the limits of this approximation using both computer simulations and data-driven approaches. Our simulations rely on a model of demographic history estimated from 33,370 individual exomes of the Non-Finnish European subset of the ExAC dataset [2]. Additionally, we compared the African and European subset of the ExAC study and analyzed de novo PTVs. We show that the mutation-selection balance model is applicable to the majority of human genes, but not to genes under the weakest selection.


2017 ◽  
Author(s):  
Hilary C. Martin ◽  
Elizabeth M. Batty ◽  
Julie Hussin ◽  
Portia Westall ◽  
Tasman Daish ◽  
...  

Abstract The platypus is an egg-laying mammal which, alongside the echidna, occupies a unique place in the mammalian phylogenetic tree. Despite widespread interest in its unusual biology, little is known about its population structure or recent evolutionary history. To provide new insights into the dispersal and demographic history of this iconic species, we sequenced the genomes of 57 platypuses from across the whole species range in eastern mainland Australia and Tasmania. Using a highly improved reference genome, we called over 6.7M SNPs, providing an informative genetic data set for population analyses. Our results show very strong population structure in the platypus, with our sampling locations corresponding to discrete groupings between which there is no evidence for recent gene flow. Genome-wide data allowed us to establish that 28 of the 57 sampled individuals had at least a third-degree relative amongst other samples from the same river, often taken at different times. Taking advantage of a sampled family quartet, we estimated the de novo mutation rate in the platypus at 7.0×10⁻⁹/bp/generation (95% CI 4.1×10⁻⁹ to 1.2×10⁻⁸/bp/generation). We estimated effective population sizes of ancestral populations and haplotype sharing between current groupings, and found evidence for bottlenecks and long-term population decline in multiple regions, and early divergence between populations in different regions. This study demonstrates the power of whole-genome sequencing for studying natural populations of an evolutionarily important species.


2018 ◽  
Author(s):  
Marijke Autenrieth ◽  
Stefanie Hartmann ◽  
Ljerka Lah ◽  
Anna Roos ◽  
Alice B. Dennis ◽  
...  

Abstract The harbour porpoise (Phocoena phocoena) is a highly mobile cetacean found in waters across the Northern hemisphere. It occurs in coastal waters and inhabits water basins that vary broadly in salinity, temperature, and food availability. These diverse habitats could drive differentiation among populations. Here we report the first harbour porpoise genome, assembled de novo from a Swedish Kattegat individual. The genome is one of the most complete cetacean genomes currently available, with a total size of 2.7 Gb and 50% of the total length found in just 34 scaffolds. Using the largest 122 scaffolds, we were able to validate a high level of homology to the chromosome-level genome assembly of the closest related species for which such a resource was available, domestic cattle (Bos taurus). The draft annotation comprises 22,154 predicted gene models, which we further annotated through matches to the NCBI nucleotide database, GO categorization, and motif prediction. To infer the adaptive abilities of this species, as well as its population history, we performed a Bayesian skyline analysis, and produced results that are concordant with the demographic history of this species, including expansion and fragmentation events. Overall, this genome assembly, together with the draft annotation, represents a crucial addition to the limited genetic markers currently available for the study of porpoises and Phocoenidae conservation, phylogeny, and evolution.


2021 ◽  
Author(s):  
Daniel J Balick ◽  
Daniel M Jordan ◽  
Shamil Sunyaev ◽  
Ron Do

The identification of genes that evolve under recessive natural selection is a longstanding goal of population genetics research with important applications to disease gene discovery. We found that commonly used methods to evaluate selective constraint at the gene level are highly sensitive to genes under heterozygous selection but ubiquitously fail to detect recessively evolving genes. Additionally, more sophisticated likelihood-based methods designed to detect recessivity similarly lack power for a human gene of realistic length from current population sample sizes. However, extensive simulations suggested that recessive genes may be detectable in aggregate. Here, we offer a method informed by population genetics simulations designed to detect recessive purifying selection in gene sets. Applying this to empirical gene sets produced significant enrichments for strong recessive selection in genes previously inferred to be under recessive selection in a consanguineous cohort and in genes involved in autosomal recessive monogenic disorders.


2021 ◽  
Vol 18 (1) ◽  
Author(s):  
César Augusto Diniz Xavier ◽  
Margaret Louise Allen ◽  
Anna Elizabeth Whitfield

Abstract Background Advances in sequencing and analysis tools have facilitated discovery of many new viruses from invertebrates, including ants. Solenopsis invicta is an invasive ant that has quickly spread worldwide causing significant ecological and economic impacts. Its virome has begun to be characterized pertaining to potential use of viruses as natural enemies. Although the S. invicta virome is the best characterized among ants, most studies have been performed in its native range, with less information from invaded areas. Methods Using a metatranscriptome approach, we further identified and molecularly characterized virus sequences associated with S. invicta in two introduced areas, the U.S. and Taiwan. The data set used here was obtained from different stages (larvae, pupae, and adults) of the S. invicta life cycle. Publicly available RNA sequences from GenBank’s Sequence Read Archive were downloaded and de novo assembled using CLC Genomics Workbench 20.0.1. Contigs were compared against the non-redundant protein sequences and those showing similarity to viral sequences were further analyzed. Results We characterized five putative new viruses associated with S. invicta transcriptomes. Sequence comparisons revealed extensive divergence across ORFs and genomic regions, with most of them sharing less than 40% amino acid identity with the closest homologous sequences previously characterized. The first negative-sense single-stranded RNA virus genomic sequences included in the orders Bunyavirales and Mononegavirales are reported. In addition, two positive single-strand virus genome sequences and one single-strand DNA virus genome sequence were also identified. While the presence of a putative tenuivirus associated with S. invicta was previously suggested to be a contamination, here we characterized and present strong evidence that Solenopsis invicta virus 14 (SINV-14) is a tenui-like virus that has a long-term association with the ant. Furthermore, based on virus sequence abundance compared to housekeeping genes, phylogenetic relationships, and completeness of viral coding sequences, our results suggest that four of five virus sequences reported, those being SINV-14, SINV-15, SINV-16 and SINV-17, may be associated with viruses actively replicating in the ant S. invicta. Conclusions The present study expands our knowledge about viral diversity associated with S. invicta in introduced areas with potential to be used as biological control agents, which will require further biological characterization.


Genetics ◽  
2000 ◽  
Vol 155 (3) ◽  
pp. 1429-1437
Author(s):  
Oliver G Pybus ◽  
Andrew Rambaut ◽  
Paul H Harvey

Abstract We describe a unified set of methods for the inference of demographic history using genealogies reconstructed from gene sequence data. We introduce the skyline plot, a graphical, nonparametric estimate of demographic history. We discuss both maximum-likelihood parameter estimation and demographic hypothesis testing. Simulations are carried out to investigate the statistical properties of maximum-likelihood estimates of demographic parameters. The simulations reveal that (i) the performance of exponential growth model estimates is determined by a simple function of the true parameter values and (ii) under some conditions, estimates from reconstructed trees perform as well as estimates from perfect trees. We apply our methods to HIV-1 sequence data and find strong evidence that subtypes A and B have different demographic histories. We also provide the first (albeit tentative) genetic evidence for a recent decrease in the growth rate of subtype B.
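
The skyline plot mentioned above can be illustrated with a small sketch. In its classic form, each interval of the genealogy during which k lineages coexist gives a population size estimate equal to the interval duration times k(k-1)/2. The function name, the input layout (ascending coalescent times for tips all sampled at time 0), and the example values are assumptions made for this illustration. The estimates come out in the same time units as the tree, so converting them to numbers of individuals needs extra information such as a substitution rate or generation time.

```python
def classic_skyline(coalescent_times):
    """Piecewise-constant size estimates from a genealogy.

    coalescent_times: ascending times (measured backward from the present)
    of the n - 1 coalescent events for n tips sampled at time 0.
    Returns a list of (interval_start, interval_end, estimate) tuples,
    one per coalescent interval, ordered from the present backward.
    """
    lineages = len(coalescent_times) + 1
    previous = 0.0
    steps = []
    for t in coalescent_times:
        if t < previous:
            raise ValueError("coalescent times must be ascending")
        duration = t - previous
        # k lineages give k * (k - 1) / 2 possible coalescing pairs.
        pairs = lineages * (lineages - 1) / 2
        steps.append((previous, t, duration * pairs))
        previous = t
        lineages -= 1
    return steps


if __name__ == "__main__":
    # Hypothetical four-tip genealogy, times in arbitrary units.
    for start, end, estimate in classic_skyline([0.1, 0.3, 1.3]):
        print(f"{start:.2f} to {end:.2f}: {estimate:.3f}")
```

Because every interval is estimated from a single waiting time, the raw steps are noisy. That is why the plot is usually read as a nonparametric guide to which demographic model to fit rather than as a final estimate.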


2020 ◽  
Vol 72 (3) ◽  
pp. 731-747
Author(s):  
Russell Thomson ◽  
Prema-Chandra Athukorala

Abstract Do production capabilities of countries evolve from existing capabilities or emerge de novo? The Product Space approach developed by Hidalgo, Klinger, Barabási and Hausmann postulates that a country’s existing industrial structure largely determines its opportunities for industrial upgrading. However, this is difficult to reconcile with the export dynamism of many developing countries such as Thailand, Malaysia, Costa Rica and Vietnam that transformed from primary commodity dependence to exporters of dynamic manufactured products. In each of these cases, global production sharing facilitated industrial transition. In this article, we advance the Product Space approach to accommodate the role of global production sharing. Using a newly constructed multi-country data set of manufacturing exports that distinguishes between trade within global production networks and traditional horizontal trade, we find that existing industrial structure has a smaller impact, but trade openness has a greater impact, on industrial upgrading within vertically integrated global industries.


2020 ◽  
Vol 12 (6) ◽  
pp. 905-910 ◽  
Author(s):  
Ruoyu Liu ◽  
Kun Wang ◽  
Jun Liu ◽  
Wenjie Xu ◽  
Yang Zhou ◽  
...  

Abstract Cold seeps, characterized by methane, hydrogen sulfide, and other hydrocarbon chemicals, foster one of the most widespread chemosynthetic ecosystems in the deep sea, densely populated by specialized benthos. However, scarce genomic resources severely limit our knowledge about the origin and adaptation of life in this unique ecosystem. Here, we present a genome of a deep-sea limpet, Bathyacmaea lactea, a common species associated with the dominant mussel beds in cold seeps. We generated 54.6 gigabases (Gb) of Nanopore reads and 77.9 Gb of BGI-seq raw reads. Assembly produced a 754.3-Mb genome for B. lactea, with 3,720 contigs and a contig N50 of 1.57 Mb, covering 94.3% of metazoan Benchmarking Universal Single-Copy Orthologs. In total, 23,574 protein-coding genes and 463.4 Mb of repetitive elements were identified. We analyzed the phylogenetic position, substitution rate, demographic history, and TE activity of B. lactea. We also identified 80 expanded gene families and 87 rapidly evolving Gene Ontology categories in the B. lactea genome. Many of these genes were associated with heterocyclic compound metabolism, membrane-bounded organelle, metal ion binding, and nitrogen and phosphorus metabolism. The high-quality assembly and in-depth characterization suggest the B. lactea genome will serve as an essential resource for understanding the origin and adaptation of life in the cold seeps.
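
For readers unfamiliar with the contig N50 quoted above, the following sketch shows the usual definition: the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the assembly. The function name and the toy contig lengths are assumptions for illustration and do not come from the B. lactea assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that brings the running sum to half the total.
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # hypothetical lengths
    print(n50(toy_contigs))
```

The number of contigs counted before the loop stops is the related L50 statistic, which matches a statement such as "50% of the total length found in just 34 scaffolds".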


2010 ◽  
Vol 60 (4) ◽  
pp. 449-465
Author(s):  
Wen Longying ◽  
Zhang Lixun ◽  
An Bei ◽  
Luo Huaxing ◽  
Liu Naifa ◽  
...  

Abstract We have used phylogeographic methods to investigate the genetic structure and population history of the endangered Himalayan snowcock (Tetraogallus himalayensis) in northwestern China. The mitochondrial cytochrome b gene was sequenced in 102 individuals sampled throughout the distribution range. In total, we found 26 different haplotypes defined by 28 polymorphic sites. Phylogenetic analyses indicated that the samples were divided into two major haplogroups corresponding to one western and one eastern clade. The divergence time between these major clades was estimated to be approximately one million years. An analysis of molecular variance showed that 40% of the total genetic variability was found within local populations, 12% among populations within regional groups and 48% among groups. An analysis of the demographic history of the populations suggested that major expansions have occurred in the Himalayan snowcock populations and these correlate mainly with the first and the second largest glaciations during the Pleistocene. In addition, the data indicate that there was a population expansion of the Tianshan population during the uplift of the Qinghai-Tibet Plateau, approximately 2 million years ago.

