Soil bacterial populations are shaped by recombination and gene-specific selection across a meadow

2019
Author(s): Alexander Crits-Christoph, Matthew Olm, Spencer Diamond, Keith Bouma-Gregson, Jillian Banfield

Abstract: Soil microbial diversity is often studied from the perspective of community composition, but less is known about genetic heterogeneity within species and how population structures are affected by dispersal, recombination, and selection. Genomic inferences about population structure can be made using the millions of sequencing reads that are assembled de novo into consensus genomes from metagenomes, as each read pair describes a short genomic sequence from a cell in the population. Here we track genome-wide population genetic variation for 19 highly abundant bacterial species sampled from across a grassland meadow. Genomic nucleotide identity of assembled genomes was significantly associated with local geography for half of the populations studied, and for a majority of populations, within-sample nucleotide diversity could often be as high as meadow-wide nucleotide diversity. Genes involved in specialized metabolite biosynthesis and extracellular transport were characterized by elevated genetic diversity in multiple species. Microbial populations displayed varying degrees of homologous recombination, and recombinant variants were often detected at 7-36% of loci genome-wide. Within multiple populations we identified genes with unusually high site-specific differentiation of alleles, fewer recombinant events, and lower nucleotide diversity, suggesting recent selective sweeps for gene variants. Taken together, these results indicate that recombination and gene-specific selection commonly shape local soil bacterial genetic variation.

2020
Author(s): Lei Li, Yanjie Chao

Abstract: Small proteins shorter than 50 amino acids have long been overlooked. A number of small proteins have been identified in several model bacteria using experimental approaches and assigned important functions in diverse cellular processes. The recent development of ribosome profiling technologies has allowed genome-wide identification of small proteins and small ORFs (smORFs), but our incomplete understanding of small proteins hinders de novo computational prediction of smORFs in non-model bacterial species. Here, we have identified several sequence features for smORFs by a systematic analysis of all the known small proteins in E. coli, among which the translation initiation rate is the strongest determinant. By integrating these features into a support vector machine learning model, we have developed a novel sPepFinder algorithm that can predict conserved smORFs in bacterial genomes with a high accuracy of 92.8%. De novo prediction in E. coli has revealed several novel smORFs with evidence of translation supported by ribosome profiling. Further application of sPepFinder in 549 bacterial species has led to the identification of >100,000 novel smORFs, many of which are conserved at the amino acid and nucleotide levels under purifying selection. Overall, we have established sPepFinder as a valuable tool to identify novel smORFs in both model and non-model bacterial organisms, and provided a large resource of small proteins for functional characterization.


2017
Author(s): Erik Lavington, Andrew D. Kern

Abstract: Chromosomal inversions are a ubiquitous feature of genetic variation. Theoretical models describe several mechanisms by which inversions can drive adaptation and be maintained as polymorphisms. While inversions have been shown previously to be under selection, or to contain genetic variation under selection, the specific phenotypic consequences of inversions leading to their maintenance remain unclear. Here we use genomic sequence and expression data from the Drosophila Genetic Reference Panel to explore the effects of two cosmopolitan inversions, In(2L)t and In(3R)Mo, on patterns of transcriptional variation. We demonstrate that each inversion has a significant effect on transcript abundance for hundreds of genes across the genome. Inversion-affected loci (IAL) appear both within inversions and on unlinked chromosomes. Importantly, IAL do not appear to be influenced by the previously reported genome-wide expression correlation structure. We found that five genes involved with sterol uptake, four of which are Niemann-Pick Type 2 orthologs, are upregulated in flies with In(3R)Mo but do not have SNPs in LD with the inversion. We speculate that this upregulation is driven by genetic variation in mod(mdg4) that is in LD with In(3R)Mo. We find little evidence for a regional or position effect of inversions on gene expression at the chromosomal level, but do find evidence for the distal breakpoint of In(3R)Mo interrupting one gene and possibly disassociating the two flanking genes from regulatory elements.


2019
Author(s): Michael D. Kessler, Douglas P. Loesch, James A. Perry, Nancy L. Heard-Costa, Brian E. Cade, ...

Abstract: De novo mutations (DNMs), or mutations that appear in an individual despite not being seen in their parents, are an important source of genetic variation whose impact is relevant to studies of human evolution, genetics, and disease. Utilizing high-coverage whole genome sequencing data as part of the Trans-Omics for Precision Medicine (TOPMed) program, we directly estimate and analyze DNM counts, rates, and spectra from 1,465 trios across an array of diverse human populations. Using the resulting call set of 86,865 single nucleotide DNMs, we find a significant positive correlation between local recombination rate and local DNM rate, which together can explain up to 35.5% of the genome-wide variation in population level rare genetic variation from 41K unrelated TOPMed samples. While genome-wide heterozygosity does correlate weakly with DNM count, we do not find significant differences in DNM rate between individuals of European, African, and Latino ancestry, nor across ancestrally distinct segments within admixed individuals. However, interestingly, we do find significantly fewer DNMs in Amish individuals compared with other Europeans, even after accounting for parental age and sequencing center. Specifically, we find significant reductions in the number of T→C mutations in the Amish, which seem to underpin their overall reduction in DNMs. Finally, we calculate near-zero estimates of narrow sense heritability (h²), which suggest that variation in DNM rate is significantly shaped by non-additive genetic effects and/or the environment, and that a less mutagenic environment may be responsible for the reduced DNM rate in the Amish.

Significance: Here we provide one of the largest and most diverse human de novo mutation (DNM) call sets to date, and use it to quantify the genome-wide relationship between local mutation rate and population-level rare genetic variation. While we demonstrate that the human single nucleotide mutation rate is similar across numerous human ancestries and populations, we also discover a reduced mutation rate in the Amish founder population, which shows that mutation rates can shift rapidly. Finally, we find that variation in mutation rates is not heritable, which suggests that the environment may influence mutation rates more significantly than previously realized.


2019
Author(s): Inken Wohlers, Axel Künstner, Matthias Munz, Michael Olbrich, Anke Fähnrich, ...

Abstract: The human genome is composed of chromosomal DNA sequences consisting of bases A, C, G and T – the blueprint to implement the molecular functions that are the basis of every individual’s life. Deciphering the first human genome was a consortium effort that took more than a decade and considerable cost. With the latest technological advances, determining an individual’s entire personal genome with manageable cost and effort has come within reach. Although the benefits of the all-encompassing genetic information that entire genomes provide are manifold, only a small number of de novo assembled human genomes have been reported to date [1–3], and few have been complemented with population-based genetic variation [4], which is particularly important for North Africans who are not represented in current genome-wide data sets [5–7]. Here, we combine long- and short-read whole-genome next-generation sequencing data with recent assembly approaches into the first de novo assembly of the genome of an Egyptian individual. The resulting assembly demonstrates well-balanced quality metrics and is complemented with high-quality variant phasing via linked reads into haploblocks, which we can associate with gene expression changes in blood. To construct an Egyptian genome reference, we further assayed genome-wide genetic variation occurring in the Egyptian population within a representative cohort of 110 Egyptian individuals. We show that differences in allele frequencies and linkage disequilibrium between Egyptians and Europeans may compromise the transferability of European ancestry-based genetic disease risk and polygenic scores, substantiating the need for multi-ethnic genetic studies and corresponding genome references. The Egyptian genome reference represents a comprehensive population data set based on a high-quality personal genome. It is a proof of concept to be considered by the many national and international genome initiatives underway. More importantly, we anticipate that the Egyptian genome reference will be a valuable resource for precision medicine targeting the Egyptian population and beyond.


2021, Vol 12
Author(s): Junke Wang, Alyssa I. Clay-Gilmour, Ezgi Karaesmen, Abbas Rizvi, Qianqian Zhu, ...

The role of common genetic variation in susceptibility to acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS), a group of rare clonal hematologic disorders characterized by dysplastic hematopoiesis and high mortality, remains unclear. We performed AML and MDS genome-wide association studies (GWAS) in the DISCOVeRY-BMT cohorts (2,309 cases and 2,814 controls). Association analysis based on subsets (ASSET) was used to conduct a summary statistics SNP-based analysis of MDS and AML subtypes. For each AML and MDS case and control we used PrediXcan to estimate the component of gene expression determined by their genetic profile and correlate this imputed gene expression level with risk of developing disease in a transcriptome-wide association study (TWAS). ASSET identified an increased risk for de novo AML and MDS (OR = 1.38, 95% CI, 1.26-1.51, Pmeta = 2.8 × 10⁻¹²) in patients carrying the T allele at rs12203592 in Interferon Regulatory Factor 4 (IRF4), a transcription factor which regulates myeloid and lymphoid hematopoietic differentiation. Our TWAS analyses showed increased IRF4 gene expression is associated with increased risk of de novo AML and MDS (OR = 3.90, 95% CI, 2.36-6.44, Pmeta = 1.0 × 10⁻⁷). The identification of IRF4 by both GWAS and TWAS contributes valuable insight on the role of genetic variation in AML and MDS susceptibility.


2019
Author(s): Junke Wang, Alyssa I. Clay-Gilmour, Ezgi Karaesmen, Abbas Rizvi, Qianqian Zhu, ...

Abstract: The role of common genetic variation in susceptibility to acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS), a group of rare clonal hematologic disorders characterized by dysplastic hematopoiesis and high mortality, remains unclear. We performed AML and MDS genome-wide association studies (GWAS) in the DISCOVeRY-BMT cohorts (2309 cases and 2814 controls). Association analysis based on subsets (ASSET) was used to conduct a summary statistics SNP-based analysis of MDS and AML subtypes. For each AML and MDS case and control we used PrediXcan to estimate the component of gene expression determined by their genetic profile and correlate this imputed gene expression level with risk of developing disease in a transcriptome-wide association study (TWAS). ASSET identified an increased risk for de novo AML and MDS (OR = 1.38, 95% CI, 1.26-1.51, Pmeta = 2.8 × 10⁻¹²) in patients carrying the T allele at rs12203592 in Interferon Regulatory Factor 4 (IRF4), a transcription factor which regulates myeloid and lymphoid hematopoietic differentiation. Our TWAS analyses showed increased IRF4 gene expression is associated with increased risk of de novo AML and MDS (OR = 3.90, 95% CI, 2.36-6.44, Pmeta = 1.0 × 10⁻⁷). The identification of IRF4 by both GWAS and TWAS contributes valuable insight on the role of genetic variation in AML and MDS susceptibility.


2020, Vol 11 (1)
Author(s): Inken Wohlers, Axel Künstner, Matthias Munz, Michael Olbrich, Anke Fähnrich, ...

Abstract: A small number of de novo assembled human genomes have been reported to date, and few have been complemented with population-based genetic variation, which is particularly important for North Africa, a region underrepresented in current genome-wide references. Here, we combine long- and short-read whole-genome sequencing data with recent assembly approaches into a de novo assembly of an Egyptian genome. The assembly demonstrates well-balanced quality metrics and is complemented with variant phasing via linked reads into haploblocks, which we associate with gene expression changes in blood. To construct an Egyptian genome reference, we identify genome-wide genetic variation within a cohort of 110 Egyptian individuals. We show that differences in allele frequencies and linkage disequilibrium between Egyptians and Europeans may compromise the transferability of European ancestry-based genetic disease risk and polygenic scores, substantiating the need for multi-ethnic genome references. Thus, the Egyptian genome reference will be a valuable resource for precision medicine.


BMC Genomics, 2015, Vol 16 (Suppl 7), pp. S13
Author(s): Yizhe Zhang, Yupeng He, Guangyong Zheng, Chaochun Wei

2020, Vol 117 (5), pp. 2560-2569
Author(s): Michael D. Kessler, Douglas P. Loesch, James A. Perry, Nancy L. Heard-Costa, Daniel Taliun, ...

De novo mutations (DNMs), or mutations that appear in an individual despite not being seen in their parents, are an important source of genetic variation whose impact is relevant to studies of human evolution, genetics, and disease. Utilizing high-coverage whole-genome sequencing data as part of the Trans-Omics for Precision Medicine (TOPMed) Program, we called 93,325 single-nucleotide DNMs across 1,465 trios from an array of diverse human populations, and used them to directly estimate and analyze DNM counts, rates, and spectra. We find a significant positive correlation between local recombination rate and local DNM rate, and that DNM rate explains a substantial portion (8.98 to 34.92%, depending on the model) of the genome-wide variation in population-level genetic variation from 41K unrelated TOPMed samples. Genome-wide heterozygosity does correlate with DNM rate, but only explains <1% of variation. While we are underpowered to see small differences, we do not find significant differences in DNM rate between individuals of European, African, and Latino ancestry, nor across ancestrally distinct segments within admixed individuals. However, we did find significantly fewer DNMs in Amish individuals, even when compared with other Europeans, and even after accounting for parental age and sequencing center. Specifically, we found significant reductions in the number of C→A and T→C mutations in the Amish, which seem to underpin their overall reduction in DNMs. Finally, we calculated near-zero estimates of narrow sense heritability (h²), which suggest that variation in DNM rate is significantly shaped by nonadditive genetic effects and the environment.


2021, Vol 288 (1944), pp. 20203094
Author(s): David Berger, Josefine Stångberg, Julian Baur, Richard J. Walters

Adaptation in new environments depends on the amount of genetic variation available for evolution, and the efficacy by which natural selection discriminates among this variation. However, whether some ecological factors reveal more genetic variation, or impose stronger selection pressures than others, is typically not known. Here, we apply enzyme kinetic theory to show that rising global temperatures are predicted to intensify natural selection throughout the genome by increasing the effects of DNA sequence variation on protein stability. We test this prediction by (i) estimating temperature-dependent fitness effects of induced mutations in seed beetles adapted to ancestral or elevated temperature, and (ii) calculating 100 paired selection estimates on mutations in benign versus stressful environments from unicellular and multicellular organisms. Environmental stress per se did not increase mean selection on de novo mutation, suggesting that the cost of adaptation does not generally increase in new ecological settings to which the organism is maladapted. However, elevated temperature increased the mean strength of selection on genome-wide polymorphism, signified by increases in both mutation load and mutational variance in fitness. These results have important implications for genetic diversity gradients and the rate and repeatability of evolution under climate change.

