Average genome size estimation enables accurate quantification of gene family abundance and sheds light on the functional ecology of the human microbiome

2014 ◽  
Author(s):  
Stephen Nayfach ◽  
Katherine S Pollard

Average genome size (AGS) is an important, yet often overlooked property of microbial communities. We developed MicrobeCensus to rapidly and accurately estimate AGS from short-read metagenomics data and applied our tool to over 1,300 human microbiome samples. We found that AGS differs significantly within and between body sites and tracks with major functional and taxonomic differences. For example, in the gut, AGS ranges from 2.5 to 5.8 megabases and is positively correlated with the abundance of Bacteroides and polysaccharide metabolism. Furthermore, we found that AGS variation can bias comparative analyses, and that normalization improves detection of differentially abundant genes.
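One way to picture why AGS matters for comparative analyses is to express gene abundance per genome equivalent rather than per read. The sketch below is illustrative only: the function names, the reads-per-kilobase-per-genome-equivalent form, and the example numbers are assumptions made for this example, not the MicrobeCensus interface.

```python
def genome_equivalents(total_bases: float, ags_bp: float) -> float:
    """Approximate number of genomes sequenced: total bases / average genome size."""
    if ags_bp <= 0:
        raise ValueError("average genome size must be positive")
    return total_bases / ags_bp


def rpkg(mapped_reads: float, gene_length_bp: float,
         total_bases: float, ags_bp: float) -> float:
    """Reads per kilobase of gene per genome equivalent (hypothetical helper)."""
    if gene_length_bp <= 0:
        raise ValueError("gene length must be positive")
    per_kb = mapped_reads / (gene_length_bp / 1000.0)
    return per_kb / genome_equivalents(total_bases, ags_bp)


# Two made-up samples with identical read counts and sequencing depth,
# differing only in average genome size (2.5 Mb versus 5.0 Mb).
small_ags = rpkg(mapped_reads=100, gene_length_bp=1000, total_bases=5e9, ags_bp=2.5e6)
large_ags = rpkg(mapped_reads=100, gene_length_bp=1000, total_bases=5e9, ags_bp=5.0e6)
print(small_ags, large_ags)
```

With equal read counts, the sample with the smaller AGS contains twice as many genome equivalents, so its per-genome abundance is half that of the other sample. A per-read comparison would miss this difference.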

2019 ◽  
Author(s):  
Boas Pucker

While the size of chromosomes can be measured under a microscope, the size of genomes cannot be measured precisely. Biochemical methods and k-mer distribution-based approaches allow only estimations. An alternative approach to predict the genome size based on high contiguity assemblies and short read mappings is presented here and optimized on Arabidopsis thaliana and Beta vulgaris. Brachypodium distachyon, Solanum lycopersicum, Vitis vinifera, and Zea mays were also analyzed to demonstrate the broad applicability of this approach. Mapping-based Genome Size Estimation (MGSE) and additional scripts are available on GitHub: https://github.com/bpucker/MGSE.
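The general idea behind a mapping-based estimate can be sketched as dividing the total read coverage over an assembly by the coverage expected for single-copy sequence. The code below is a toy illustration under that assumption, not the MGSE implementation. The function name, the use of the median, and the input layout are all hypothetical.

```python
from statistics import median
from typing import Sequence


def estimate_genome_size(coverage: Sequence[float],
                         single_copy_positions: Sequence[int]) -> float:
    """Estimate genome size as total coverage / single-copy reference coverage.

    coverage: per-position read depth along the assembly.
    single_copy_positions: indices assumed to lie in single-copy regions.
    """
    reference = median([coverage[i] for i in single_copy_positions])
    if reference <= 0:
        raise ValueError("reference coverage must be positive")
    return sum(coverage) / reference


# Toy assembly of 10 positions: the last two carry double coverage, as would
# happen if a two-copy repeat had been collapsed into one copy during assembly.
depth = [10] * 8 + [20] * 2
estimate = estimate_genome_size(depth, single_copy_positions=range(8))
print(estimate)
```

In this toy case the estimate exceeds the assembly length, because the excess coverage over the collapsed region is counted as additional sequence.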


2020 ◽  
Author(s):  
Johan Nyström-Persson ◽  
Gabriel Keeble-Gagnère ◽  
Niamat Zawad

The processing of k-mers (subsequences of length k) is at the foundation of many sequence processing algorithms in bioinformatics, including k-mer counting for genome size estimation, genome assembly, and taxonomic classification for metagenomics. Minimizers (ordered m-mers where m < k) are often used to group k-mers into bins as a first step in such processing. However, minimizers are known to generate bins of very different sizes, which can pose challenges for distributed and parallel processing, as well as generally increase memory requirements. Furthermore, although various minimizer orderings have been proposed, their practical value for improving tool efficiency has not yet been fully explored. Here we present Discount, a distributed k-mer counting tool based on Apache Spark, which we use to investigate the behaviour of various minimizer orderings in practice when applied to metagenomics data. Using this tool, we then introduce the universal frequency ordering, a new combination of frequency counted minimizers and universal k-mer hitting sets, which yields both evenly distributed binning and small bin sizes. We show that this ordering allows Discount to perform distributed k-mer counting on a large dataset in as little as 1/8 of the memory of comparable approaches, making it the most efficient out-of-core distributed k-mer counting method available.
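Minimizer-based binning can be illustrated with a few lines of code. The sketch below uses plain lexicographic ordering by default and is not taken from Discount. The function names and the optional `order` parameter are assumptions made for this example, and the universal frequency ordering from the abstract is not implemented here.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional


def minimizer(kmer: str, m: int,
              order: Optional[Callable[[str], object]] = None) -> str:
    """Return the smallest m-mer of a k-mer under the given ordering."""
    if not 0 < m <= len(kmer):
        raise ValueError("m must satisfy 0 < m <= k")
    rank = order if order is not None else (lambda s: s)  # lexicographic default
    mmers = [kmer[i:i + m] for i in range(len(kmer) - m + 1)]
    return min(mmers, key=rank)


def bin_kmers(seq: str, k: int, m: int,
              order: Optional[Callable[[str], object]] = None) -> Dict[str, List[str]]:
    """Group every k-mer of seq into a bin keyed by its minimizer."""
    bins: Dict[str, List[str]] = defaultdict(list)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        bins[minimizer(kmer, m, order)].append(kmer)
    return dict(bins)


bins = bin_kmers("ACGTACGT", k=5, m=3)
for key, members in sorted(bins.items()):
    print(key, members)
```

Even in this tiny input, three of the four k-mers fall into the bin for `ACG` and only one into the bin for `CGT`, which hints at the uneven bin sizes the abstract describes. Passing a different `order` function changes which m-mer wins and therefore how the k-mers are distributed.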


Genes ◽  
2021 ◽  
Vol 12 (4) ◽  
pp. 563
Author(s):  
Monika Rewers ◽  
Iwona Jedrzejczyk ◽  
Agnieszka Rewicz ◽  
Anna Jakubska-Busse

Orchidaceae is one of the largest and most widespread plant families, with many species threatened with extinction. However, the genome sizes of only about 1.5% of orchids are known so far. The aim of this study was to estimate the genome size of 15 species and one infraspecific taxon of endangered and protected orchids growing wild in Poland, to assess their variability, and to develop an additional criterion useful in orchid species identification and characterization. Flow cytometric genome size estimation revealed that the investigated orchid species possessed intermediate, large, and very large genomes. Liparis loeselii possessed the smallest 2C DNA content (14.15 pg) and Cypripedium calceolus the largest (82.10 pg). It was confirmed that genome size is characteristic of the subfamily. Additionally, for four species (Epipactis albensis, Ophrys insectifera, Orchis mascula, and Orchis militaris) and one infraspecific taxon (Epipactis purpurata f. chlorophylla), the 2C DNA content was estimated for the first time. Genome size estimation by flow cytometry proved to be a useful auxiliary method for quick orchid species identification and characterization.


Data ◽  
2021 ◽  
Vol 6 (5) ◽  
pp. 44
Author(s):  
Jae-Hyun Lim ◽  
Il-Nam Kim

Marine bacteria are known to play significant roles in marine biogeochemical cycles through the decomposition of organic matter. Despite the increasing attention paid to the study of marine bacteria, research has been too limited to fully elucidate the complex interaction between marine bacterial communities and environmental variables. Jinhae Bay, the study area in this work, is the most anthropogenically eutrophied coastal bay in South Korea, and while its physical and biogeochemical characteristics are well described, less is known about the associated changes in microbial communities. In the present study, we reconstructed a metagenomics dataset based on the 16S rRNA gene to investigate temporal and vertical changes in microbial communities at three depths (surface, middle, and bottom) during a seven-month period from June to December 2016 at one sampling site (J1) in Jinhae Bay. Across all the bacterial data, Proteobacteria, Bacteroidetes, and Cyanobacteria were predominant from June to November, whereas Firmicutes were predominant in December, especially at the middle and bottom depths. These results show that the composition of the microbial community is strongly associated with temporal changes. Furthermore, the community compositions were markedly different between the surface, middle, and bottom depths in summer, when water column stratification and bottom water hypoxia (low dissolved oxygen level) were strongly developed. Metagenomics data contribute to improving our understanding of important relationships between environmental characteristics and microbial community change in eutrophication-induced and deoxygenated coastal areas.


2014 ◽  
Vol 92 (10) ◽  
pp. 847-851 ◽  
Author(s):  
Kelly L. Mulligan ◽  
Terra C. Hiebert ◽  
Nicholas W. Jeffery ◽  
T. Ryan Gregory

Ribbon worms (phylum Nemertea) are among several animal groups that have been overlooked in past studies of genome-size diversity. Here, we report genome-size estimates for eight species of nemerteans, including representatives of the major lineages in the phylum. Genome sizes in these species ranged more than fivefold, and there was some indication of a positive relationship with body size. Somatic endopolyploidy also appears to be common in these animals. Importantly, this study demonstrates that both of the most common methods of genome-size estimation (flow cytometry and Feulgen image analysis densitometry) can be used to assess genome size in ribbon worms, thereby facilitating additional efforts to investigate patterns of variability in nuclear DNA content in this phylum.


2021 ◽  
Author(s):  
Jani Angel J. Raymond ◽  
Mudagandur Shashi Shekhar ◽  
Vinaya Kumar Katneni ◽  
Ashok Kumar Jangham ◽  
Sudheesh Kommu Prabhudas ◽  
...  

2012 ◽  
Vol 78 (15) ◽  
pp. 5288-5296 ◽  
Author(s):  
Yu-Wei Wu ◽  
Mina Rho ◽  
Thomas G. Doak ◽  
Yuzhen Ye

The NIH Human Microbiome Project (HMP) has produced several hundred metagenomic data sets, allowing studies of the many functional elements in human-associated microbial communities. Here, we survey the distribution of oral spirochetes implicated in dental diseases in normal human individuals, using recombination sites associated with the chromosomal integron in Treponema genomes, taking advantage of the multiple copies of the integron recombination sites (repeats) in the genomes, and using a targeted assembly approach that we have developed. We find that integron-containing Treponema species are present in ∼80% of the normal human subjects included in the HMP. Further, we are able to de novo assemble the integron gene cassettes using our constrained assembly approach, which employs a unique application of the de Bruijn graph assembly information; most of these cassette genes were not assembled in whole-metagenome assemblies and could not be identified by mapping sequencing reads onto the known reference Treponema genomes due to the dynamic nature of integron gene cassettes. Our study significantly enriches the gene pool known to be carried by Treponema chromosomal integrons, totaling 826 (598 97% nonredundant) genes. We characterize the functions of these gene cassettes: many of these genes have unknown functions. The integron gene cassette arrays found in the human microbiome are extraordinarily dynamic, with different microbial communities sharing only a small number of common genes.


2018 ◽  
Vol 7 (4) ◽  
pp. 38 ◽  
Author(s):  
Valeria D’Argenio

The last few years have seen increasing interest in the study of the human microbiome and its correlations with health status. Indeed, technological advances have allowed the study of microbial communities to reach a previously unthinkable sensitivity, revealing the presence of microbes even in environments usually considered sterile. In this scenario, microbial communities have been described in the amniotic fluid, the umbilical cord blood, and the placenta, contradicting a dogma of reproductive medicine that considers the womb sterile. This prenatal microbiome may play a role not only in fetal development but also in the predisposition to diseases that may develop later in life, including in adulthood. Thus, the aim of this review is to report the current knowledge regarding the prenatal microbiome composition, its association with pathological processes, and the future perspectives regarding its manipulation for the promotion and maintenance of health.

