Average genome size estimation enables accurate quantification of gene family abundance and sheds light on the functional ecology of the human microbiome

Mapping Intimacies ◽

10.1101/009001 ◽

2014 ◽

Cited By ~ 1

Author(s):

Stephen Nayfach ◽

Katherine S Pollard

Keyword(s):

Gene Family ◽

Microbial Communities ◽

Genome Size ◽

Human Microbiome ◽

Functional Ecology ◽

Size Estimation ◽

Short Read ◽

Comparative Analyses ◽

Metagenomics Data ◽

Accurate Quantification

Average genome size (AGS) is an important, yet often overlooked property of microbial communities. We developed MicrobeCensus to rapidly and accurately estimate AGS from short-read metagenomics data and applied our tool to over 1,300 human microbiome samples. We found that AGS differs significantly within and between body sites and tracks with major functional and taxonomic differences. For example, in the gut, AGS ranges from 2.5 to 5.8 megabases and is positively correlated with the abundance of Bacteroides and polysaccharide metabolism. Furthermore, we found that AGS variation can bias comparative analyses, and that normalization improves detection of differentially abundant genes.
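To illustrate why AGS variation can bias comparisons, the sketch below normalizes a gene family's read count by the number of genome equivalents sequenced. It is a minimal illustration, not the MicrobeCensus implementation: the function names, the per-kilobase scaling, and the example numbers are assumptions chosen for clarity.

```python
def genome_equivalents(total_bases: float, ags_bp: float) -> float:
    """Number of average-sized genomes represented in a sequencing library."""
    if ags_bp <= 0:
        raise ValueError("average genome size must be positive")
    return total_bases / ags_bp


def rpkg(mapped_reads: int, gene_length_bp: int,
         total_bases: float, ags_bp: float) -> float:
    """Reads per kilobase of gene per genome equivalent (hypothetical helper)."""
    if gene_length_bp <= 0:
        raise ValueError("gene length must be positive")
    reads_per_kb = mapped_reads / (gene_length_bp / 1000)
    return reads_per_kb / genome_equivalents(total_bases, ags_bp)


# Two hypothetical samples with identical raw counts but different AGS.
small_ags = rpkg(mapped_reads=100, gene_length_bp=1000,
                 total_bases=1e9, ags_bp=2.5e6)
large_ags = rpkg(mapped_reads=100, gene_length_bp=1000,
                 total_bases=1e9, ags_bp=5.0e6)
print(small_ags, large_ags)  # 0.25 0.5
```

With the same raw read count, the sample with the larger AGS contains fewer genomes per sequenced base, so the gene is twice as abundant per genome in that sample.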


Average genome size estimation improves comparative metagenomics and sheds light on the functional ecology of the human microbiome

Genome Biology ◽

10.1186/s13059-015-0611-7 ◽

2015 ◽

Vol 16 (1) ◽

Cited By ~ 98

Author(s):

Stephen Nayfach ◽

Katherine S Pollard

Keyword(s):

Genome Size ◽

Human Microbiome ◽

Functional Ecology ◽

Size Estimation ◽

Comparative Metagenomics


Mapping-based genome size estimation

10.1101/607390 ◽

2019 ◽

Cited By ~ 2

Author(s):

Boas Pucker

Keyword(s):

Arabidopsis Thaliana ◽

Zea Mays ◽

Vitis Vinifera ◽

Genome Size ◽

Solanum Lycopersicum ◽

Beta Vulgaris ◽

Brachypodium Distachyon ◽

Size Estimation ◽

Short Read ◽

Alternative Approach

While the size of chromosomes can be measured under a microscope, the size of genomes cannot be measured precisely. Biochemical methods and k-mer distribution-based approaches allow only estimations. An alternative approach to predict the genome size based on high contiguity assemblies and short read mappings is presented here and optimized on Arabidopsis thaliana and Beta vulgaris. Brachypodium distachyon, Solanum lycopersicum, Vitis vinifera, and Zea mays were also analyzed to demonstrate the broad applicability of this approach. Mapping-based Genome Size Estimation (MGSE) and additional scripts are available on GitHub: https://github.com/bpucker/MGSE.
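The idea of estimating genome size from read mappings can be sketched as dividing the total number of mapped bases by the typical coverage of reference regions. This is only an illustration of the principle, not the MGSE code: the function name, the use of the median, and the assumption that the supplied regions are single-copy are all choices made here for the example.

```python
from statistics import median
from typing import Sequence


def estimate_genome_size(total_mapped_bases: float,
                         reference_coverages: Sequence[float]) -> float:
    """Estimate genome size as mapped bases divided by typical coverage.

    reference_coverages holds per-region read depths for regions assumed
    (for this sketch) to occur exactly once in the genome.
    """
    if not reference_coverages:
        raise ValueError("at least one reference coverage value is required")
    typical_coverage = median(reference_coverages)
    if typical_coverage <= 0:
        raise ValueError("reference coverage must be positive")
    return total_mapped_bases / typical_coverage


# Hypothetical numbers: 6 Gbp of mapped reads at roughly 50x coverage.
size_bp = estimate_genome_size(6_000_000_000, [48, 50, 52, 50, 51])
print(size_bp)  # 120000000.0
```

The median is used here so that a few collapsed repeats or poorly covered regions among the reference set do not dominate the estimate.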


Compact and evenly distributed k-mer binning for genomic sequences

10.1101/2020.10.12.335364 ◽

2020 ◽

Author(s):

Johan Nyström-Persson ◽

Gabriel Keeble-Gagnère ◽

Niamat Zawad

Keyword(s):

Parallel Processing ◽

Genome Size ◽

Genome Assembly ◽

Taxonomic Classification ◽

New Combination ◽

Size Estimation ◽

Counting Method ◽

Large Dataset ◽

Metagenomics Data ◽

Processing Algorithms

The processing of k-mers (subsequences of length k) is at the foundation of many sequence processing algorithms in bioinformatics, including k-mer counting for genome size estimation, genome assembly, and taxonomic classification for metagenomics. Minimizers - ordered m-mers where m < k - are often used to group k-mers into bins as a first step in such processing. However, minimizers are known to generate bins of very different sizes, which can pose challenges for distributed and parallel processing, as well as generally increase memory requirements. Furthermore, although various minimizer orderings have been proposed, their practical value for improving tool efficiency has not yet been fully explored. Here we present Discount, a distributed k-mer counting tool based on Apache Spark, which we use to investigate the behaviour of various minimizer orderings in practice when applied to metagenomics data. Using this tool, we then introduce the universal frequency ordering, a new combination of frequency counted minimizers and universal k-mer hitting sets, which yields both evenly distributed binning and small bin sizes. We show that this ordering allows Discount to perform distributed k-mer counting on a large dataset in as little as 1/8 of the memory of comparable approaches, making it the most efficient out-of-core distributed k-mer counting method available.
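The minimizer-based binning described above can be sketched in a few lines. The example uses plain lexicographic ordering of m-mers, which is only one possible ordering and is not the universal frequency ordering introduced by the authors; the function names are hypothetical and the code is not taken from Discount.

```python
from collections import defaultdict
from typing import Dict, List


def minimizer(kmer: str, m: int) -> str:
    """Return the lexicographically smallest m-mer contained in kmer."""
    return min(kmer[i:i + m] for i in range(len(kmer) - m + 1))


def bin_kmers(sequence: str, k: int, m: int) -> Dict[str, List[str]]:
    """Group every k-mer of sequence into a bin keyed by its minimizer."""
    if not 0 < m < k <= len(sequence):
        raise ValueError("require 0 < m < k <= len(sequence)")
    bins: Dict[str, List[str]] = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        bins[minimizer(kmer, m)].append(kmer)
    return dict(bins)


print(bin_kmers("ACGTAC", k=4, m=2))
# {'AC': ['ACGT', 'GTAC'], 'CG': ['CGTA']}
```

Even in this tiny example the bins are uneven (two k-mers under AC, one under CG), which hints at the skew that motivates alternative orderings on real data.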


Genome Size Diversity in Rare, Endangered, and Protected Orchids in Poland

Genes ◽

10.3390/genes12040563 ◽

2021 ◽

Vol 12 (4) ◽

pp. 563

Author(s):

Monika Rewers ◽

Iwona Jedrzejczyk ◽

Agnieszka Rewicz ◽

Anna Jakubska-Busse

Keyword(s):

Genome Size ◽

Species Identification ◽

Dna Content ◽

Size Estimation ◽

Orchid Species ◽

Infraspecific Taxon ◽

2C Dna Content ◽

Plant Families ◽

Liparis Loeselii ◽

Identification And Characterization

Orchidaceae is one of the largest and most widespread plant families, with many species threatened with extinction. However, genome sizes are known for only about 1.5% of orchids. The aim of this study was to estimate the genome size of 15 species and one infraspecific taxon of endangered and protected orchids growing wild in Poland, to assess their variability and to develop an additional criterion useful in orchid species identification and characterization. Flow cytometric genome size estimation revealed that the investigated orchid species possessed intermediate, large, and very large genomes. Liparis loeselii had the smallest 2C DNA content (14.15 pg) and Cypripedium calceolus the largest (82.10 pg). Genome size was confirmed to be characteristic of the subfamily. Additionally, the 2C DNA content was estimated for the first time for four species (Epipactis albensis, Ophrys insectifera, Orchis mascula, and Orchis militaris) and one infraspecific taxon, Epipactis purpurata f. chlorophylla. Genome size estimation by flow cytometry proved to be a useful auxiliary method for quick orchid species identification and characterization.
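To relate the reported 2C DNA contents to genome sizes in base pairs, the sketch below applies the widely used conversion of 1 pg of DNA to 978 Mbp and halves the 2C value to obtain the 1C (haploid) size. The conversion step is not part of the abstract; the function name is hypothetical and the base-pair figures are derived here, not reported by the authors.

```python
MBP_PER_PG = 978.0  # commonly used conversion factor: 1 pg DNA = 978 Mbp


def two_c_pg_to_one_c_mbp(two_c_pg: float) -> float:
    """Convert a 2C DNA content in picograms to a 1C genome size in Mbp."""
    if two_c_pg <= 0:
        raise ValueError("DNA content must be positive")
    return two_c_pg / 2 * MBP_PER_PG


# 2C values quoted in the abstract.
for species, two_c in [("Liparis loeselii", 14.15),
                       ("Cypripedium calceolus", 82.10)]:
    print(f"{species}: {two_c_pg_to_one_c_mbp(two_c):.0f} Mbp (1C)")
```

Under these assumptions, the two extremes correspond to 1C genome sizes of roughly 6.9 Gbp and 40.1 Gbp.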


Collection of a Bacterial Community Reconstructed from Marine Metagenomes Derived from Jinhae Bay, South Korea

Data ◽

10.3390/data6050044 ◽

2021 ◽

Vol 6 (5) ◽

pp. 44

Author(s):

Jae-Hyun Lim ◽

Il-Nam Kim

Keyword(s):

Microbial Community ◽

South Korea ◽

Microbial Communities ◽

Marine Bacteria ◽

Sampling Site ◽

Community Change ◽

Rrna Gene ◽

Water Column Stratification ◽

Metagenomics Data ◽

Jinhae Bay

Marine bacteria are known to play significant roles in marine biogeochemical cycles, particularly in the decomposition of organic matter. Despite the increasing attention paid to the study of marine bacteria, research has been too limited to fully elucidate the complex interactions between marine bacterial communities and environmental variables. Jinhae Bay, the study area in this work, is the most anthropogenically eutrophied coastal bay in South Korea, and while its physical and biogeochemical characteristics are well described, less is known about the associated changes in microbial communities. In the present study, we reconstructed a metagenomic dataset based on the 16S rRNA gene to investigate temporal and vertical changes in microbial communities at three depths (surface, middle, and bottom) during a seven-month period from June to December 2016 at one sampling site (J1) in Jinhae Bay. Across all the bacterial data, Proteobacteria, Bacteroidetes, and Cyanobacteria were predominant from June to November, whereas Firmicutes were predominant in December, especially at the middle and bottom depths. These results show that the composition of the microbial community is strongly associated with temporal changes. Furthermore, the community compositions were markedly different between the surface, middle, and bottom depths in summer, when water column stratification and bottom water hypoxia (low dissolved oxygen level) were strongly developed. Metagenomics data contribute to improving our understanding of important relationships between environmental characteristics and microbial community change in eutrophication-induced and deoxygenated coastal areas.


First estimates of genome size in ribbon worms (phylum Nemertea) using flow cytometry and Feulgen image analysis densitometry

Canadian Journal of Zoology ◽

10.1139/cjz-2014-0068 ◽

2014 ◽

Vol 92 (10) ◽

pp. 847-851 ◽

Cited By ~ 3

Author(s):

Kelly L. Mulligan ◽

Terra C. Hiebert ◽

Nicholas W. Jeffery ◽

T. Ryan Gregory

Keyword(s):

Flow Cytometry ◽

Image Analysis ◽

Body Size ◽

Genome Size ◽

Positive Relationship ◽

Nuclear Dna ◽

Nuclear Dna Content ◽

Size Estimation ◽

Size Diversity ◽

Size Estimates

Ribbon worms (phylum Nemertea) are among several animal groups that have been overlooked in past studies of genome-size diversity. Here, we report genome-size estimates for eight species of nemerteans, including representatives of the major lineages in the phylum. Genome sizes in these species ranged more than fivefold, and there was some indication of a positive relationship with body size. Somatic endopolyploidy also appears to be common in these animals. Importantly, this study demonstrates that both of the most common methods of genome-size estimation (flow cytometry and Feulgen image analysis densitometry) can be used to assess genome size in ribbon worms, thereby facilitating additional efforts to investigate patterns of variability in nuclear DNA content in this phylum.


GENOME SIZE ESTIMATION IN TWO POPULATIONS OF THE NORTHERN CHILEAN SCALLOP, ARGOPECTEN PURPURATUS, USING FLUORESCENCE IMAGE ANALYSIS

Journal of Shellfish Research ◽

10.2983/0730-8000(2005)24[55:gseitp]2.0.co;2 ◽

2005 ◽

Vol 24 (1) ◽

pp. 55-60 ◽

Cited By ~ 6

Keyword(s):

Image Analysis ◽

Genome Size ◽

Size Estimation ◽

Fluorescence Image ◽

Argopecten Purpuratus ◽

Two Populations


Comparative genome size estimation of different life stages of grey mullet, Mugil cephalus Linnaeus, 1758 by flow cytometry

Aquaculture Research ◽

10.1111/are.15639 ◽

2021 ◽

Author(s):

Jani Angel J. Raymond ◽

Mudagandur Shashi Shekhar ◽

Vinaya Kumar Katneni ◽

Ashok Kumar Jangham ◽

Sudheesh Kommu Prabhudas ◽

...

Keyword(s):

Flow Cytometry ◽

Genome Size ◽

Mugil Cephalus ◽

Life Stages ◽

Size Estimation ◽

Grey Mullet ◽

Comparative Genome


Oral Spirochetes Implicated in Dental Diseases Are Widespread in Normal Human Subjects and Carry Extremely Diverse Integron Gene Cassettes

Applied and Environmental Microbiology ◽

10.1128/aem.00564-12 ◽

2012 ◽

Vol 78 (15) ◽

pp. 5288-5296 ◽

Cited By ~ 17

Author(s):

Yu-Wei Wu ◽

Mina Rho ◽

Thomas G. Doak ◽

Yuzhen Ye

Keyword(s):

Microbial Communities ◽

De Novo ◽

Human Subjects ◽

Human Microbiome ◽

Metagenomic Data ◽

De Bruijn Graph ◽

Content Type ◽

Gene Cassettes ◽

Dental Diseases ◽

Normal Human

The NIH Human Microbiome Project (HMP) has produced several hundred metagenomic data sets, allowing studies of the many functional elements in human-associated microbial communities. Here, we survey the distribution of oral spirochetes implicated in dental diseases in normal human individuals, using recombination sites associated with the chromosomal integron in Treponema genomes, taking advantage of the multiple copies of the integron recombination sites (repeats) in the genomes, and using a targeted assembly approach that we have developed. We find that integron-containing Treponema species are present in ∼80% of the normal human subjects included in the HMP. Further, we are able to de novo assemble the integron gene cassettes using our constrained assembly approach, which employs a unique application of the de Bruijn graph assembly information; most of these cassette genes were not assembled in whole-metagenome assemblies and could not be identified by mapping sequencing reads onto the known reference Treponema genomes due to the dynamic nature of integron gene cassettes. Our study significantly enriches the gene pool known to be carried by Treponema chromosomal integrons, totaling 826 (598 97% nonredundant) genes. We characterize the functions of these gene cassettes: many of these genes have unknown functions. The integron gene cassette arrays found in the human microbiome are extraordinarily dynamic, with different microbial communities sharing only a small number of common genes.


The Prenatal Microbiome: A New Player for Human Health

High-Throughput ◽

10.3390/ht7040038 ◽

2018 ◽

Vol 7 (4) ◽

pp. 38 ◽

Cited By ~ 11

Author(s):

Valeria D’Argenio

Keyword(s):

Health Status ◽

Microbial Communities ◽

Fetal Development ◽

Reproductive Medicine ◽

Current Knowledge ◽

Human Microbiome ◽

Future Perspectives ◽

Microbiome Composition ◽

Umbilical Blood ◽

Technological Advances

The last few years have seen increasing interest in the study of the human microbiome and its correlations with health status. Indeed, technological advances have allowed the study of microbial communities to reach a previously unthinkable sensitivity, revealing the presence of microbes even in environments usually considered sterile. In this scenario, microbial communities have been described in the amniotic fluid, the umbilical cord blood, and the placenta, challenging a dogma of reproductive medicine that considers the womb sterile. This prenatal microbiome may play a role not only in fetal development but also in the predisposition to diseases that may develop later in life, including in adulthood. Thus, the aim of this review is to report the current knowledge regarding the prenatal microbiome composition, its association with pathological processes, and the future perspectives regarding its manipulation for the promotion and maintenance of health.
