MetaCompass: Reference-guided Assembly of Metagenomes

2017 ◽  
Author(s):  
Victoria Cepeda ◽  
Bo Liu ◽  
Mathieu Almeida ◽  
Christopher M. Hill ◽  
Sergey Koren ◽  
...  

ABSTRACT Metagenomic studies have primarily relied on de novo approaches for reconstructing genes and genomes from microbial mixtures. While database-driven approaches have been employed in certain analyses, they have not been used in the assembly of metagenomes. Here we describe the first effective approach for reference-guided metagenomic assembly of low-abundance bacterial genomes that can complement and improve upon de novo metagenomic assembly methods. We show that MetaCompass, when combined with de novo assembly approaches, can generate more complete assemblies than can be obtained by de novo assembly alone, and can improve on assemblies from the Human Microbiome Project (over 2,000 samples).

2018 ◽  
Author(s):  
E. Whittle ◽  
M.O. Leonard ◽  
R. Harrison ◽  
T.W. Gant ◽  
D.P. Tonge

Abstract The term microbiome describes the genetic material encoding the various microbial populations that inhabit our body. Whilst colonisation of various body niches (e.g. the gut) by dynamic communities of microorganisms is now universally accepted, the existence of microbial populations in other “classically sterile” locations, including the blood, is a relatively new concept. The presence of bacteria-specific DNA in the blood has been reported in the literature for some time, yet its true origin is still the subject of much deliberation. The aim of this study was to investigate the phenomenon of a “blood microbiome” by providing a comprehensive description of bacterially-derived nucleic acids using a range of complementary molecular and classical microbiological techniques. For this purpose we utilised a set of plasma samples from healthy subjects (n = 5) and asthmatic subjects (n = 5). DNA-level analyses involved the amplification and sequencing of the 16S rRNA gene. RNA-level analyses were based upon the de novo assembly of unmapped mRNA reads and subsequent taxonomic identification. Molecular studies were complemented by viability data from classical aerobic and anaerobic microbial culture experiments. At the phylum level, the blood microbiome was dominated by Proteobacteria, Actinobacteria, Firmicutes and Bacteroidetes. The key phyla detected were consistent irrespective of molecular method (DNA vs RNA), and consistent with the results of other published studies. In silico comparison of our data with that of the Human Microbiome Project revealed that members of the blood microbiome were most likely to have originated from the oral or skin communities. To our surprise, aerobic and anaerobic cultures were positive in eight out of the ten donor samples investigated, and we reflect upon their source.
Our data provide further evidence of a core blood microbiome, and provide insight into the potential source of the bacterial DNA / RNA detected in the blood. Further, data reveal the importance of robust experimental procedures, and identify areas for future consideration.


2019 ◽  
Author(s):  
Bruce A Rosa ◽  
Kathie Mihindukulasuriya ◽  
Kymberlie Hallsworth-Pepin ◽  
Aye Wollam ◽  
John Martin ◽  
...  

Abstract Whole genome bacterial sequences are required to better understand microbial functions, niche-specific bacterial metabolism, and disease states. Although genomic sequences are available for many of the human-associated bacteria from commonly tested body habitats (e.g. stool), as few as 13% of bacterial-derived reads from other sites such as the skin map to known bacterial genomes. To facilitate a better characterization of metagenomic shotgun reads from under-represented body sites, we collected over 10,000 bacterial isolates originating from 14 human body habitats, identified novel taxonomic groups based on full length 16S rRNA sequences, clustered the sequences to ensure that no individual taxonomic group was over-selected for sequencing, prioritized bacteria from under-represented body sites (such as skin, respiratory and urinary tract), and sequenced and assembled genomes for 665 new bacterial strains. Here we show that addition of these genomes improved read mapping rates of HMP metagenomic samples by nearly 30% for the previously under-represented phylum Fusobacteria, and 27.5% of the novel genomes generated here had high representation in at least one of the tested HMP samples, compared to 12.5% of the sequences in the public databases, indicating an enrichment of useful novel genomic sequences resulting from the prioritization procedure. As our understanding of the human microbiome continues to improve and to enter the realm of therapy developments, targeted approaches such as this to improve genomic databases will increase in importance from both an academic and clinical perspective. Importance The human microbiome plays a critically important role in health and disease, but current understanding of the mechanisms underlying the interactions between the varying microbiome and the different host environments is lacking.
Having access to a database of fully sequenced bacterial genomes provides invaluable insights into microbial functions, but currently sequenced genomes for the human microbiome have largely come from a limited number of body sites (primarily stool), while other sites such as the skin, respiratory tract and urinary tracts are under-represented, resulting in as little as 13% of bacterial-derived reads mapping to known bacterial genomes. Here, we sequenced and assembled 665 new bacterial genomes, prioritized from a larger database to select under-represented body sites and bacterial taxa in the existing databases. As a result, we substantially improve mapping rates for samples from the Human Microbiome Project and provide an important contribution to human bacterial genomic databases for future studies.


2021 ◽  
Author(s):  
Yunxi Liu ◽  
R. A. Leo Elworth ◽  
Michael D. Jochum ◽  
Kjersti M. Aagaard ◽  
Todd J. Treangen

Computational analysis of host-associated microbiomes has opened the door to numerous discoveries relevant to human health and disease. However, contaminant sequences in metagenomic samples can potentially impact the interpretation of findings reported in microbiome studies, especially in low biomass environments. Our hypothesis is that contamination from DNA extraction kits or sampling lab environments will leave taxonomic bread crumbs across multiple distinct sample types, allowing for the detection of microbial contaminants when negative controls are unavailable. To test this hypothesis we implemented Squeegee, a de novo contamination detection tool. We tested Squeegee on simulated and real low biomass metagenomic datasets. On the low biomass samples, we compared Squeegee predictions to experimental negative control data and show that Squeegee accurately recovers known contaminants. We also analyzed 749 metagenomic datasets from the Human Microbiome Project and identified likely previously unreported kit contamination. Collectively, our results highlight that Squeegee can identify microbial contaminants with high precision. Squeegee is open-source and available at: https://gitlab.com/treangenlab/squeegee
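
The hypothesis in the abstract above (contaminants leave traces across multiple distinct sample types) can be illustrated with a toy prevalence filter. This is only a sketch of the intuition, not Squeegee's actual pipeline: the function name `candidate_contaminants`, the thresholds `min_types` and `min_prevalence`, and the input layout are all assumptions made for illustration.

```python
from collections import defaultdict


def candidate_contaminants(samples, min_types=2, min_prevalence=0.5):
    """Flag taxa that are prevalent in several distinct sample types.

    samples: iterable of (sample_type, taxa) pairs, where taxa is a
    collection of taxon names detected in one sample. (Hypothetical
    input layout, chosen for this sketch.)
    """
    # Group the per-sample taxon sets by sample type.
    by_type = defaultdict(list)
    for sample_type, taxa in samples:
        by_type[sample_type].append(set(taxa))

    # Record, for each taxon, the sample types in which it is prevalent.
    types_with_taxon = defaultdict(set)
    for sample_type, taxa_sets in by_type.items():
        for taxon in set().union(*taxa_sets):
            prevalence = sum(taxon in t for t in taxa_sets) / len(taxa_sets)
            if prevalence >= min_prevalence:
                types_with_taxon[taxon].add(sample_type)

    # A taxon shared by enough distinct sample types is a candidate.
    return sorted(
        taxon
        for taxon, types in types_with_taxon.items()
        if len(types) >= min_types
    )
```

Called on invented toy data in which one genus appears in both skin and gut samples while every other genus is confined to a single sample type, the function returns only the shared genus. A real tool would need further evidence before calling a taxon a contaminant, since some organisms genuinely inhabit several body sites.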


mSystems ◽  
2020 ◽  
Vol 5 (1) ◽  
Author(s):  
Bruce A. Rosa ◽  
Kathie Mihindukulasuriya ◽  
Kymberlie Hallsworth-Pepin ◽  
Aye Wollam ◽  
John Martin ◽  
...  

ABSTRACT Whole-genome bacterial sequences are required to better understand microbial functions, niche-specific bacterial metabolism, and disease states. Although genomic sequences are available for many of the human-associated bacteria from commonly tested body habitats (e.g., feces), as few as 13% of bacterium-derived reads from other sites such as the skin map to known bacterial genomes. To facilitate a better characterization of metagenomic shotgun reads from underrepresented body sites, we collected over 10,000 bacterial isolates originating from 14 human body habitats, identified novel taxonomic groups based on full-length 16S rRNA gene sequences, clustered the sequences to ensure that no individual taxonomic group was overselected for sequencing, prioritized bacteria from underrepresented body sites (such as skin and respiratory and urinary tracts), and sequenced and assembled genomes for 665 new bacterial strains. Here, we show that addition of these genomes improved read mapping rates of Human Microbiome Project (HMP) metagenomic samples by nearly 30% for the previously underrepresented phylum Fusobacteria, and 27.5% of the novel genomes generated here had high representation in at least one of the tested HMP samples, compared to 12.5% of the sequences in the public databases, indicating an enrichment of useful novel genomic sequences resulting from the prioritization procedure. As our understanding of the human microbiome continues to improve and to enter the realm of therapy developments, targeted approaches such as this to improve genomic databases will increase in importance from both an academic and a clinical perspective. IMPORTANCE The human microbiome plays a critically important role in health and disease, but current understanding of the mechanisms underlying the interactions between the varying microbiome and the different host environments is lacking. 
Having access to a database of fully sequenced bacterial genomes provides invaluable insights into microbial functions, but currently sequenced genomes for the human microbiome have largely come from a limited number of body sites (primarily feces), while other sites such as the skin, respiratory tract, and urinary tract are underrepresented, resulting in as little as 13% of bacterium-derived reads mapping to known bacterial genomes. Here, we sequenced and assembled 665 new bacterial genomes, prioritized from a larger database to select underrepresented body sites and bacterial taxa in the existing databases. As a result, we substantially improve mapping rates for samples from the Human Microbiome Project and provide an important contribution to human bacterial genomic databases for future studies.


2016 ◽  
Vol 2 ◽  
pp. e94 ◽  
Author(s):  
Gaëtan Benoit ◽  
Pierre Peterlongo ◽  
Mahendra Mariadassou ◽  
Erwan Drezen ◽  
Sophie Schbath ◽  
...  

Background: Large-scale metagenomic projects aim to extract biodiversity knowledge by comparing different environmental conditions. Current methods for comparing microbial communities face important limitations. Those based on taxonomic or functional assignment rely on the small subset of sequences that can be associated with known organisms. On the other hand, de novo methods, which compare the whole sets of sequences, either do not scale up to ambitious metagenomic projects or do not provide precise and exhaustive results. Methods: These limitations motivated the development of a new de novo metagenomic comparative method, called Simka. This method computes a large collection of standard ecological distances by replacing species counts with k-mer counts. Simka scales up to today’s metagenomic projects thanks to a new parallel k-mer counting strategy on multiple datasets. Results: Experiments on public Human Microbiome Project datasets demonstrate that Simka captures the essential underlying biological structure. Simka was able to compute in a few hours both qualitative and quantitative ecological distances on hundreds of metagenomic samples (690 samples, 32 billion reads). We also demonstrate that analyzing metagenomes at the k-mer level is highly correlated with extremely precise de novo comparison techniques which rely on an all-versus-all sequence alignment strategy or which are based on taxonomic profiling.
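
The idea of replacing species counts with k-mer counts can be shown on a small scale. The sketch below is not Simka's implementation (which relies on a parallel k-mer counting strategy); it is a minimal in-memory illustration. The function names are hypothetical, and the choice of Bray-Curtis as the quantitative distance and Jaccard as the qualitative one is an assumption, since the abstract does not name specific distances.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def bray_curtis(a, b):
    """Quantitative distance: uses k-mer abundances.

    BC = 1 - 2 * sum(min(a_i, b_i)) / (sum(a) + sum(b))
    """
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    shared = sum(min(a[kmer], b[kmer]) for kmer in a.keys() & b.keys())
    return 1.0 - 2.0 * shared / total


def jaccard_distance(a, b):
    """Qualitative distance: uses k-mer presence or absence only."""
    union = a.keys() | b.keys()
    if not union:
        return 0.0
    return 1.0 - len(a.keys() & b.keys()) / len(union)
```

For example, with `k = 3` the reads `"ACGTAC"` and `"ACGTTT"` each yield four distinct 3-mers, two of which are shared, so `bray_curtis` gives 0.5 and `jaccard_distance` gives 2/3. Real datasets with billions of reads would not fit in a `Counter`, which is the scaling problem the abstract says Simka addresses.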


Pathogens ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 86
Author(s):  
Erin M. Garcia ◽  
Myrna G. Serrano ◽  
Laahirie Edupuganti ◽  
David J. Edwards ◽  
Gregory A. Buck ◽  
...  

Gardnerella vaginalis has recently been split into 13 distinct species. In this study, we tested the hypotheses that species-specific variations in the vaginolysin (VLY) amino acid sequence could influence the interaction between the toxin and vaginal epithelial cells and that VLY variation may be one factor that distinguishes less virulent or commensal strains from more virulent strains. This was assessed by bioinformatic analyses of publicly available Gardnerella spp. sequences and quantification of cytotoxicity and cytokine production from purified, recombinantly produced versions of VLY. After identifying conserved differences that could distinguish distinct VLY types, we analyzed metagenomic data from a cohort of female subjects from the Vaginal Human Microbiome Project to investigate whether these different VLY types exhibited any significant associations with symptoms or with the relative abundance of Gardnerella spp. in vaginal swab samples. While Type 1 VLY was most prevalent among the subjects and may be associated with increased reports of symptoms, subjects with Type 2 VLY-dominant profiles exhibited increased relative abundance of Gardnerella spp. Our findings suggest that amino acid differences alter the interaction of VLY with vaginal keratinocytes, which may potentiate differences in bacterial vaginosis (BV) immunopathology in vivo.


2014 ◽  
Vol 7 (1) ◽  
pp. 484 ◽  
Author(s):  
Basil Xavier ◽  
Julia Sabirova ◽  
Moons Pieter ◽  
Jean-Pierre Hernalsteens ◽  
Henri de Greve ◽  
...  

2018 ◽  
Vol 85 (10) ◽  
Author(s):  
Reed M. Stubbendieck ◽  
Daniel S. May ◽  
Marc G. Chevrette ◽  
Mia I. Temkin ◽  
Evelyn Wendt-Pienkowski ◽  
...  

ABSTRACT Resources available in the human nasal cavity are limited. Therefore, to successfully colonize the nasal cavity, bacteria must compete for scarce nutrients. Competition may occur directly through interference (e.g., antibiotics) or indirectly by nutrient sequestration. To investigate the nature of nasal bacterial competition, we performed coculture inhibition assays between nasal Actinobacteria and Staphylococcus spp. We found that isolates of coagulase-negative staphylococci (CoNS) were sensitive to growth inhibition by Actinobacteria but that Staphylococcus aureus isolates were resistant to inhibition. Among Actinobacteria, we observed that Corynebacterium spp. were variable in their ability to inhibit CoNS. We sequenced the genomes of 10 Corynebacterium species isolates, including 3 Corynebacterium propinquum isolates that strongly inhibited CoNS and 7 other Corynebacterium species isolates that only weakly inhibited CoNS. Using a comparative genomics approach, we found that the C. propinquum genomes were enriched in genes for iron acquisition and harbored a biosynthetic gene cluster (BGC) for siderophore production, absent in the noninhibitory Corynebacterium species genomes. Using a chrome azurol S assay, we confirmed that C. propinquum produced siderophores. We demonstrated that iron supplementation rescued CoNS from inhibition by C. propinquum, suggesting that inhibition was due to iron restriction through siderophore production. Through comparative metabolomics and molecular networking, we identified the siderophore produced by C. propinquum as dehydroxynocardamine. Finally, we confirmed that the dehydroxynocardamine BGC is expressed in vivo by analyzing human nasal metatranscriptomes from the NIH Human Microbiome Project. Together, our results suggest that bacteria produce siderophores to compete for limited available iron in the nasal cavity and improve their fitness. IMPORTANCE Within the nasal cavity, interference competition through antimicrobial production is prevalent.
For instance, nasal Staphylococcus species strains can inhibit the growth of other bacteria through the production of nonribosomal peptides and ribosomally synthesized and posttranslationally modified peptides. In contrast, bacteria engaging in exploitation competition modify the external environment to prevent competitors from growing, usually by hindering access to or depleting essential nutrients. As the nasal cavity is a nutrient-limited environment, we hypothesized that exploitation competition occurs in this system. We determined that Corynebacterium propinquum produces an iron-chelating siderophore, and this iron-sequestering molecule correlates with the ability to inhibit the growth of coagulase-negative staphylococci. Furthermore, we found that the genes required for siderophore production are expressed in vivo. Thus, although siderophore production by bacteria is often considered a virulence trait, our work indicates that bacteria may produce siderophores to compete for limited iron in the human nasal cavity.

