TAG.ME: Taxonomic Assignment of Genetic Markers for Ecology

2018 ◽  
Author(s):  
Douglas Eduardo Valente Pires ◽  
Francislon Silva Oliveira ◽  
Felipe Borim Correa ◽  
Daniel Kumazawa Morais ◽  
Gabriel Rocha Fernandes

Abstract. Background: Sequencing of amplified genetic markers, such as the 16S rRNA gene, has been extensively used to characterize microbial community composition. Recent studies suggested that Amplicon Sequence Variants (ASVs) should replace Operational Taxonomic Units (OTUs), given the arbitrary definition of the sequence identity thresholds used to define units. Alignment-free methods are an interesting alternative for the taxonomic classification of ASVs, preventing the introduction of biases from sequence identity thresholds. Results: Here we present TAG.ME, a novel alignment-independent and amplicon-specific method for taxonomic assignment based on genetic markers. TAG.ME uses a multilevel supervised learning approach to create predictive models based on user-defined genetic marker genes. The predictive method can assign taxonomy to sequenced amplicons efficiently and effectively. We applied our method to assess gut and soil sample classification, and it outperformed alternative approaches, identifying a substantially larger proportion of species. Benchmark tests performed using the RDP database and mock communities reinforced the precise classification into deep taxonomic levels. Conclusion: TAG.ME presents a new approach to assign taxonomy to amplicon sequences accurately. Our classification model, trained with amplicon-specific sequences, can address resolution issues not solved by other methods and approaches that use the whole 16S rRNA gene sequence. TAG.ME is implemented as an R package and is freely available at http://gabrielrfernandes.github.io/tagme/

2021 ◽  
Vol 9 (8) ◽  
pp. 1570
Author(s):  
Chien-Hsun Huang ◽  
Chih-Chieh Chen ◽  
Yu-Chun Lin ◽  
Chia-Hsuan Chen ◽  
Ai-Yun Lee ◽  
...  

The current taxonomy of the Lactiplantibacillus plantarum group comprises 17 closely related species that are indistinguishable from each other by commonly used 16S rRNA gene sequencing. In this study, a whole-genome-based analysis was carried out to identify highly discriminative target genes whose interspecific sequence identity is significantly lower than that of 16S rRNA or conventional housekeeping genes. In silico analyses of 774 core genes by the cano-wgMLST_BacCompare analytics platform indicated that the csbB, morA, murI, mutL, ntpJ, rutB, trmK, ydaF, and yhhX genes were the most promising candidates. Subsequently, the mutL gene was selected, and its discrimination power was further evaluated using Sanger sequencing. Among the type strains, mutL exhibited clearly more discriminative sequence identity values (61.6–85.6%; average: 66.6%) than the 16S rRNA gene (96.7–100%; average: 98.4%) and the conventional phylogenetic marker genes (e.g., dnaJ, dnaK, pheS, recA, and rpoA), which could be used to separate the tested strains into various species clusters. Consequently, species-specific primers were developed for fast and accurate identification of L. pentosus, L. argentoratensis, L. plantarum, and L. paraplantarum. During this study, one strain (BCRC 06B0048, L. pentosus) exhibited not only relatively low mutL sequence identity (97.0%) but also a low digital DNA–DNA hybridization value (78.1%) with the type strain DSM 20314T, signifying that it exhibits potential for reclassification as a novel subspecies. Our data demonstrate that mutL can be a genome-wide target for identifying and classifying the L. plantarum group species and for differentiating novel taxa from known species.
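The identity ranges quoted in this abstract are pairwise percent identities between gene sequences. As a minimal sketch, assuming the two sequences have already been aligned to equal length and that columns containing a gap are skipped (a convention chosen here for illustration, not one stated in the abstract), percent identity could be computed as below. The function name and the toy sequences in the usage note are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes seq_a and seq_b come from an existing pairwise alignment,
    so they must have the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0  # columns where both sequences carry a residue
    matches = 0   # compared columns where the residues agree
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gap columns (illustrative convention)
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

For example, `percent_identity("ACGT", "ACGA")` compares four columns and finds three matches. Other conventions, such as counting gap columns as mismatches, would give different values for the same alignment.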


2022 ◽  
Vol 12 ◽  
Author(s):  
Ilona A. Ruhl ◽  
Andriy Sheremet ◽  
Chantel C. Furgason ◽  
Susanne Krause ◽  
Robert M. Bowers ◽  
...  

GAL08 are bacteria belonging to an uncultivated phylogenetic cluster within the phylum Acidobacteria. We detected a natural population of the GAL08 clade in sediment from a pH-neutral hot spring located in British Columbia, Canada. To shed light on the abundance and genomic potential of this clade, we collected and analyzed hot spring sediment samples over a temperature range of 24.2–79.8°C. Illumina sequencing of 16S rRNA gene amplicons and qPCR using a primer set developed specifically to detect the GAL08 16S rRNA gene revealed that absolute and relative abundances of GAL08 peaked at 65°C along three temperature gradients. Analysis of sediment collected over multiple years and locations revealed that the GAL08 group was consistently a dominant clade, comprising up to 29.2% of the microbial community based on relative read abundance and up to 4.7 × 10⁵ 16S rRNA gene copy numbers per gram of sediment based on qPCR. Using a medium quality threshold, 25 single amplified genomes (SAGs) representing these bacteria were generated from samples taken at 65 and 77°C, and seven metagenome-assembled genomes (MAGs) were reconstructed from samples collected at 45–77°C. Based on average nucleotide identity (ANI), these SAGs and MAGs represented three separate species, with an estimated average genome size of 3.17 Mb and GC content of 62.8%. Phylogenetic trees constructed from 16S rRNA gene sequences and a set of 56 concatenated phylogenetic marker genes both placed the three GAL08 bacteria as a distinct subgroup of the phylum Acidobacteria, representing a candidate order (Ca. Frugalibacteriales) within the class Blastocatellia. Metabolic reconstructions from genome data predicted a heterotrophic metabolism, with potential capability for aerobic respiration, as well as incomplete denitrification and fermentation. In laboratory cultivation efforts, GAL08 counts based on qPCR declined rapidly under atmospheric levels of oxygen but increased slightly at 1% (v/v) O2, suggesting a microaerophilic lifestyle.


2019 ◽  
Vol 85 (7) ◽  
Author(s):  
Alexander Burkert ◽  
Thomas A. Douglas ◽  
Mark P. Waldrop ◽  
Rachel Mackelprang

ABSTRACT Permafrost hosts a community of microorganisms that survive and reproduce for millennia despite extreme environmental conditions, such as water stress, subzero temperatures, high salinity, and low nutrient availability. Many studies focused on permafrost microbial community composition use DNA-based methods, such as metagenomics and 16S rRNA gene sequencing. However, these methods do not distinguish among active, dead, and dormant cells. This is of particular concern in ancient permafrost, where constant subzero temperatures preserve DNA from dead organisms and dormancy may be a common survival strategy. To circumvent this, we applied (i) LIVE/DEAD differential staining coupled with microscopy, (ii) endospore enrichment, and (iii) selective depletion of DNA from dead cells to permafrost microbial communities across a Pleistocene permafrost chronosequence (19,000, 27,000, and 33,000 years old). Cell counts and analysis of 16S rRNA gene amplicons from live, dead, and dormant cells revealed how communities differ between these pools, how they are influenced by soil physicochemical properties, and whether they change over geologic time. We found evidence that cells capable of forming endospores are not necessarily dormant and that members of the class Bacilli were more likely to form endospores in response to long-term stressors associated with permafrost environmental conditions than members of the Clostridia, which were more likely to persist as vegetative cells in our older samples. We also found that removing exogenous “relic” DNA preserved within permafrost did not significantly alter microbial community composition. These results link the live, dead, and dormant microbial communities to physicochemical characteristics and provide insights into the survival of microbial communities in ancient permafrost.
IMPORTANCE Permafrost soils store more than half of Earth’s soil carbon despite covering ∼15% of the land area (C. Tarnocai et al., Global Biogeochem Cycles 23:GB2023, 2009, https://doi.org/10.1029/2008GB003327). This permafrost carbon is rapidly degraded following a thaw (E. A. G. Schuur et al., Nature 520:171–179, 2015, https://doi.org/10.1038/nature14338). Understanding microbial communities in permafrost will contribute to the knowledge base necessary to understand the rates and forms of permafrost C and N cycling postthaw. Permafrost is also an analog for frozen extraterrestrial environments, and evidence of viable organisms in ancient permafrost is of interest to those searching for potential life on distant worlds. If we can identify strategies microbial communities utilize to survive in permafrost, it may yield insights into how life (if it exists) survives in frozen environments outside of Earth. Our work is significant because it contributes to an understanding of how microbial life adapts and survives in the extreme environmental conditions in permafrost terrains.


2015 ◽  
Vol 65 (Pt_1) ◽  
pp. 251-259 ◽  
Author(s):  
Patricia L. Tavormina ◽  
Roland Hatzenpichler ◽  
Shawn McGlynn ◽  
Grayson Chadwick ◽  
Katherine S. Dawson ◽  
...  

We report the isolation and growth characteristics of a gammaproteobacterial methane-oxidizing bacterium (Methylococcaceae strain WF1T, ‘whale fall 1’) that shares 98 % 16S rRNA gene sequence identity with uncultivated free-living methanotrophs and the methanotrophic endosymbionts of deep-sea mussels, ≤94.6 % 16S rRNA gene sequence identity with species of the genus Methylobacter and ≤93.6 % 16S rRNA gene sequence identity with species of the genera Methylomonas and Methylosarcina. Strain WF1T represents the first cultivar from the ‘deep sea-1’ clade of marine methanotrophs, which includes members that participate in methane oxidation in sediments and the water column in addition to mussel endosymbionts. Cells of strain WF1T were elongated cocci, approximately 1.5 µm in diameter, and occurred singly, in pairs and in clumps. The cell wall was Gram-negative, and stacked intracytoplasmic membranes and storage granules were evident. The genomic DNA G+C content of WF1T was 40.5 mol%, significantly lower than that of currently described cultivars, and the major fatty acids were 16 : 0, 16 : 1ω9c, 16 : 1ω9t, 16 : 1ω8c and 16 : 2ω9,14. Growth occurred in liquid media at an optimal temperature of 23 °C, and was dependent on the presence of methane or methanol. Atmospheric nitrogen could serve as the sole nitrogen source for WF1T, a capacity that had not been functionally demonstrated previously in members of Methylobacter. On the basis of its unique morphological, physiological and phylogenetic properties, this strain represents the type species within a new genus, and we propose the name Methyloprofundus sedimenti gen. nov., sp. nov. The type strain of Methyloprofundus sedimenti is WF1T (=LMG 28393T =ATCC BAA-2619T).


2015 ◽  
Vol 1130 ◽  
pp. 63-66 ◽  
Author(s):  
Lorena Escudero ◽  
Jonathan Bijman ◽  
Guajardo M. Mariela ◽  
Juan José Pueyo Mur ◽  
Guillermo Chong ◽  
...  

To understand the microbial community inhabiting an acidic salt flat, the phylogenetic diversity and the geochemistry of this system were compared to those of acid mine drainage (AMD) systems. The microbial community structure was assessed by DNA extraction/PCR/DGGE and sequencing of the 16S rRNA gene, and the geochemistry was analyzed using several approaches. Prediction of metagenome functional content was performed from the 16S rRNA gene survey using the bioinformatics software package Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt). The geochemical results revealed a much lower iron concentration in the salt flat than in AMD systems (39 and 21804 mg L⁻¹, respectively) and a significant difference in chloride levels. Sequences inferred to be from potential sulfur-metabolizing organisms constituted up to 70% of the microbial community in the acidic salt flat, whereas predominant iron-metabolizing acidophile populations were reported in AMD systems. Interestingly, the microbial assemblage in the acidic salt flat was dominated by mixotrophic and organotrophic sulfur oxidizers as well as by photoautotrophic acidophiles. Our results suggest that the salt concentration in Salar de Gorbea (average Cl⁻ = 40 g L⁻¹) is at the limit for the occurrence of chemolithotrophic oxidation of sulfur compounds. In addition, the investigation allows us to conclude that salinity, rather than extremes of pH, is the major environmental determinant of microbial community composition.


2018 ◽  
Author(s):  
Hugo R Barajas de la Torre ◽  
Miguel Romero ◽  
Shamayim Martínez-Sánchez ◽  
Luis D Alcaraz

Background. Comparative genomics between closely related bacterial strains can distinguish important features determining pathogenesis, antibiotic resistance, and phylogenetic structure. The Streptococcus genus is relevant to public health and food safety and is well represented (>100 genomes) in publicly available databases. Streptococci are cosmopolitan, with multiple sources of isolation, from humans to dairy products. The Streptococcus genus has been classified by morphology, serotypes, 16S rRNA gene, and Multi Locus Sequence Types (MLST). The Genomic Similarity Score (GSS) is proposed as a tool to quantify genome-level relatedness between species of Streptococcus. The Streptococcus core genome can be used to assess strain-specific abundances in metagenomic sequences. Methods. A 16S rRNA gene phylogeny was calculated for 108 strains belonging to 16 Streptococcus species and compared to a dendrogram using GSS pairwise distances for the same genomes. The core and pan-genome were calculated for these 108 genomes. The core genome sequences were analyzed and used as a resource to discriminate homologous fragment reads from closely related strains in metagenomic samples. Results. A total of 404 proteins are shared by all 108 Streptococcus genomes; these constitute the core genome. The pairwise amino acid identity values of the core proteins for all the compared strains and outgroups are reported. Lower sequence identity variation (90-100%) is predominantly found in core clusters containing ribosomal and translation-related proteins. For 48 core proteins (11.8%) no functional assignment could be made, and those proteins have larger sequence identity variations than other core proteins. The sequence identity of the core genome diminishes as the GSS between species decreases. The GSS dendrogram recovers most of the clades in the 16S rRNA gene phylogeny while distinguishing between 16S polytomies (unresolved nodes). Finally, the core genome was used to distinguish between closely related species within human oral metagenomes. Discussion. The Streptococcus genus provides a benchmark dataset for comparative genomic studies due to the breadth and depth of genomic coverage. Comparing metagenomic shotgun fragment reads to the core genome using rapid alignment tools allows species-specific abundance estimates in metagenomic samples. Understanding genomic variability and strain relatedness is the goal of tools like GSS, which makes use of both pairwise shared core and pan-genomic homologous sequences for its calculation.
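The core genome in this abstract is the set of proteins shared by every genome, and the pan-genome is everything found in at least one. A minimal sketch of that set logic follows, assuming each genome has already been reduced to a set of protein-family identifiers by some upstream homology clustering step that is not shown. The function name, strain labels, and family names are hypothetical, and this is not the authors' pipeline.

```python
def core_and_pan(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (core, pan) protein-family sets for a collection of genomes.

    genomes maps a genome label to the set of protein-family IDs it encodes.
    Core = families present in every genome; pan = families present in any.
    """
    if not genomes:
        return set(), set()
    families = list(genomes.values())
    core = set.intersection(*families)  # shared by all genomes
    pan = set.union(*families)          # found in at least one genome
    return core, pan


# Toy input with made-up family names, for illustration only.
toy = {
    "strain_A": {"rpsL", "tuf", "gyrB", "x1"},
    "strain_B": {"rpsL", "tuf", "gyrB"},
    "strain_C": {"rpsL", "tuf", "y2"},
}
core, pan = core_and_pan(toy)
```

In the toy input, only `rpsL` and `tuf` occur in all three strains, so they form the core, while the pan-genome holds all five families. Adding a genome can only shrink or preserve the core and can only grow or preserve the pan-genome.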


mSphere ◽  
2018 ◽  
Vol 3 (5) ◽  
Author(s):  
Robin R. Rohwer ◽  
Joshua J. Hamilton ◽  
Ryan J. Newton ◽  
Katherine D. McMahon

ABSTRACT Taxonomy assignment of freshwater microbial communities is limited by the minimally curated phylogenies used for large taxonomy databases. Here we introduce TaxAss, a taxonomy assignment workflow that classifies 16S rRNA gene amplicon data using two taxonomy reference databases: a large comprehensive database and a small ecosystem-specific database rigorously curated by scientists within a field. We applied TaxAss to five different freshwater data sets using the comprehensive SILVA database and the freshwater-specific FreshTrain database. TaxAss increased the percentage of the data set classified compared to using only SILVA, especially at fine-resolution family to species taxon levels, while across the freshwater test data sets classifications increased by as much as 11 to 40% of total reads. A similar increase in classifications was not observed in a control mouse gut data set, which was not expected to contain freshwater bacteria. TaxAss also maintained taxonomic richness compared to using only the FreshTrain across all taxon levels from phylum to species. Without TaxAss, most organisms not represented in the FreshTrain were unclassified, but at fine taxon levels, incorrect classifications became significant. We validated TaxAss using simulated amplicon data derived from full-length clone libraries and found that 96 to 99% of test sequences were correctly classified at fine resolution. TaxAss splits a data set’s sequences into two groups based on their percent identity to reference sequences in the ecosystem-specific database. Sequences with high similarity to sequences in the ecosystem-specific database are classified using that database, and the others are classified using the comprehensive database. TaxAss is free and open source and is available at https://www.github.com/McMahonLab/TaxAss. 
IMPORTANCE Microbial communities drive ecosystem processes, but microbial community composition analyses using 16S rRNA gene amplicon data sets are limited by the lack of fine-resolution taxonomy classifications. Coarse taxonomic groupings at the phylum, class, and order levels lump ecologically distinct organisms together. To avoid this, many researchers define operational taxonomic units (OTUs) based on clustered sequences, sequence variants, or unique sequences. These fine-resolution groupings are more ecologically relevant, but OTU definitions are data set dependent and cannot be compared between data sets. Microbial ecologists studying freshwater have curated a small, ecosystem-specific taxonomy database to provide consistent and up-to-date terminology. We created TaxAss, a workflow that leverages this database to assign taxonomy. We found that TaxAss improves fine-resolution taxonomic classifications (family, genus, and species). Fine taxonomic groupings are more ecologically relevant, so they provide an alternative to OTU-based analyses that is consistent and comparable between data sets.
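The splitting step described in the abstract, which routes each sequence to one of two databases according to its percent identity to the ecosystem-specific references, can be sketched as follows. This is a simplified illustration and not the TaxAss code: it assumes the best-hit percent identity for each sequence has already been computed by a separate search step, and the cutoff value and the inclusive boundary are illustrative choices rather than values taken from the abstract.

```python
def split_by_identity(
    best_identity: dict[str, float], cutoff: float
) -> tuple[list[str], list[str]]:
    """Partition sequence IDs by their best percent identity to the
    ecosystem-specific database.

    Returns (ecosystem_ids, comprehensive_ids). IDs at or above the cutoff
    would be classified with the ecosystem-specific database; the rest fall
    back to the comprehensive database. The inclusive boundary is an
    assumption made for this sketch.
    """
    ecosystem_ids: list[str] = []
    comprehensive_ids: list[str] = []
    for seq_id, pid in best_identity.items():
        if pid >= cutoff:
            ecosystem_ids.append(seq_id)
        else:
            comprehensive_ids.append(seq_id)
    return ecosystem_ids, comprehensive_ids


# Hypothetical identities and cutoff, for illustration only.
hits = {"seq1": 99.5, "seq2": 90.0, "seq3": 98.0}
eco, comp = split_by_identity(hits, cutoff=98.0)
```

With these toy values, `seq1` and `seq3` go to the ecosystem-specific group and `seq2` to the comprehensive group. Each group would then be classified separately and the two sets of assignments combined.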


PLoS ONE ◽  
2015 ◽  
Vol 10 (2) ◽  
pp. e0116955 ◽  
Author(s):  
Lucas Sinclair ◽  
Omneya Ahmed Osman ◽  
Stefan Bertilsson ◽  
Alexander Eiler

Author(s):  
Jane E. Sykes ◽  
Louise M. Ball ◽  
Nathan L. Bailiff ◽  
Michael M. Fry

A novel small haemoplasma was detected following cytological examination of blood smears from a splenectomized dog with haemic neoplasia. The 16S rRNA and rnpB genes of the organism were partially sequenced and a phylogenetic tree constructed. The organism was most closely related to the small feline haemoplasma, ‘Candidatus Mycoplasma haemominutum’ (94 % 16S rRNA gene nucleotide sequence identity; 75 % rnpB) and was only distantly related to Mycoplasma haemocanis (78 % 16S rRNA gene nucleotide sequence identity; 65 % rnpB). As this organism has not been cultured in vitro, the candidate species name ‘Candidatus Mycoplasma haematoparvum’ is proposed.

