Denoising the Denoisers: an independent evaluation of microbiome sequence error-correction approaches

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5364 ◽  
Author(s):  
Jacob T. Nearing ◽  
Gavin M. Douglas ◽  
André M. Comeau ◽  
Morgan G.I. Langille

High-depth sequencing of universal marker genes such as the 16S rRNA gene is a common strategy to profile microbial communities. Traditionally, sequence reads are clustered into operational taxonomic units (OTUs) at a defined identity threshold to avoid sequencing errors generating spurious taxonomic units. However, there have been numerous bioinformatic packages recently released that attempt to correct sequencing errors to determine real biological sequences at single nucleotide resolution by generating amplicon sequence variants (ASVs). As more researchers begin to use high resolution ASVs, there is a need for an in-depth and unbiased comparison of these novel “denoising” pipelines. In this study, we conduct a thorough comparison of three of the most widely-used denoising packages (DADA2, UNOISE3, and Deblur) as well as an open-reference 97% OTU clustering pipeline on mock, soil, and host-associated communities. We found from the mock community analyses that although they produced similar microbial compositions based on relative abundance, the approaches identified vastly different numbers of ASVs that significantly impact alpha diversity metrics. Our analysis on real datasets using recommended settings for each denoising pipeline also showed that the three packages were consistent in their per-sample compositions, resulting in only minor differences based on weighted UniFrac and Bray–Curtis dissimilarity. DADA2 tended to find more ASVs than the other two denoising pipelines when analyzing both the real soil data and two other host-associated datasets, suggesting that it could be better at finding rare organisms, but at the expense of possible false positives. The open-reference OTU clustering approach identified considerably more OTUs in comparison to the number of ASVs from the denoising pipelines in all datasets tested. 
The three denoising approaches differed significantly in their run times, with UNOISE3 running more than 1,200 and 15 times faster than DADA2 and Deblur, respectively. Our findings indicate that, although all pipelines result in similar general community structure, the number of ASVs/OTUs and the resulting alpha-diversity metrics vary considerably and should be considered when attempting to identify rare organisms from possible background noise.
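To illustrate how two pipelines can agree on composition while disagreeing on richness, the following minimal Python sketch computes Bray-Curtis dissimilarity and observed richness for a single sample. The counts and ASV names are hypothetical and are not taken from the study, and the helper functions are illustrative only; they are not part of DADA2, UNOISE3, or Deblur.

```python
from typing import Dict


def bray_curtis(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    features = set(a) | set(b)
    numerator = sum(abs(a.get(f, 0) - b.get(f, 0)) for f in features)
    denominator = sum(a.get(f, 0) + b.get(f, 0) for f in features)
    return numerator / denominator if denominator else 0.0


def richness(profile: Dict[str, float]) -> int:
    """Observed richness: number of features with a non-zero count."""
    return sum(1 for count in profile.values() if count > 0)


# Hypothetical read counts for the same sample processed by two pipelines.
# Pipeline B reports three extra low-abundance ASVs.
pipeline_a = {"asv1": 500, "asv2": 300, "asv3": 200}
pipeline_b = {"asv1": 495, "asv2": 298, "asv3": 197,
              "asv4": 4, "asv5": 3, "asv6": 3}

if __name__ == "__main__":
    print("Bray-Curtis:", bray_curtis(pipeline_a, pipeline_b))
    print("Richness A:", richness(pipeline_a))
    print("Richness B:", richness(pipeline_b))
```

In this toy case the abundance-weighted dissimilarity stays small because the extra ASVs carry few reads, whereas richness doubles. This is one way the pattern described in the abstract could arise, not a reproduction of its results.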

Author(s):  
Jacob T Nearing ◽  
Gavin M Douglas ◽  
André M Comeau ◽  
Morgan G.I Langille

High-depth sequencing of universal marker genes such as the 16S rRNA gene is a common strategy to profile microbial communities. Traditionally, sequence reads are clustered into operational taxonomic units (OTUs) at a defined identity threshold to avoid sequencing errors generating spurious taxonomic units. However, there have been numerous bioinformatic methods recently released that attempt to correct sequencing errors to determine real biological sequences at single nucleotide resolution by generating amplicon sequence variants (ASVs). As the microbiome field moves from OTUs to higher resolution ASVs, there is a need for an in-depth and unbiased comparison of these novel “denoising” methods. In this study, we conduct a thorough comparison of three of the most widely-used denoising methods on mock, soil, and host-associated communities. We tested three different methods - DADA2, UNOISE3, and Deblur - on four mock communities and found that, although they produced similar microbial compositions based on relative abundance, the methods identified vastly different numbers of ASVs. Our analysis of a soil dataset also showed that the three methods were consistent in their per-sample compositions, resulting in only minor differences based on weighted UniFrac distances. However, DADA2 tended to find more ASVs than the other two methods when analyzing both the real soil data and two other host-associated datasets, suggesting that it could be better at finding rare organisms. The three tested methods differed significantly in their run times, with UNOISE3 running more than 1200 and 15 times faster than DADA2 and Deblur, respectively. Our results indicate that the choice of denoising method will depend on how much importance a researcher places on identifying rare ASVs, the availability of computational resources, and their willingness to support open-source or closed-source software.


2021 ◽  
Vol 12 ◽  
Author(s):  
Hannah E. Epstein ◽  
Alejandra Hernandez-Agreda ◽  
Samuel Starko ◽  
Julia K. Baum ◽  
Rebecca Vega Thurber

16S rRNA gene profiling (amplicon sequencing) is a popular technique for understanding host-associated and environmental microbial communities. Most protocols for sequencing amplicon libraries follow a standardized pipeline that can differ slightly depending on laboratory facility and user. Given that the same variable region of the 16S gene is targeted, it is generally accepted that sequencing output from differing protocols is comparable, and this assumption underlies our ability to identify universal patterns in microbial dynamics through meta-analyses. However, discrepant results from a combined 16S rRNA gene dataset prepared by two labs whose protocols differed only in DNA polymerase and sequencing platform led us to scrutinize the outputs and challenge the idea of confidently combining them for standard microbiome analysis. Using technical replicates of reef-building coral samples from two species, Montipora aequituberculata and Porites lobata, we evaluated the consistency of alpha and beta diversity metrics between data resulting from these highly similar protocols. While we found minimal variation in alpha diversity between platforms, significant differences were revealed with most beta diversity metrics, dependent on host species. These inconsistencies persisted following removal of low abundance taxa and when comparing across higher taxonomic levels, suggesting that bacterial community differences associated with sequencing protocol are likely to be context dependent and difficult to correct without extensive validation work. The results of this study encourage caution in the statistical comparison and interpretation of studies that combine rRNA gene sequence data from distinct protocols and point to a need for further work identifying mechanistic causes of these observed differences.


Author(s):  
O. Bogado Pascottini ◽  
J. F. W. Spricigo ◽  
S. J. Van Schyndel ◽  
B. Mion ◽  
J. Rousseau ◽  
...  

This study evaluated the effects of treatment with meloxicam (a non-steroidal anti-inflammatory drug), parity, and blood progesterone concentration on the dynamics of the uterine microbiome of clinically healthy postpartum dairy cows. Seven primiparous and 9 multiparous postpartum Holstein cows received meloxicam (0.5 mg/kg SC, n = 7 cows) once daily for 4 days (10 to 13 days in milk (DIM)) or were untreated (n = 9 cows). Endometrial cytology samples were collected by cytobrush at 10, 21, and 35 DIM, from which the metagenomic analysis was done using 16S rRNA gene sequence analysis. A radioimmunoassay was used to measure progesterone concentration in blood serum samples at 35 DIM and cows were classified as > 1 ng/mL (n = 10) or ≤ 1 ng/mL (n = 6). Alpha diversity for bacterial genera (Chao1, Shannon-Wiener, and Camargo’s evenness indices) was not affected by DIM, meloxicam treatment, parity, or progesterone category (P > 0.2). For beta diversity (genera level), principal coordinate analysis (Bray-Curtis) showed differences in microbiome between parity groups (P = 0.01). There was lower overall abundance of Anaerococcus, Bifidobacterium, Corynebacterium, Lactobacillus, Paracoccus, Staphylococcus, and Streptococcus and higher abundance of Bacillus, Fusobacterium, and Novosphingobium in primiparous than multiparous cows (P < 0.05); these patterns were consistent across sampling days. Bray-Curtis dissimilarity did not differ by DIM at sampling, meloxicam treatment, or progesterone category at 35 DIM (P > 0.5). In conclusion, uterine bacterial composition was not different at 10, 21, or 35 DIM, and meloxicam treatment or progesterone category did not affect uterine microbiota in clinically healthy postpartum dairy cows. Primiparous cows presented a different composition of uterine bacteria than multiparous cows. The differences in microbiome associated with parity might be attributable to changes that occur consequent to the first calving, but this hypothesis should be investigated further.


Agriculture ◽  
2020 ◽  
Vol 10 (4) ◽  
pp. 113 ◽  
Author(s):  
Catello Pane ◽  
Roberto Sorrentino ◽  
Riccardo Scotti ◽  
Marcella Molisso ◽  
Antonio Di Matteo ◽  
...  

Green waste composts are obtained from agricultural production chains; their suppressive properties are increasingly being developed as a promising biological control option in the management of soil-borne phytopathogens. The wide variety of microbes harbored in the compost ecological niches may regulate suppressive functions through underlying mechanisms that are not yet fully known. This study investigates alpha- and beta-diversity of the compost microbial communities, as indicators of the biological features. Our green composts displayed a differential pattern of suppressiveness over the two assayed pathosystems. Fungal and bacterial densities, as well as catabolic and enzyme functionalities, did not correlate with the compost control efficacy on cress disease. Differences in the suppressive potential of composts can be better predicted by the variations in the community-level physiological profiles, indicating that functional alpha-diversity is more predictive than that calculated from terminal restriction fragment length polymorphisms (T-RFLPs) targeting the 16S rRNA gene. However, beta-diversity described by nMDS analysis of the Bray–Curtis dissimilarity allowed for separating compost samples into distinct functionally meaningful clusters and indicated that suppressiveness could be regulated by selected groups of microorganisms as major deterministic mechanisms. This study contributes to identifying new characterization procedures suitable for the suppressive green compost chain.


2018 ◽  
Vol 98 (3) ◽  
pp. 498-507 ◽  
Author(s):  
Tadele G. Kiros ◽  
Eric Pinloche ◽  
Romain D’Inca ◽  
Eric Auclair ◽  
Andrew Van Kessel

The considerable animal-to-animal variation in microbial profiles is a challenge in elucidating the role of gut microbiota in host metabolism. The main purpose of this study was, therefore, to develop a pig model with reduced animal-to-animal variation in gut microbial profile. Twelve piglets from four sows were reared conventionally and 12 piglets from four sows were reared artificially in high-efficiency particulate air (HEPA) filtered isolators. All isolator-reared piglets were given an artificial colostrum formula containing the combined fecal material from all eight sows. All piglets were killed at 21 d of age and intestinal contents were subjected to 16S rRNA gene-based terminal restriction fragment length polymorphism (T-RFLP) profiling. Resulting T-RFLP profiles clustered into two distinct groups representing the two treatment groups. Furthermore, Bray–Curtis dissimilarity distance values and Dice similarity indices showed reduced beta diversity in isolator-reared pigs, indicating that animal-to-animal variation was reduced in isolator-reared compared to conventional piglets. However, surprisingly, increased alpha diversity was observed in isolator-reared piglets compared with conventional piglets. In conclusion, the study demonstrated that rearing of piglets under conditions of controlled environment reduced animal-to-animal variation in the hindgut microbiota while paradoxically increasing within-animal microbial diversity. Isolator rearing may be useful as a model to improve detection of treatment effects on gut microbiota.


2018 ◽  
Author(s):  
Robert C. Edgar ◽  
Henrik Flyvbjerg

Next-generation sequencing of marker genes such as 16S ribosomal RNA is widely used to survey microbial communities. The abundance distribution (AD) of Operational Taxonomic Units (OTUs) in a sample is typically summarized by alpha diversity metrics, e.g. richness and entropy, discarding information about the AD shape. In this work, we describe octave plots, histograms which visualize the shape of microbial ADs by binning on a logarithmic scale with base 2. Optionally, histogram bars are colored to indicate possible spurious OTUs due to sequence error and cross-talk. Octave plots enable assessment of (a) the shape and completeness of the distribution, (b) the effects of noise on measured diversity, (c) whether low-abundance OTUs should be discarded, (d) whether alpha diversity metrics and estimators are reliable, and (e) the additional sampling effort (i.e., read depth) required to obtain a complete census of the community. The utility of octave plots is illustrated in a re-analysis of a prostate cancer study showing that the reported core microbiome is most likely an artifact of experimental error.
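The base-2 binning behind an octave plot can be sketched in a few lines of Python. The bin convention used here (bin k holds OTUs whose abundance lies between 2^k and 2^(k+1) - 1) is an assumption inferred from the description of logarithmic binning with base 2, and the OTU sizes are hypothetical. The function name `octave_bins` is illustrative and does not come from any published tool.

```python
from collections import Counter
from typing import Iterable, List


def octave_bins(abundances: Iterable[int]) -> List[int]:
    """Count OTUs per base-2 bin; index k covers abundances 2**k .. 2**(k+1) - 1."""
    counts: Counter = Counter()
    for n in abundances:
        if n <= 0:
            continue  # absent OTUs do not belong to any bin
        # For a positive integer, bit_length() - 1 equals floor(log2(n)) exactly.
        counts[n.bit_length() - 1] += 1
    if not counts:
        return []
    return [counts.get(k, 0) for k in range(max(counts) + 1)]


# Hypothetical OTU abundances (read counts) for one sample.
otu_sizes = [1, 1, 1, 2, 3, 5, 8, 20, 100]

if __name__ == "__main__":
    for k, n_otus in enumerate(octave_bins(otu_sizes)):
        low, high = 2 ** k, 2 ** (k + 1) - 1
        print(f"{low:>4}-{high:<4} {'#' * n_otus}")
```

Using integer bit lengths instead of floating-point logarithms avoids rounding problems at exact powers of two. Coloring bars by suspected spurious OTUs, as the abstract mentions, is outside the scope of this sketch.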


PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0233943
Author(s):  
O. Bogado Pascottini ◽  
J. F. W. Spricigo ◽  
S. J. Van Schyndel ◽  
B. Mion ◽  
J. Rousseau ◽  
...  

This study evaluated the effects of treatment with meloxicam (a non-steroidal anti-inflammatory drug), parity, and blood progesterone concentration on the dynamics of the uterine microbiota of 16 clinically healthy postpartum dairy cows. Seven primiparous and 9 multiparous postpartum Holstein cows either received meloxicam (0.5 mg/kg SC, n = 7 cows) once daily for 4 days (10 to 13 days in milk (DIM)) or were untreated (n = 9 cows). Endometrial cytology samples were collected by cytobrush at 10, 21, and 35 DIM, from which the microbiota analysis was conducted using 16S rRNA gene sequence analysis. A radioimmunoassay was used to measure progesterone concentration in blood serum samples at 35 DIM and cows were classified as > 1 ng/mL (n = 10) or ≤ 1 ng/mL (n = 6). Alpha diversity for bacterial genera (Chao1, Shannon-Wiener, and Camargo’s evenness indices) was not affected by DIM, meloxicam treatment, parity, or progesterone category. For beta diversity (genera level), principal coordinate analysis (Bray-Curtis) showed differences in microbiota between parity groups. At the phylum level, the relative abundance of Actinobacteria was greater in primiparous than multiparous cows. At the genus level, there was lesser relative abundance of Bifidobacterium, Lactobacillus, Neisseriaceae, Paracoccus, Staphylococcus, and Streptococcus and greater relative abundance of Bacillus and Fusobacterium in primiparous than multiparous cows. Bray-Curtis dissimilarity did not differ by DIM at sampling, meloxicam treatment, or progesterone category at 35 DIM. In conclusion, uterine bacterial composition was not different at 10, 21, or 35 DIM, and meloxicam treatment or progesterone category did not affect the uterine microbiota in clinically healthy postpartum dairy cows. Primiparous cows presented a different composition of uterine bacteria than multiparous cows. The differences in microbiota associated with parity might be attributable to changes that occur consequent to the first calving, but this hypothesis should be investigated further.


2018 ◽  
Author(s):  
Jiarong Guo ◽  
James R. Cole ◽  
C. Titus Brown ◽  
James M. Tiedje

Many conserved protein-coding core genes are single copy and evolve faster, and thus are more resolving phylogenetic markers than the standard SSU rRNA gene, but their use has been precluded by the lack of universal primers. Recent advances in gene-targeted assembly methods for large shotgun metagenomes make their use feasible. To evaluate this approach, we compared the variation of two single copy ribosomal protein genes, rplB and rpsC, with the SSU rRNA gene for all completed bacterial genomes in NCBI RefSeq. As expected, among pairwise comparisons of all species that belong to the same genus, 94.9% and 91.0% of the pairs of rplB and rpsC, respectively, showed more variation than did their SSU rRNA gene sequences. We used a gene-targeted assembler, Xander, to assemble rplB and rpsC from shotgun metagenomic data from rhizosphere samples of three crops: corn (annual), and Miscanthus and switchgrass (both perennials). Both protein-coding genes separated all three communities whereas the SSU rRNA gene could only separate the annual from the two perennial communities in ordination analyses. Furthermore, assembled rplB and rpsC yielded significantly higher numbers of OTUs (alpha diversity) than the SSU rRNA gene. These results confirm that these faster-evolving marker genes offer increased resolution for comparative microbiome studies.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Robert C. Kaplan ◽  
Zheng Wang ◽  
Mykhaylo Usyk ◽  
Daniela Sotres-Alvarez ◽  
Martha L. Daviglus ◽  
...  

Background: Hispanics living in the USA may have unrecognized potential birthplace and lifestyle influences on the gut microbiome. We report a cross-sectional analysis of 1674 participants from four centers of the Hispanic Community Health Study/Study of Latinos (HCHS/SOL), aged 18 to 74 years old at recruitment. Results: Amplicon sequencing of 16S rRNA gene V4 and fungal ITS1 fragments from self-collected stool samples indicate that the host microbiome is determined by sociodemographic and migration-related variables. Those who relocate from Latin America to the USA at an early age have reductions in Prevotella to Bacteroides ratios that persist across the life course. Shannon index of alpha diversity in fungi and bacteria is low in those who relocate to the USA in early life. In contrast, those who relocate to the USA during adulthood, over 45 years old, have high bacterial and fungal diversity and high Prevotella to Bacteroides ratios, compared to USA-born and childhood arrivals. Low bacterial diversity is associated in turn with obesity. Contrasting with prior studies, our study of the Latino population shows increasing Prevotella to Bacteroides ratio with greater obesity. Taxa within Acidaminococcus, Megasphaera, Ruminococcaceae, Coriobacteriaceae, Clostridiales, Christensenellaceae, YS2 (Cyanobacteria), and Victivallaceae are significantly associated with both obesity and earlier exposure to the USA, while Oscillospira and Anaerotruncus show paradoxical associations with both obesity and late-life introduction to the USA. Conclusions: Our analysis of the gut microbiome of Latinos demonstrates unique features that might be responsible for health disparities affecting Hispanics living in the USA.
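For readers unfamiliar with the Shannon index of alpha diversity, a minimal Python sketch follows. It assumes the natural-logarithm form H = -Σ p_i ln p_i; the abstract does not state which logarithm base the study used, and the read counts below are hypothetical.

```python
import math
from typing import Sequence


def shannon_index(counts: Sequence[float]) -> float:
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical read counts per taxon for two stool samples.
even_sample = [25, 25, 25, 25]    # four equally abundant taxa
skewed_sample = [97, 1, 1, 1]     # one taxon dominates

if __name__ == "__main__":
    print("even:  ", shannon_index(even_sample))
    print("skewed:", shannon_index(skewed_sample))
```

Both samples contain four taxa, but the even sample reaches the maximum possible value for four taxa (ln 4), while the skewed sample scores much lower. The index therefore reflects evenness as well as richness.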

