A high-resolution pipeline for 16S-sequencing identifies bacterial strains in human microbiome

Mapping Intimacies ◽

10.1101/565572 ◽

2019 ◽

Cited By ~ 1

Author(s):

Igor Segota ◽

Tao Long

Keyword(s):

Bacterial Species ◽

Human Microbiome ◽

Amplicon Sequencing ◽

R Package ◽

Strain Level ◽

Sequencing Data ◽

Bacterial Strains ◽

16S Sequencing ◽

16S Amplicon Sequencing ◽

Sequencing Data Analysis

We developed a High-resolution Microbial Analysis Pipeline (HiMAP) for 16S amplicon sequencing data analysis. It aims at species- or strain-level identification of bacteria from the human microbiome, to enable experimental validation of the causal effects of associated bacterial strains on health and disease. HiMAP achieved higher accuracy than other pipelines in identifying species in a human microbiome mock community. HiMAP identified the majority of the species detected by whole-genome shotgun sequencing using MetaPhlAn2, with strain-level resolution wherever possible, and reported comparable relative abundances. HiMAP is an open-source R package available at https://github.com/taolonglab/himap.


Meta-Apo improves accuracy of 16S-amplicon-based prediction of microbiome function

BMC Genomics ◽

10.1186/s12864-020-07307-1 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Gongchao Jing ◽

Yufeng Zhang ◽

Wenzhi Cui ◽

Lu Liu ◽

Jian Xu ◽

...

Keyword(s):

16S rRNA ◽

Large Scale ◽

Low Cost ◽

Human Microbiome ◽

Amplicon Sequencing ◽

Training Sample ◽

rRNA Gene ◽

16S Amplicon Sequencing ◽

Cross Platform ◽

Functional Profiles

Background: Because they cost much less in experiment and computation than metagenomic whole-genome sequencing (WGS), 16S rRNA gene amplicons have been widely used for predicting the functional profiles of microbiomes, via software tools such as PICRUSt 2. However, due to potential PCR bias and gene profile variation among phylogenetically related genomes, functional profiles predicted from 16S amplicons may deviate from WGS-derived ones, producing misleading results. Results: Here we present Meta-Apo, which greatly reduces or even eliminates such deviation, and thus yields much more consistent diversity patterns between the two approaches. Tests of Meta-Apo on >5000 16S rRNA amplicon human microbiome samples from 4 body sites showed that the deviation between the two strategies is significantly reduced using only 15 WGS-amplicon training sample pairs. Moreover, Meta-Apo enables cross-platform functional comparison between WGS and amplicon samples, and thus greatly improves 16S-based microbiome diagnosis; e.g., the accuracy of gingivitis diagnosis via 16S-derived functional profiles was elevated from 65 to 95% by WGS-based classification. Therefore, at the low cost of 16S-amplicon sequencing, Meta-Apo can produce a reliable, high-resolution view of microbiome function equivalent to that offered by shotgun WGS. Conclusions: This suggests that large-scale, function-oriented microbiome sequencing projects can probably benefit from the lower cost of the 16S-amplicon strategy without sacrificing the precision in functional reconstruction that otherwise requires WGS. An optimized C++ implementation of Meta-Apo is available on GitHub (https://github.com/qibebt-bioinfo/meta-apo) under a GNU GPL license. It takes the functional profiles of a few paired WGS:16S-amplicon samples as training data, and outputs the calibrated functional profiles for the much larger number of 16S-amplicon samples.
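
The idea of calibrating amplicon-predicted profiles against a few paired WGS profiles can be illustrated with a minimal Python sketch. This is not Meta-Apo's actual algorithm, which the abstract does not describe; it assumes a simple per-function scaling factor (mean WGS abundance divided by mean amplicon-predicted abundance over the training pairs). The names `fit_scaling`, `calibrate`, and the `pseudo` pseudocount are hypothetical.

```python
from typing import Dict, List, Tuple

Profile = Dict[str, float]  # function ID (e.g. a KO identifier) -> relative abundance


def normalize(profile: Profile) -> Profile:
    """Scale abundances to sum to 1; an all-zero profile is returned unchanged."""
    total = sum(profile.values())
    if total <= 0:
        return dict(profile)
    return {f: v / total for f, v in profile.items()}


def fit_scaling(pairs: List[Tuple[Profile, Profile]], pseudo: float = 1e-6) -> Dict[str, float]:
    """Learn one scaling factor per function from (WGS, amplicon) training pairs.

    Assumption: factor = mean WGS abundance / mean amplicon abundance,
    with a small pseudocount to avoid division by zero.
    """
    if not pairs:
        raise ValueError("at least one training pair is required")
    functions = set()
    for wgs, amp in pairs:
        functions |= wgs.keys() | amp.keys()
    factors = {}
    for f in functions:
        wgs_mean = sum(w.get(f, 0.0) for w, _ in pairs) / len(pairs)
        amp_mean = sum(a.get(f, 0.0) for _, a in pairs) / len(pairs)
        factors[f] = (wgs_mean + pseudo) / (amp_mean + pseudo)
    return factors


def calibrate(profile: Profile, factors: Dict[str, float]) -> Profile:
    """Apply the learned factors to a new amplicon profile and renormalize."""
    scaled = {f: v * factors.get(f, 1.0) for f, v in profile.items()}
    return normalize(scaled)


# Toy training data: amplicon prediction underestimates K1 and overestimates K2.
training = [
    ({"K1": 0.6, "K2": 0.4}, {"K1": 0.3, "K2": 0.7}),
    ({"K1": 0.6, "K2": 0.4}, {"K1": 0.3, "K2": 0.7}),
]
factors = fit_scaling(training)
calibrated = calibrate({"K1": 0.3, "K2": 0.7}, factors)
```

In this toy case the calibrated profile moves back toward the WGS values (about 0.6 for K1 and 0.4 for K2). A real calibration would need many functions, held-out validation, and a model chosen from the paper itself.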


Gut Microbiota as a Source of Uremic Toxins

International Journal of Molecular Sciences ◽

10.3390/ijms23010483 ◽

2022 ◽

Vol 23 (1) ◽

pp. 483

Author(s):

Vasily A. Popkov ◽

Anastasia A. Zharikova ◽

Evgenia A. Demchenko ◽

Nadezda V. Andrianova ◽

Dmitry B. Zorov ◽

...

Keyword(s):

Gut Microbiota ◽

Bacterial Species ◽

Human Microbiome ◽

Human Microbiome Project ◽

Uremic Toxins ◽

Enzymatic Reactions ◽

Bacterial Strains ◽

Excretory Function ◽

High Concentrations ◽

Stage Renal Disease

Uremic retention solutes are compounds that accumulate in the blood when kidney excretory function is impaired. Some of these compounds are toxic at high concentrations and are usually known as “uremic toxins”. The cumulative detrimental effect of uremic toxins results in numerous health problems and eventually mortality during acute or chronic uremia, especially in end-stage renal disease. More than 100 different solutes increase during uremia; however, the exact origin of most of them is still debated. There are three main sources of such compounds: exogenous ones are consumed with food, whereas endogenous ones are produced by host metabolism or by symbiotic microbiota metabolism. In this article, we identify uremic retention solutes presumably of gut microbiota origin. We used database analysis to obtain data on the enzymatic reactions in bacteria and in humans that potentially yield uremic retention solutes, and hence to determine which toxins could be synthesized by bacteria residing in the human gut. We selected biochemical pathways that result in the synthesis of uremic retention solutes and are related to specific bacterial strains, and revealed links between toxin concentration in uremia and the proportion of different bacterial species that can synthesize the toxin. The detected bacterial species essential for the synthesis of uremic retention solutes were then verified using the Human Microbiome Project database. Moreover, we defined the relative abundance of human toxin-generating enzymes as well as the possibility of the synthesis of a particular toxin by human metabolism. Our study presents a novel bioinformatics approach for elucidating the origin of both uremic retention solutes and uremic toxins and for identifying the most likely human microbiome producers of toxins, which can be targeted for the therapy of adverse consequences of uremia.


microeco: An R package for data mining in microbial community ecology

FEMS Microbiology Ecology ◽

10.1093/femsec/fiaa255 ◽

2020 ◽

Author(s):

Chi Liu ◽

Yaoming Cui ◽

Xiangzhen Li ◽

Minjie Yao

Keyword(s):

Data Mining ◽

Microbial Community ◽

Community Ecology ◽

Amplicon Sequencing ◽

R Package ◽

Environmental Data ◽

Venn Diagram ◽

Diversity Analysis ◽

Sequencing Data ◽

Microbial Community Ecology

A large amount of sequencing data is produced in microbial community ecology studies using high-throughput sequencing, especially amplicon-sequencing-based community data. After the initial bioinformatic analysis of amplicon sequencing data, performing the subsequent statistics and data mining based on the operational taxonomic unit and taxonomic assignment tables is still complicated and time-consuming. To address this problem, we present an integrated R package, ‘microeco’, as an analysis pipeline for treating microbial community and environmental data. This package was developed based on the R6 class system and combines a series of commonly used and advanced approaches in microbial community ecology research. The package includes classes for data preprocessing, taxa abundance plotting, Venn diagrams, alpha diversity analysis, beta diversity analysis, differential abundance testing and indicator taxon analysis, environmental data analysis, null model analysis, network analysis and functional analysis. Each class is designed to provide a set of approaches that are easily accessible to users. Compared with other R packages in the microbial ecology field, the microeco package is fast, flexible and modular, and provides powerful and convenient tools for researchers. The microeco package can be installed from CRAN (The Comprehensive R Archive Network) or GitHub (https://github.com/ChiLiubio/microeco).


Re-Analysis of 16S Amplicon Sequencing Data Reveals Soil Microbial Population Shifts in Rice Fields under Drought Condition

Rice ◽

10.1186/s12284-020-00403-6 ◽

2020 ◽

Vol 13 (1) ◽

Author(s):

Seok-Won Jang ◽

Myeong-Hyun Yoou ◽

Woo-Jong Hong ◽

Yeon-Ju Kim ◽

Eun-Jin Lee ◽

...

Keyword(s):

Microbial Population ◽

Amplicon Sequencing ◽

Rice Fields ◽

Drought Condition ◽

Sequencing Data ◽

Soil Microbial ◽

16S Amplicon Sequencing ◽

Soil Microbial Population


Reference-based error correction of amplicon sequencing data from synthetic communities

10.1101/2021.01.15.426834 ◽

2021 ◽

Author(s):

Pengfan Zhang ◽

Stijn Spaepen ◽

Yang Bai ◽

Stephane Hacquard ◽

Ruben Garrido-Oter

Keyword(s):

Microbial Communities ◽

Amplicon Sequencing ◽

R Package ◽

Fungal Communities ◽

Polymorphic Variation ◽

Sequencing Data ◽

Extensive Evaluation ◽

Culture Independent ◽

Reference Sequences ◽

Synthetic Microbial Communities

Motivation: Synthetic microbial communities (SynComs) constitute an emergent and powerful tool in biological, biomedical, and biotechnological research. Despite recent advances in algorithms for analysis of culture-independent amplicon sequencing data from microbial communities, there is a lack of tools specifically designed for analysing SynCom data, where reference sequences for each strain are available. Results: Here we present Rbec, a tool designed for analysing SynCom data that outperforms current methods by accurately correcting errors in amplicon sequences and identifying intra-strain polymorphic variation. Extensive evaluation using mock bacterial and fungal communities shows that our tool performs robustly for samples of varying complexity, diversity, and sequencing depth. Further, Rbec also allows accurate detection of contamination in SynCom experiments. Availability: Rbec is freely available as an open-source R package and can be downloaded at: https://github.com/PengfanZhang/Microbiome.


MALT: Fast alignment and analysis of metagenomic DNA sequence data applied to the Tyrolean Iceman

10.1101/050559 ◽

2016 ◽

Cited By ~ 38

Author(s):

Alexander Herbig ◽

Frank Maixner ◽

Kirsten I. Bos ◽

Albert Zink ◽

Johannes Krause ◽

...

Keyword(s):

Large Scale ◽

Sequence Data ◽

Bacterial Species ◽

Human Microbiome ◽

Metagenomic Analysis ◽

Sequencing Data ◽

Metagenomic DNA ◽

Alignment Procedure ◽

Taxonomic Profile ◽

Tyrolean Iceman

Modern next-generation sequencing technologies produce vast amounts of data in the context of large-scale metagenomic studies, in which complex microbial communities can be reconstructed to an unprecedented level of detail. The most prominent examples are human microbiome studies that correlate the bacterial taxonomic profile with specific physiological conditions or diseases. To perform these analyses, high-throughput computational tools are needed that can process these data within a short time while preserving a high level of sensitivity and specificity. Here we present MALT (MEGAN ALignment Tool), a program for the ultrafast alignment and analysis of metagenomic DNA sequencing data. MALT processes hundreds of millions of sequencing reads within only a few hours. In addition to the alignment procedure, MALT implements a taxonomic binning algorithm that is able to specifically assign reads to bacterial species. Its tight integration with the interactive metagenomic analysis software MEGAN allows for visualization and further analyses of results. We demonstrate MALT by its application to the metagenomic analysis of two ancient microbiomes from oral cavity and lung samples of the 5,300-year-old Tyrolean Iceman. Despite the strong environmental background, MALT is able to pick up the weak signal of the original microbiomes and identifies multiple species that are typical representatives of the respective host environment.


RETA: An R package for whole exome and targeted region sequencing data analysis

10.1101/121384 ◽

2017 ◽

Author(s):

Mengbiao Guo ◽

Jing Yang ◽

Yu lung Lau ◽

Wanling Yang

Keyword(s):

Data Analysis ◽

R Package ◽

Targeted Sequencing ◽

Sequencing Data ◽

Comprehensive Understanding ◽

Mendelian Diseases ◽

Whole Exome ◽

One Stop ◽

Sequencing Data Analysis ◽

Advanced Visualization

Whole exome and targeted sequencing have been playing a major role in the diagnosis of Mendelian diseases, but analysis of these data involves many complicated tools, and a comprehensive understanding of the analysis results is difficult. Here, we report RETA, an R package that provides a one-stop analysis of these data and a comprehensive, interactive and easy-to-understand report with many advanced visualization features. It helps clinicians and scientists alike to better analyze and interpret this type of sequencing data for disease diagnosis. Availability and implementation: https://github.com/reta-s/reta/ Contact: [email protected]


Identifying prognostic pairwise relationships among bacterial species in microbiome studies

PLoS Computational Biology ◽

10.1371/journal.pcbi.1009501 ◽

2021 ◽

Vol 17 (11) ◽

pp. e1009501

Author(s):

Sean M. Devlin ◽

Axel Martin ◽

Irina Ostrovnaya

Keyword(s):

Bacterial Species ◽

Human Microbiome ◽

16S rRNA Gene Sequencing ◽

R Package ◽

Major Influence ◽

rRNA Gene ◽

Data Set ◽

Cancer Data ◽

Selection Framework ◽

rRNA Gene Sequencing

In recent literature, the human microbiome has been shown to have a major influence on human health. To investigate this impact, scientists study the composition and abundance of bacterial species, commonly using 16S rRNA gene sequencing, among patients with and without a disease or condition. Methods for such investigations to date have focused on the association between an individual bacterium and an outcome, and higher-order pairwise relationships or interactions among bacteria are often avoided due to the substantial increase in dimension and the potential for spurious correlations. However, overlooking such relationships ignores the environment of the microbiome, where there is dynamic cooperation and competition among bacteria. We present a method for identifying and ranking pairs of bacteria that have a differential dichotomized relationship across outcomes. Our approach, implemented in the R package PairSeek, uses the stability selection framework with data-driven dichotomized forms of the pairwise relationships. We illustrate the properties of the proposed method using a published oral cancer data set and a simulation study.


MicroNiche: an R package for assessing microbial niche breadth and overlap from amplicon sequencing data

FEMS Microbiology Ecology ◽

10.1093/femsec/fiaa131 ◽

2020 ◽

Vol 96 (8) ◽

Author(s):

D R Finn ◽

J Yu ◽

Z E Ilhan ◽

V M C Fernandes ◽

C R Penton ◽

...

Keyword(s):

In Silico ◽

Niche Breadth ◽

Biological Soil Crust ◽

A Priori ◽

Amplicon Sequencing ◽

R Package ◽

Training Data ◽

Sequencing Data ◽

Human Gut ◽

Limit Of Quantification

Niche is a fundamental concept in ecology. It integrates the sum of biotic and abiotic environmental requirements that determines a taxon's distribution. Microbiologists currently lack quantitative approaches to address niche-related hypotheses. We tested four approaches for the quantification of niche breadth and overlap of taxa in amplicon sequencing datasets, with the goal of determining generalists, specialists and environment-dependent distributions of community members. We applied these indices to in silico training datasets first, and then to real human gut and desert biological soil crust (biocrust) case studies, assessing the agreement of the indices with previous findings. Implementation of each approach successfully identified a priori conditions within in silico training data, and we found that by including a limit of quantification based on species rank, one could identify taxa falsely classified as specialists because of their low, sparse counts. Analysis of the human gut study offered quantitative support for Bacilli, Gammaproteobacteria and Fusobacteria specialists enriched after bariatric surgery. We could quantitatively characterise differential niche distributions of cyanobacterial taxa with respect to precipitation gradients in biocrusts. We conclude that these approaches, made publicly available as an R package (MicroNiche), represent useful tools to assess microbial environment-taxon and taxon-taxon relationships in a quantitative manner.
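
As a rough illustration of what a niche breadth index computes, the Python sketch below implements Levins' classical index, B = 1 / Σ p_j², where p_j is the share of a taxon's total abundance found in environment j. The abstract does not name its four approaches, so treating Levins' index as one of them is an assumption. The `min_total` cutoff is a hypothetical stand-in for a limit of quantification; the abstract's version is based on species rank, which is not reproduced here.

```python
from typing import Optional, Sequence


def levins_breadth(counts: Sequence[float], min_total: float = 0.0) -> Optional[float]:
    """Levins' niche breadth for one taxon across environments.

    Returns None when the taxon has no counts or falls below `min_total`
    (a hypothetical limit of quantification), since sparse taxa can
    otherwise look like specialists by chance.
    """
    total = sum(counts)
    if total <= 0 or total < min_total:
        return None
    proportions = [c / total for c in counts]
    return 1.0 / sum(p * p for p in proportions)


generalist = levins_breadth([10, 10, 10, 10])    # evenly spread over 4 environments
specialist = levins_breadth([40, 0, 0, 0])       # found in a single environment
too_sparse = levins_breadth([1, 0, 0, 0], min_total=5)
```

The index ranges from 1 (a taxon confined to one environment) to the number of environments (a taxon spread evenly across all of them), so the two toy taxa sit at the extremes for four environments.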


Implications of error-prone long-read whole-genome shotgun sequencing on characterizing reference microbiomes

10.1101/2020.03.05.978866 ◽

2020 ◽

Cited By ~ 1

Author(s):

Yu Hu ◽

Li Fang ◽

Christopher Nicholson ◽

Kai Wang

Keyword(s):

Human Microbiome ◽

Human Microbiome Project ◽

Error Rates ◽

Tool Development ◽

Sequencing Data ◽

Bacterial Strains ◽

Microbial Genomes ◽

Oxford Nanopore ◽

Long Read ◽

Adequate Coverage

Long-read sequencing techniques, such as the Oxford Nanopore Technology, can generate reads that are tens of kilobases in length, and are therefore particularly relevant for microbiome studies. However, because their per-base error rates are higher than those of typical short-read sequencing, the application of long-read sequencing to microbiomes remains largely unexplored. Here we deeply sequenced two human microbiota mock community samples (HM-276D and HM-277D) from the Human Microbiome Project. We showed that assembly programs consistently achieved high accuracy (~99%) and completeness (~99%) for bacterial strains with adequate coverage. We also found that long-read sequencing provides accurate estimates of species-level abundance (R=0.94 for 20 bacteria with abundance ranging from 0.005% to 64%). Our results demonstrate the feasibility of characterizing complete microbial genomes and populations from error-prone Nanopore sequencing data, but also highlight necessary bioinformatics improvements for future metagenomics tool development.
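
An agreement statistic such as the reported R can be computed with a Pearson correlation between expected and estimated abundances. The Python sketch below shows the calculation on made-up numbers; the abstract does not say whether the authors used Pearson's coefficient or transformed the abundances first, so both points are assumptions, and `pearson_r` is a hypothetical helper name.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Made-up abundances (percent), for illustration only.
expected = [64.0, 20.0, 10.0, 5.0, 1.0]
estimated = [60.0, 22.0, 11.0, 6.0, 1.0]
r = pearson_r(expected, estimated)
```

When abundances span several orders of magnitude, as in the range quoted above, the most abundant species dominate an untransformed correlation, so a log transform before correlating is a common alternative.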
