GMPR: A novel normalization method for microbiome sequencing data

2017 ◽  
Author(s):  
Li Chen ◽  
Jun Chen

Summary: Normalization is the first and a critical step in microbiome sequencing (microbiome-Seq) data analysis to account for variable library sizes. Though RNA-Seq based normalization methods have been adapted for microbiome-Seq data, they fail to consider the unique characteristics of microbiome-Seq data, which contain a vast number of zeros due to the physical absence or undersampling of the microbes. Normalization methods that specifically address the zero inflation remain largely undeveloped. Here we propose GMPR - a simple but effective normalization method - for zero-inflated sequencing data such as microbiome-Seq data. Simulation studies and analyses of 38 real gut microbiome datasets from 16S rRNA gene amplicon sequencing demonstrated the superior performance of the proposed method. Availability and Implementation: ‘GMPR’ is implemented in R and available at https://github.com/jchen1981/GMPR. Supplementary Information: Supplementary data are available at Bioinformatics [email protected]

2017 ◽  
Author(s):  
Li Chen ◽  
James Reeve ◽  
Lujun Zhang ◽  
Shengbing Huang ◽  
Jun Chen

Normalization is the first critical step in microbiome sequencing data analysis used to account for variable library sizes. Current RNA-Seq based normalization methods that have been adapted for microbiome data fail to consider the unique characteristics of microbiome data, which contain a vast number of zeros due to the physical absence or under-sampling of the microbes. Normalization methods that specifically address the zero inflation remain largely undeveloped. Here we propose GMPR - a simple but effective normalization method - for zero-inflated sequencing data such as microbiome data. Simulation studies and real datasets analyses demonstrate that the proposed method is more robust than competing methods, leading to more powerful detection of differentially abundant taxa and higher reproducibility of the relative abundances of taxa.


2018 ◽  
Author(s):  
Li Chen ◽  
James Reeve ◽  
Lujun Zhang ◽  
Shengbing Huang ◽  
Xuefeng Wang ◽  
...  

Normalization is the first critical step in microbiome sequencing data analysis used to account for variable library sizes. Current RNA-Seq based normalization methods that have been adapted for microbiome data fail to consider the unique characteristics of microbiome data, which contain a vast number of zeros due to the physical absence or under-sampling of the microbes. Normalization methods that specifically address the zero inflation remain largely undeveloped. Here we propose GMPR - a simple but effective normalization method - for zero-inflated sequencing data such as microbiome data. Simulation studies and real datasets analyses demonstrate that the proposed method is more robust than competing methods, leading to more powerful detection of differentially abundant taxa and higher reproducibility of the relative abundances of taxa.


2017 ◽  
Author(s):  
Li Chen ◽  
James Reeve ◽  
Lujun Zhang ◽  
Shengbing Huang ◽  
Jun Chen

Normalization is the first critical step in microbiome sequencing data analysis used to account for variable library sizes. Current RNA-Seq based normalization methods that have been adapted for microbiome data fail to consider the unique characteristics of microbiome data, which contain a vast number of zeros due to the physical absence or under-sampling of the microbes. Normalization methods that specifically address the zero inflation remain largely undeveloped. Here we propose GMPR - a simple but effective normalization method - for zero-inflated sequencing data such as microbiome data. Simulation studies and real datasets analyses demonstrate that the proposed method is more robust than competing methods, leading to more powerful detection of differentially abundant taxa and higher reproducibility of the relative abundances of taxa.


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e4600 ◽  
Author(s):  
Li Chen ◽  
James Reeve ◽  
Lujun Zhang ◽  
Shengbing Huang ◽  
Xuefeng Wang ◽  
...  

Normalization is the first critical step in microbiome sequencing data analysis used to account for variable library sizes. Current RNA-Seq based normalization methods that have been adapted for microbiome data fail to consider the unique characteristics of microbiome data, which contain a vast number of zeros due to the physical absence or under-sampling of the microbes. Normalization methods that specifically address the zero-inflation remain largely undeveloped. Here we propose geometric mean of pairwise ratios—a simple but effective normalization method—for zero-inflated sequencing data such as microbiome data. Simulation studies and real datasets analyses demonstrate that the proposed method is more robust than competing methods, leading to more powerful detection of differentially abundant taxa and higher reproducibility of the relative abundances of taxa.
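
The geometric mean of pairwise ratios named in the abstract above can be illustrated with a minimal sketch. It assumes the commonly described two-step form of the calculation: for each pair of samples, take the median of the count ratios over taxa that are nonzero in both samples, then take the geometric mean of those medians as the sample's size factor. The function name `gmpr_size_factors`, the toy count matrix, and the choice to skip pairs that share no nonzero taxa are assumptions made for illustration; the authors' R package may handle these details differently.

```python
import math
from statistics import median


def gmpr_size_factors(counts):
    """Return one size factor per sample.

    counts: list of samples, each a list of taxon counts in the same taxon order.
    """
    n = len(counts)
    factors = []
    for i in range(n):
        log_medians = []
        for j in range(n):
            # Ratios are taken only over taxa that are nonzero in both samples,
            # so zeros never enter a ratio.
            ratios = [a / b for a, b in zip(counts[i], counts[j]) if a > 0 and b > 0]
            if ratios:  # assumption: pairs with no shared taxa are skipped
                log_medians.append(math.log(median(ratios)))
        if log_medians:
            # Geometric mean of the pairwise median ratios.
            factors.append(math.exp(sum(log_medians) / len(log_medians)))
        else:
            factors.append(float("nan"))
    return factors


# Hypothetical toy data: three samples, four taxa.
samples = [
    [10, 0, 4, 6],
    [20, 2, 8, 0],
    [5, 1, 0, 3],
]
size_factors = gmpr_size_factors(samples)
# Normalized counts are the raw counts divided by the sample's size factor.
normalized = [[c / s for c in row] for row, s in zip(samples, size_factors)]
```

Because each pairwise comparison uses only the taxa shared by the two samples, a sample can still receive a size factor even when few taxa are present across all samples, which is the situation zero-inflated data create.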


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Oksana Kutsyr ◽  
Lucía Maestre-Carballa ◽  
Mónica Lluesma-Gomez ◽  
Manuel Martinez-Garcia ◽  
Nicolás Cuenca ◽  
...  

The gut microbiome is known to influence the pathogenesis and progression of neurodegenerative diseases. However, there has been relatively little focus on the implications of the gut microbiome in retinal diseases such as retinitis pigmentosa (RP). Here, we investigated changes in gut microbiome composition linked to RP by assessing both retinal degeneration and the gut microbiome in the rd10 mouse model of RP as compared to control C57BL/6J mice. In rd10 mice, retinal responsiveness to flashlight stimuli and visual acuity had deteriorated relative to those observed in age-matched control mice. This functional decline in dystrophic animals was accompanied by photoreceptor loss, morphologic anomalies in photoreceptor cells and retinal reactive gliosis. Furthermore, 16S rRNA gene amplicon sequencing data showed a microbial gut dysbiosis, with differences in alpha and beta diversity at the genus, species and amplicon sequence variant (ASV) levels between dystrophic and control mice. Remarkably, four ASVs that are fairly common in the healthy gut microbiome, belonging to Rikenella spp., Muribaculaceae spp., Prevotellaceae UCG-001 spp., and Bacilli spp., were absent in the gut microbiome of retinal disease mice, while Bacteroides caecimuris was significantly enriched in mice with RP. The results indicate that retinal degenerative changes in RP are linked to relevant gut microbiome changes. The findings suggest that microbiome shifting could be considered a potential biomarker and therapeutic target for retinal degenerative diseases.


2018 ◽  
Author(s):  
Tamsen Dunn ◽  
Gwenn Berry ◽  
Dorothea Emig-Agius ◽  
Yu Jiang ◽  
Serena Lei ◽  
...  

Motivation: Next-Generation Sequencing (NGS) technology is transitioning quickly from research labs to clinical settings. The diagnosis and treatment selection for many acquired and autosomal conditions necessitate a method for accurately detecting somatic and germline variants, suitable for the clinic. Results: We have developed Pisces, a rapid, versatile and accurate small variant calling suite designed for somatic and germline amplicon sequencing applications. Pisces accuracy is achieved by four distinct modules: the Pisces Read Stitcher, the Pisces Variant Caller, the Pisces Variant Quality Recalibrator, and the Pisces Variant Phaser. Each module incorporates a number of novel algorithmic strategies aimed at reducing noise or increasing the likelihood of detecting a true variant. Availability: Pisces is distributed under an open source license and can be downloaded from https://github.com/Illumina/Pisces. Pisces is available on the BaseSpace™ SequenceHub as part of the TruSeq Amplicon workflow and the Illumina Ampliseq Workflow. Pisces is distributed on Illumina sequencing platforms such as the MiSeq™, and is included in the Praxis™ Extended RAS Panel test, which was recently approved by the FDA for the detection of multiple RAS gene [email protected] Supplementary information: Supplementary data are available online.


2021 ◽  
Vol 12 ◽  
Author(s):  
Shen Yin ◽  
Xiaowei Zhan ◽  
Bo Yao ◽  
Guanghua Xiao ◽  
Xinlei Wang ◽  
...  

RNA-sequencing (RNA-seq) provides a comprehensive quantification of transcriptomic activities in biological samples. Formalin-Fixed Paraffin-Embedded (FFPE) samples are collected as part of routine clinical procedure, and are the most widely available biological sample format in medical research and patient care. Normalization is an essential step in RNA-seq data analysis. A number of normalization methods, though developed for RNA-seq data from fresh frozen (FF) samples, can be used with FFPE samples as well. The only extant normalization method specifically designed for FFPE RNA-seq data, MIXnorm, has been shown to outperform these normalization methods, but at the cost of a complex mixture model and a high computational burden. It is therefore important to adapt MIXnorm for simplicity and computational efficiency while maintaining superior performance. Furthermore, it is critical to develop an integrated tool that performs commonly used normalization methods for both FF and FFPE RNA-seq data. We developed a new normalization method for FFPE RNA-seq data, named SMIXnorm, based on a two-component mixture model that is simplified relative to MIXnorm to facilitate computation. The expression levels of expressed genes are modeled by normal distributions without truncation, and those of non-expressed genes are modeled by zero-inflated Poisson distributions. The maximum likelihood estimates of the model parameters are obtained by a nested Expectation-Maximization algorithm with a less complicated latent variable structure, and closed-form updates are available within each iteration. Real data applications and simulation studies show that SMIXnorm greatly reduces computing time compared to MIXnorm, without sacrificing performance. More importantly, we developed a web-based tool, RNA-seq Normalization (RSeqNorm), that offers a simple workflow to compute normalized RNA-seq data for both FFPE and FF samples. It includes SMIXnorm and MIXnorm for FFPE RNA-seq data, together with five commonly used normalization methods for FF RNA-seq data. Users can easily upload a raw RNA-seq count matrix and select one of the seven normalization methods to produce a downloadable normalized expression matrix for any downstream analysis. The R package is available at https://github.com/S-YIN/RSEQNORM. The web-based tool, RSeqNorm, is available at http://lce.biohpc.swmed.edu/rseqnorm with no restriction to use or redistribute.


2021 ◽  
Vol 12 ◽  
Author(s):  
Annika Vaksmaa ◽  
Katrin Knittel ◽  
Alejandro Abdala Asbun ◽  
Maaike Goudriaan ◽  
Andreas Ellrott ◽  
...  

Plastic particles in the ocean are typically covered with microbial biofilms, but it remains unclear whether distinct microbial communities colonize different polymer types. In this study, we analyzed microbial communities forming biofilms on floating microplastics in a bay of the island of Elba in the Mediterranean Sea. Raman spectroscopy revealed that the plastic particles mainly comprised polyethylene (PE), polypropylene (PP), and polystyrene (PS), of which the PE and PP particles were typically brittle and featured cracks. Fluorescence in situ hybridization and imaging by high-resolution microscopy revealed dense microbial biofilms on the polymer surfaces. Amplicon sequencing of the 16S rRNA gene showed that the bacterial communities on all plastic types consisted mainly of the orders Flavobacteriales, Rhodobacterales, Cytophagales, Rickettsiales, Alteromonadales, Chitinophagales, and Oceanospirillales. We found significant differences in the biofilm community composition on PE compared with PP and PS (at the OTU and order levels), which shows that different microbial communities colonize specific polymer types. The sequencing data also revealed a higher relative abundance of archaeal sequences on PS in comparison with PE or PP. We also found a high occurrence, up to 17% of all sequences, of different hydrocarbon-degrading bacteria on all investigated plastic types. However, their functioning in the plastic-associated biofilm and potential role in plastic degradation need further assessment.


2019 ◽  
Vol 35 (22) ◽  
pp. 4809-4811 ◽  
Author(s):  
Robert S Harris ◽  
Monika Cechova ◽  
Kateryna D Makova

Summary: Tandem DNA repeats can be sequenced with long-read technologies, but cannot be accurately deciphered due to the lack of computational tools taking high error rates of these technologies into account. Here we introduce Noise-Cancelling Repeat Finder (NCRF) to uncover putative tandem repeats of specified motifs in noisy long reads produced by Pacific Biosciences and Oxford Nanopore sequencers. Using simulations, we validated the use of NCRF to locate tandem repeats with motifs of various lengths and demonstrated its superior performance as compared to two alternative tools. Using real human whole-genome sequencing data, NCRF identified long arrays of the (AATGG)n repeat involved in heat shock stress response. Availability and implementation: NCRF is implemented in C, supported by several python scripts, and is available in bioconda and at https://github.com/makovalab-psu/NoiseCancellingRepeatFinder. Supplementary information: Supplementary data are available at Bioinformatics online.

