Comparison of RNA Isolation Methods on RNA-Seq: Implications for Differential Expression and Meta-Analyses

bioRxiv ◽

10.1101/728014 ◽

2019 ◽

Cited By ~ 1

Author(s):

Amanda N. Scholes ◽

Jeffrey A. Lewis

Keyword(s):

Differential Expression ◽

Rna Isolation ◽

Rna Seq ◽

Phenol Extraction ◽

Isolation Methods ◽

Control Versus ◽

Specific Mrnas ◽

Meta Analyses ◽

Spurious Signals ◽

Versus Treatment

Abstract:

Technical variation across different batches of RNA-seq experiments can clearly produce spurious signals of differential expression and reduce our power to detect true differences. Thus, it is important to identify major sources of these so-called “batch effects” to eliminate them from study design. Based on the different chemistries of “classic” phenol extraction of RNA compared to common commercial RNA isolation kits, we hypothesized that specific mRNAs may be preferentially extracted depending upon method, which could masquerade as differential expression in downstream RNA-seq analyses. We tested this hypothesis and found that phenol extraction preferentially isolated membrane-associated mRNAs, thus resulting in spurious signals of differential expression. Within a self-contained experimental batch (e.g. control versus treatment), the method of RNA isolation had little effect on the ability to identify differentially expressed transcripts. However, we suggest that researchers performing meta-analyses across different experimental batches strongly consider the RNA isolation methods for each experiment.

Comparison of RNA isolation methods on RNA-Seq: implications for differential expression and meta-analyses

BMC Genomics ◽

10.1186/s12864-020-6673-2 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Amanda N. Scholes ◽

Jeffrey A. Lewis

Keyword(s):

Differential Expression ◽

Rna Isolation ◽

Rna Seq ◽

Isolation Methods ◽

Meta Analyses

Faculty Opinions recommendation of How many biological replicates are needed in an RNA-seq experiment and which differential expression tool should you use?

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.726252558.793519625 ◽

2016 ◽

Author(s):

Christine Clayton

Keyword(s):

Differential Expression ◽

Rna Seq

Best practices on the differential expression analysis of multi-species RNA-seq

Genome Biology ◽

10.1186/s13059-021-02337-8 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Matthew Chung ◽

Vincent M. Bruno ◽

David A. Rasko ◽

Christina A. Cuomo ◽

José F. Muñoz ◽

...

Keyword(s):

Best Practices ◽

Differential Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Single Species ◽

Rna Seq ◽

Species Analysis ◽

Differential Gene ◽

Multiple Species ◽

Downstream Analysis

Abstract:

Advances in transcriptome sequencing allow for simultaneous interrogation of differentially expressed genes from multiple species originating from a single RNA sample, termed dual or multi-species transcriptomics. Compared to single-species differential expression analysis, the design of multi-species differential expression experiments must account for the relative abundances of each organism of interest within the sample, often requiring enrichment methods and yielding differences in total read counts across samples. The analysis of multi-species transcriptomics datasets requires modifications to the alignment, quantification, and downstream analysis steps compared to the single-species analysis pipelines. We describe best practices for multi-species transcriptomics and differential gene expression.

High heterogeneity undermines generalization of differential expression results in RNA-Seq analysis

Human Genomics ◽

10.1186/s40246-021-00308-5 ◽

2021 ◽

Vol 15 (1) ◽

Author(s):

Weitong Cui ◽

Huaru Xue ◽

Lei Wei ◽

Jinghua Jin ◽

Xuewen Tian ◽

...

Keyword(s):

Gene Expression ◽

Differential Expression ◽

Small Sample ◽

Differentially Expressed ◽

Cancer Type ◽

Rna Seq ◽

Sample Sizes ◽

Large Sample ◽

Expression Levels ◽

Gene Expression Levels

Abstract:

Background: RNA sequencing (RNA-Seq) has been widely applied in oncology for monitoring transcriptome changes. However, the emerging problem that high variation of gene expression levels caused by tumor heterogeneity may affect the reproducibility of differential expression (DE) results has rarely been studied. Here, we investigated the reproducibility of DE results for any given number of biological replicates between 3 and 24 and explored why a great many differentially expressed genes (DEGs) were not reproducible.

Results: Our findings demonstrate that poor reproducibility of DE results exists not only for small sample sizes, but also for relatively large sample sizes. Quite a few of the DEGs detected are specific to the samples in use, rather than genuinely differentially expressed under different conditions. Poor reproducibility of DE results is mainly caused by high variation of gene expression levels for the same gene in different samples. Even though biological variation may account for much of the high variation of gene expression levels, the effect of outlier count data also needs to be treated seriously, as outlier data severely interfere with DE analysis.

Conclusions: High heterogeneity exists not only in tumor tissue samples of each cancer type studied, but also in normal samples. High heterogeneity leads to poor reproducibility of DEGs, undermining generalization of differential expression results. Therefore, it is necessary to use large sample sizes (at least 10 if possible) in RNA-Seq experimental designs to reduce the impact of biological variability, and DE results should be interpreted cautiously unless soundly validated.

How many biological replicates are needed in an RNA-seq experiment and which differential expression tool should you use?

RNA ◽

10.1261/rna.053959.115 ◽

2016 ◽

Vol 22 (6) ◽

pp. 839-851 ◽

Cited By ~ 301

Author(s):

Nicholas J. Schurch ◽

Pietá Schofield ◽

Marek Gierliński ◽

Christian Cole ◽

Alexander Sherstnev ◽

...

Keyword(s):

Differential Expression ◽

Rna Seq

The long and the short of it: unlocking nanopore long-read RNA sequencing data with short-read differential expression analysis tools

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqab028 ◽

2021 ◽

Vol 3 (2) ◽

Author(s):

Xueyi Dong ◽

Luyi Tian ◽

Quentin Gouil ◽

Hasaru Kariyawasam ◽

Shian Su ◽

...

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Transcriptomic Analysis ◽

Statistical Testing ◽

Rna Seq ◽

Sequencing Data ◽

Short Read ◽

Sequencing Platform ◽

Long Read

Abstract:

Application of Oxford Nanopore Technologies’ long-read sequencing platform to transcriptomic analysis is increasing in popularity. However, such analysis can be challenging due to the high sequence error and small library sizes, which decrease quantification accuracy and reduce power for statistical testing. Here, we report the analysis of two nanopore RNA-seq datasets with the goal of obtaining gene- and isoform-level differential expression information. A dataset of synthetic, spliced, spike-in RNAs (‘sequins’) as well as a mouse neural stem cell dataset from samples with a null mutation of the epigenetic regulator Smchd1 was analysed using a mix of long-read specific tools for preprocessing together with established short-read RNA-seq methods for downstream analysis. We used limma-voom to perform differential gene expression analysis, and the novel FLAMES pipeline to perform isoform identification and quantification, followed by DRIMSeq and limma-diffSplice (with stageR) to perform differential transcript usage analysis. We compared results from the sequins dataset to the ground truth, and results of the mouse dataset to a previous short-read study on equivalent samples. Overall, our work shows that transcriptomic analysis of long-read nanopore data using long-read specific preprocessing methods together with short-read differential expression methods and software that are already in wide use can yield meaningful results.

Overcoming confounding plate effects in differential expression analyses of single-cell RNA-seq data

Biostatistics ◽

10.1093/biostatistics/kxw055 ◽

2017 ◽

Vol 18 (3) ◽

pp. 451-464 ◽

Cited By ~ 40

Author(s):

Aaron T. L. Lun ◽

John C. Marioni

Keyword(s):

Single Cell ◽

Differential Expression ◽

Rna Seq

A comparison of combined p-value methods for gene differential expression using RNA-seq data

Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics - BCB '14 ◽

10.1145/2649387.2649421 ◽

2014 ◽

Author(s):

Abdallah M. Eteleeb ◽

Hunter N. Moseley ◽

Eric C. Rouchka

Keyword(s):

Differential Expression ◽

P Value ◽

Rna Seq ◽

Gene Differential Expression

Survey of Methods Used for Differential Expression Analysis on RNA Seq Data

Learning and Analytics in Intelligent Systems - Biologically Inspired Techniques in Many-Criteria Decision Making ◽

10.1007/978-3-030-39033-4_21 ◽

2020 ◽

pp. 226-239

Author(s):

Reema Joshi ◽

Rosy Sarmah

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Rna Seq

A comprehensive simulation study on classification of RNA-Seq data

10.7287/peerj.preprints.2761 ◽

2017 ◽

Author(s):

Gokmen Zararsiz ◽

Dinçer Göksülük ◽

Selçuk Korkmaz ◽

Vahap Eldem ◽

Gözde Ertürk Zararsız ◽

...

Keyword(s):

Discriminant Analysis ◽

Sample Size ◽

Linear Discriminant Analysis ◽

Differential Expression ◽

Simulation Study ◽

Rna Seq ◽

Linear Discriminant ◽

Number Of Genes ◽

Expression Rate

Abstract:

RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging powerful method for diagnosis, disease classification and monitoring at molecular level, as well as providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (e.g. microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data since they violate both data structure and distributional assumptions. However, it is possible to apply these algorithms with appropriate modifications to RNA-Seq data. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis and negative binomial linear discriminant analysis. Another way is to bring the data hierarchically closer to microarrays and apply microarray-based classifiers.

In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performances. A comprehensive simulation study is conducted and the results are compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size, differential-expression rate, and number of genes and decreasing the dispersion parameter and number of groups lead to an increase in classification accuracy. As in differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion.

We conclude that, as a count-based classifier, the power transformed PLDA and, as a microarray-based classifier, vst or rlog transformed RF and SVM classifiers may be a good choice for classification. An R/BIOCONDUCTOR package, MLSeq, is freely available at https://www.bioconductor.org/packages/release/bioc/html/MLSeq.html .
