The Bench Scientist’s Guide to Statistical Analysis of RNA-Seq Data

Linnorm: improved statistical analysis for single cell RNA-seq expression data

Nucleic Acids Research ◽

10.1093/nar/gkx828 ◽

2017 ◽

Vol 45 (22) ◽

pp. e179-e179 ◽

Cited By ~ 38

Author(s):

Shun H. Yip ◽

Panwen Wang ◽

Jean-Pierre A. Kocher ◽

Pak Chung Sham ◽

Junwen Wang

Keyword(s):

Statistical Analysis ◽

Single Cell ◽

Expression Data ◽

Rna Seq

Rail-RNA: Scalable analysis of RNA-seq splicing and coverage

10.1101/019067 ◽

2015 ◽

Cited By ~ 5

Author(s):

Abhinav Nellore ◽

Leonardo Collado-Torres ◽

Andrew E Jaffe ◽

José Alquicira-Hernández ◽

Jacob Pritt ◽

...

Keyword(s):

Statistical Analysis ◽

Web Services ◽

Open Source ◽

Rna Sequencing ◽

Open Source Software ◽

Rna Seq ◽

Spliced Alignment ◽

Amazon Web Services ◽

Scalable Analysis ◽

Multiple Samples

RNA sequencing (RNA-seq) experiments now span hundreds to thousands of samples. Current spliced alignment software is designed to analyze each sample separately. Consequently, no information is gained from analyzing multiple samples together, and it is difficult to reproduce the exact analysis without access to original computing resources. We describe Rail-RNA, a cloud-enabled spliced aligner that analyzes many samples at once. Rail-RNA eliminates redundant work across samples, making it more efficient as samples are added. For many samples, Rail-RNA is more accurate than annotation-assisted aligners. We use Rail-RNA to align 667 RNA-seq samples from the GEUVADIS project on Amazon Web Services in under 16 hours for US$0.91 per sample. Rail-RNA produces alignments and base-resolution bigWig coverage files, ready for use with downstream packages for reproducible statistical analysis. We identify expressed regions in the GEUVADIS samples and show that both annotated and unannotated (novel) expressed regions exhibit consistent patterns of variation across populations and with respect to known confounders. Rail-RNA is open-source software available at http://rail.bio.

TRAPR: R Package for Statistical Analysis and Visualization of RNA-Seq Data

Genomics & Informatics ◽

10.5808/gi.2017.15.1.51 ◽

2017 ◽

Vol 15 (1) ◽

pp. 51 ◽

Cited By ~ 6

Author(s):

Jae Hyun Lim ◽

Soo Youn Lee ◽

Ju Han Kim

Keyword(s):

Statistical Analysis ◽

R Package ◽

Rna Seq

Linnorm: improved statistical analysis for single cell RNA-seq expression data

Nucleic Acids Research ◽

10.1093/nar/gkx1189 ◽

2017 ◽

Vol 45 (22) ◽

pp. 13097-13097 ◽

Cited By ~ 5

Author(s):

Shun H. Yip ◽

Panwen Wang ◽

Jean-Pierre A. Kocher ◽

Pak Chung Sham ◽

Junwen Wang

Keyword(s):

Statistical Analysis ◽

Single Cell ◽

Expression Data ◽

Rna Seq

Integrative, normalization-insusceptible statistical analysis of RNA-Seq data, with improved differential expression and unbiased downstream functional analysis

Briefings in Bioinformatics ◽

10.1093/bib/bbaa156 ◽

2020 ◽

Author(s):

Dionysios Fanidis ◽

Panagiotis Moulos

Keyword(s):

Gene Expression ◽

Statistical Analysis ◽

Differential Expression ◽

Differential Gene Expression ◽

Expression Patterns ◽

Superior Performance ◽

P Value ◽

Rna Seq ◽

Daily Lives ◽

Differential Gene

The study of differential gene expression patterns through RNA-Seq comprises a routine task in the daily lives of molecular bioscientists, who produce vast amounts of data requiring proper management and analysis. Despite widespread use, there are still no widely accepted golden standards for the normalization and statistical analysis of RNA-Seq data, and critical biases, such as gene lengths and problems in the detection of certain types of molecules, remain largely unaddressed. Stimulated by these unmet needs and the lack of in-depth research into the potential of combinatorial methods to enhance the analysis of differential gene expression, we had previously introduced the PANDORA P-value combination algorithm while presenting evidence for PANDORA’s superior performance in optimizing the tradeoff between precision and sensitivity. In this article, we present the next generation of the algorithm along with a more in-depth investigation of its capabilities to effectively analyze RNA-Seq data. In particular, we show that PANDORA-reported lists of differentially expressed genes are unaffected by biases introduced by different normalization methods, while, at the same time, they comprise a reliable input option for downstream pathway analysis. Additionally, PANDORA outperforms other methods in detecting differential expression patterns in certain transcript types, including long non-coding RNAs.
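
The abstract above does not spell out how PANDORA combines P-values, so the following is only a generic illustration of what a P-value combination can look like, not PANDORA's algorithm. It is a minimal sketch of Fisher's method, a standard textbook approach, assuming the per-gene P-values come from independent tests (an assumption that rarely holds exactly when several tools analyze the same counts). The function name `fisher_combine` is hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine independent P-values for one gene with Fisher's method.

    Illustrative only: a generic combination rule, not the PANDORA algorithm.
    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null, where k = len(p_values).
    """
    if not p_values:
        raise ValueError("at least one P-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # For even degrees of freedom 2k the chi-square survival function has a
    # closed form: exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Hypothetical P-values for a single gene from three different tests.
    print(fisher_combine([0.04, 0.10, 0.03]))
```

With a single P-value the function returns that value unchanged, which is a quick sanity check on the closed-form survival function.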

The bench scientist's guide to statistical analysis of RNA-Seq data

BMC Research Notes ◽

10.1186/1756-0500-5-506 ◽

2012 ◽

Vol 5 (1) ◽

Cited By ~ 26

Author(s):

Craig R Yendrek ◽

Elizabeth A Ainsworth ◽

Jyothi Thimmapuram

Keyword(s):

Statistical Analysis ◽

Rna Seq

Differential Expression for RNA Sequencing (RNA-Seq) Data: Mapping, Summarization, Statistical Analysis, and Experimental Design

Bioinformatics for High Throughput Sequencing ◽

10.1007/978-1-4614-0782-9_10 ◽

2011 ◽

pp. 169-190 ◽

Cited By ~ 2

Author(s):

Matthew D. Young ◽

Davis J. McCarthy ◽

Matthew J. Wakefield ◽

Gordon K. Smyth ◽

Alicia Oshlack ◽

...

Keyword(s):

Statistical Analysis ◽

Experimental Design ◽

Differential Expression ◽

Rna Sequencing ◽

Data Mapping ◽

Rna Seq

Statistical analysis of RNA-seq and scRNA-seq expression data

10.5353/th_991044069403703414 ◽

2018 ◽

Author(s):

Shun-hang Yip

Keyword(s):

Statistical Analysis ◽

Expression Data ◽

Rna Seq

Statistical analysis of RNA-seq data from next-generation sequencing technology

10.31274/etd-180810-2620 ◽

2012 ◽

Author(s):

Yaqing Si

Keyword(s):

Statistical Analysis ◽

Next Generation Sequencing ◽

Rna Seq ◽

Next Generation ◽

Sequencing Technology ◽

Next Generation Sequencing Technology ◽

Generation Sequencing Technology ◽

Generation Sequencing

Statistical analysis of classification

Symposium - International Astronomical Union ◽

10.1017/s0074180900053420 ◽

1966 ◽

Vol 24 ◽

pp. 188-189

Author(s):

T. J. Deeming

Keyword(s):

Multivariate Analysis ◽

Statistical Analysis ◽

Classification Scheme ◽

Analytical Procedure ◽

Narrow Band ◽

Scientific Research ◽

Maximum Amount ◽

Amount Of Information

If we make a set of measurements, such as narrow-band or multicolour photo-electric measurements, which are designed to improve a scheme of classification, and in particular if they are designed to extend the number of dimensions of classification, i.e. the number of classification parameters, then some important problems of analytical procedure arise. First, it is important not to reproduce the errors of the classification scheme which we are trying to improve. Second, when trying to extend the number of dimensions of classification we have little or nothing with which to test the validity of the new parameters. Problems similar to these have occurred in other areas of scientific research (notably psychology and education) and the branch of Statistics called Multivariate Analysis has been developed to deal with them. The techniques of this subject are largely unknown to astronomers, but, if carefully applied, they should at the very least ensure that the astronomer gets the maximum amount of information out of his data and does not waste his time looking for information which is not there. More optimistically, these techniques are potentially capable of indicating the number of classification parameters necessary and giving specific formulas for computing them, as well as pinpointing those particular measurements which are most crucial for determining the classification parameters.
