The Bench Scientist’s Guide to Statistical Analysis of RNA-Seq Data

2014 ◽  
pp. 1-20
Author(s):  
Craig Yendrek ◽  
Elizabeth Ainsworth ◽  
Jyothi Thimmapuram
2017 ◽  
Vol 45 (22) ◽  
pp. e179-e179 ◽  
Author(s):  
Shun H. Yip ◽  
Panwen Wang ◽  
Jean-Pierre A. Kocher ◽  
Pak Chung Sham ◽  
Junwen Wang

2015 ◽  
Author(s):  
Abhinav Nellore ◽  
Leonardo Collado-Torres ◽  
Andrew E Jaffe ◽  
José Alquicira-Hernández ◽  
Jacob Pritt ◽  
...  

RNA sequencing (RNA-seq) experiments now span hundreds to thousands of samples. Current spliced alignment software is designed to analyze each sample separately. Consequently, no information is gained from analyzing multiple samples together, and it is difficult to reproduce the exact analysis without access to original computing resources. We describe Rail-RNA, a cloud-enabled spliced aligner that analyzes many samples at once. Rail-RNA eliminates redundant work across samples, making it more efficient as samples are added. For many samples, Rail-RNA is more accurate than annotation-assisted aligners. We use Rail-RNA to align 667 RNA-seq samples from the GEUVADIS project on Amazon Web Services in under 16 hours for US$0.91 per sample. Rail-RNA produces alignments and base-resolution bigWig coverage files, ready for use with downstream packages for reproducible statistical analysis. We identify expressed regions in the GEUVADIS samples and show that both annotated and unannotated (novel) expressed regions exhibit consistent patterns of variation across populations and with respect to known confounders. Rail-RNA is open-source software available at http://rail.bio.


2017 ◽  
Vol 15 (1) ◽  
pp. 51 ◽  
Author(s):  
Jae Hyun Lim ◽  
Soo Youn Lee ◽  
Ju Han Kim

2017 ◽  
Vol 45 (22) ◽  
pp. 13097-13097 ◽  
Author(s):  
Shun H. Yip ◽  
Panwen Wang ◽  
Jean-Pierre A. Kocher ◽  
Pak Chung Sham ◽  
Junwen Wang

Author(s):  
Dionysios Fanidis ◽  
Panagiotis Moulos

The study of differential gene expression patterns through RNA-Seq is a routine task in the daily lives of molecular bioscientists, who produce vast amounts of data requiring proper management and analysis. Despite widespread use, there are still no widely accepted gold standards for the normalization and statistical analysis of RNA-Seq data, and critical biases, such as gene lengths and problems in the detection of certain types of molecules, remain largely unaddressed. Stimulated by these unmet needs and the lack of in-depth research into the potential of combinatorial methods to enhance the analysis of differential gene expression, we previously introduced the PANDORA P-value combination algorithm and presented evidence for PANDORA’s superior performance in optimizing the tradeoff between precision and sensitivity. In this article, we present the next generation of the algorithm along with a more in-depth investigation of its capabilities to effectively analyze RNA-Seq data. In particular, we show that PANDORA-reported lists of differentially expressed genes are unaffected by biases introduced by different normalization methods while also providing a reliable input for downstream pathway analysis. Additionally, PANDORA outperforms other methods in detecting differential expression patterns in certain transcript types, including long non-coding RNAs.
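
The abstract above does not say which combination rule PANDORA applies, so the following is only a generic illustration of what "P-value combination" means, not PANDORA's implementation. It is a minimal sketch of Fisher's method, one textbook way to merge the p-values that several tests report for the same gene. The function name `fisher_combine` and the example values are hypothetical. Fisher's method assumes the p-values are independent, which p-values computed by different tools on the same data generally are not.

```python
import math


def fisher_combine(p_values):
    """Combine p-values for one gene with Fisher's method (illustrative only).

    Assumption: the p-values are independent. This is NOT claimed to be
    the rule used by PANDORA; it is a generic textbook example.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p_i); we keep X / 2 for convenience.
    half_stat = -sum(math.log(p) for p in p_values)

    # Under the null, X follows a chi-square distribution with 2k degrees of
    # freedom. For even degrees of freedom the survival function is
    #   exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return min(1.0, math.exp(-half_stat) * total)


if __name__ == "__main__":
    # Hypothetical p-values reported for one gene by three different tests.
    print(fisher_combine([0.01, 0.02, 0.03]))
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the formula.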


2012 ◽  
Vol 5 (1) ◽  
Author(s):  
Craig R Yendrek ◽  
Elizabeth A Ainsworth ◽  
Jyothi Thimmapuram

Author(s):  
Matthew D. Young ◽  
Davis J. McCarthy ◽  
Matthew J. Wakefield ◽  
Gordon K. Smyth ◽  
Alicia Oshlack ◽  
...  

1966 ◽  
Vol 24 ◽  
pp. 188-189
Author(s):  
T. J. Deeming

If we make a set of measurements, such as narrow-band or multicolour photo-electric measurements, which are designed to improve a scheme of classification, and in particular if they are designed to extend the number of dimensions of classification, i.e. the number of classification parameters, then some important problems of analytical procedure arise. First, it is important not to reproduce the errors of the classification scheme which we are trying to improve. Second, when trying to extend the number of dimensions of classification we have little or nothing with which to test the validity of the new parameters. Problems similar to these have occurred in other areas of scientific research (notably psychology and education) and the branch of Statistics called Multivariate Analysis has been developed to deal with them. The techniques of this subject are largely unknown to astronomers, but, if carefully applied, they should at the very least ensure that the astronomer gets the maximum amount of information out of his data and does not waste his time looking for information which is not there. More optimistically, these techniques are potentially capable of indicating the number of classification parameters necessary and giving specific formulas for computing them, as well as pinpointing those particular measurements which are most crucial for determining the classification parameters.

