Docker4Circ: A Framework for the Reproducible Characterization of circRNAs from RNA-Seq Data

Giulio Ferrero; Nicola Licheri; Lucia Coscujuela Tarrero; Carlo De Intinis; Valentina Miano; Raffaele Adolfo Calogero; Francesca Cordero; Michele De Bortoli; Marco Beccuti

doi:10.3390/ijms21010293

International Journal of Molecular Sciences ◽

10.3390/ijms21010293 ◽

2019 ◽

Vol 21 (1) ◽

pp. 293 ◽

Cited By ~ 3

Author(s):

Giulio Ferrero ◽

Nicola Licheri ◽

Lucia Coscujuela Tarrero ◽

Carlo De Intinis ◽

Valentina Miano ◽

...

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Gene Expression Regulation ◽

Differential Expression Analysis ◽

Model Organisms ◽

Complete Analysis ◽

Rna Seq ◽

Sequence Reconstruction ◽

Reproducible Analysis ◽

User Friendly

Recent improvements in the cost-effectiveness of high-throughput technologies have allowed RNA sequencing of total transcriptomes suitable for evaluating the expression and regulation of circRNAs, a relatively novel class of transcript isoforms with suggested roles in transcriptional and post-transcriptional gene expression regulation and with possible use as biomarkers, owing to their deregulation in various human diseases. Only a limited number of integrated workflows exist for the prediction, characterization, and differential expression analysis of circRNAs, and none of them complies with computational reproducibility requirements. We developed Docker4Circ for the complete analysis of circRNAs from RNA-Seq data. Docker4Circ runs a comprehensive analysis of circRNAs in human and model organisms, including: circRNA prediction; classification and annotation using six public databases; back-splice sequence reconstruction; internal alternative splicing of circularizing exons; alignment-free circRNA quantification from RNA-Seq reads; and differential expression analysis. Docker4Circ makes circRNA analysis easier and more accessible thanks to: (i) its R interface; (ii) the encapsulation of computational tasks into docker images; (iii) the availability of a user-friendly Java GUI; and (iv) no need for advanced bash scripting skills for correct use. Furthermore, Docker4Circ ensures a reproducible analysis, since all its tasks are embedded into a docker image following the guidelines provided by the Reproducible Bioinformatics Project.
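To make two of the steps named in this abstract concrete (back-splice sequence reconstruction and alignment-free quantification), here is a minimal Python sketch. It is not Docker4Circ's implementation. It assumes the circularizing exons are already concatenated into one 5'->3' string, and it counts a read as supporting a circRNA when the read shares at least one k-mer that crosses the join point. The function names and the `flank` and `k` parameters are hypothetical and chosen only for illustration.

```python
from collections import Counter


def backsplice_junction(circ_seq, flank):
    """Return the sequence spanning the back-splice junction.

    circ_seq is the linear 5'->3' sequence of the circularizing exons;
    in the circle, its 3' end is joined to its 5' start.
    """
    if flank <= 0 or flank > len(circ_seq):
        raise ValueError("flank must be between 1 and len(circ_seq)")
    return circ_seq[-flank:] + circ_seq[:flank]


def junction_kmers(junction, flank, k):
    """Return the k-mers of the junction sequence that cross the join point."""
    kmers = set()
    for start in range(len(junction) - k + 1):
        # Keep only k-mers with at least one base on each side of the join.
        if start < flank < start + k:
            kmers.add(junction[start:start + k])
    return kmers


def count_supporting_reads(reads, circ_seqs, flank=10, k=8):
    """Count, per circRNA, the reads sharing a junction-spanning k-mer."""
    index = {}
    for name, seq in circ_seqs.items():
        junction = backsplice_junction(seq, flank)
        for kmer in junction_kmers(junction, flank, k):
            index.setdefault(kmer, set()).add(name)
    counts = Counter()
    for read in reads:
        hits = set()
        for start in range(len(read) - k + 1):
            hits |= index.get(read[start:start + k], set())
        counts.update(hits)  # each read counts at most once per circRNA
    return counts
```

Requiring the k-mer to cross the join point is the design choice that matters here: k-mers lying entirely on one side of the junction also occur in the linear transcript and would not distinguish circular from linear RNA.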

Mapping and differential expression analysis from short-read RNA-Seq data in model organisms

Quantitative Biology ◽

10.1007/s40484-016-0060-7 ◽

2016 ◽

Vol 4 (1) ◽

pp. 22-35 ◽

Cited By ~ 2

Author(s):

Qiong-Yi Zhao ◽

Jacob Gratten ◽

Restuadi Restuadi ◽

Xuan Li

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Model Organisms ◽

Rna Seq ◽

Short Read

From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline

F1000Research ◽

10.12688/f1000research.8987.1 ◽

2016 ◽

Vol 5 ◽

pp. 1438 ◽

Cited By ~ 9

Author(s):

Yunshun Chen ◽

Aaron T. L. Lun ◽

Gordon K. Smyth

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Mouse Mammary Gland ◽

Complete Analysis ◽

Rna Seq ◽

R Software ◽

Software Packages ◽

Bioconductor Project ◽

Computational Workflow

In recent years, RNA sequencing (RNA-seq) has become a very widely used technology for profiling gene expression. One of the most common aims of RNA-seq profiling is to identify genes or molecular pathways that are differentially expressed (DE) between two or more biological conditions. This article demonstrates a computational workflow for the detection of DE genes and pathways from RNA-seq data by providing a complete analysis of an RNA-seq experiment profiling epithelial cell subsets in the mouse mammary gland. The workflow uses R software packages from the open-source Bioconductor project and covers all steps of the analysis pipeline, including alignment of read sequences, data exploration, differential expression analysis, visualization and pathway analysis. Read alignment and count quantification are conducted using the Rsubread package, and the statistical analyses are performed using the edgeR package. The differential expression analysis uses the quasi-likelihood functionality of edgeR.

Assembly-free rapid differential gene expression analysis in non-model organisms using DNA-protein alignment

10.1101/2021.04.23.441097 ◽

2021 ◽

Author(s):

Anish M.S. Shrestha ◽

Joyce Emlyn B. Guiao ◽

Kyle Christian R. Santiago

Keyword(s):

Gene Expression ◽

Differential Expression ◽

Expression Analysis ◽

De Novo ◽

Transcriptome Assembly ◽

Differential Expression Analysis ◽

Homology Search ◽

Model Organisms ◽

Rna Seq ◽

Protein Database

RNA-seq is being increasingly adopted for gene expression studies in a panoply of non-model organisms, with applications spanning the fields of agriculture, aquaculture, ecology, and environment. Conventional differential expression analysis for organisms without reference sequences requires performing computationally expensive and error-prone de-novo transcriptome assembly, followed by homology search against a high-confidence protein database for functional annotation. We propose a shortcut, where we obtain counts for differential expression analysis by directly aligning RNA-seq reads to the protein database. Through experiments on simulated and real data, we show drastic reductions in run-time and memory usage, with no loss in accuracy. A Snakemake implementation of our workflow is available at: https://bitbucket.org/project_samar/samar
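The shortcut described in this abstract, obtaining counts directly from read-to-protein alignments, can be sketched roughly as follows. This is an illustrative Python sketch and not the SAMAR workflow itself. It assumes a hypothetical three-column, tab-separated hit table per sample (read ID, protein ID, alignment score) and assigns each read to its single best-scoring protein. The abstract does not say how the actual tool handles reads that hit several proteins, so that rule is an assumption.

```python
import csv
import io
from collections import Counter, defaultdict


def best_hits(rows):
    """Keep the highest-scoring protein per read.

    rows yields (read_id, protein_id, score) triples; this column layout
    is a hypothetical format chosen for the example.
    """
    best = {}
    for read_id, protein_id, score in rows:
        score = float(score)
        if read_id not in best or score > best[read_id][1]:
            best[read_id] = (protein_id, score)
    return {read_id: protein for read_id, (protein, _) in best.items()}


def count_matrix(samples):
    """Build {protein: Counter({sample: count})} from per-sample hit tables.

    samples maps a sample name to an open text handle of tab-separated hits.
    """
    matrix = defaultdict(Counter)
    for sample, handle in samples.items():
        rows = csv.reader(handle, delimiter="\t")
        for protein in best_hits(rows).values():
            matrix[protein][sample] += 1
    return matrix


if __name__ == "__main__":
    ctrl = io.StringIO("r1\tP1\t50\nr1\tP2\t80\nr2\tP1\t60\n")
    treat = io.StringIO("r3\tP2\t70\n")
    for protein, per_sample in sorted(count_matrix({"ctrl": ctrl, "treat": treat}).items()):
        print(protein, dict(per_sample))
```

The resulting protein-by-sample table has the same shape as a gene-level count matrix, so it could in principle be passed to a standard count-based differential expression tool.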

Docker4Circ: A Framework for a Reproducible Characterization of CircRNAs from RNA-Seq Data

10.20944/preprints201907.0219.v1 ◽

2019 ◽

Author(s):

Giulio Ferrero ◽

Nicola Licheri ◽

Lucia Coscujuela Tarrero ◽

Carlo De Intinis ◽

Valentina Miano ◽

...

Keyword(s):

Differential Expression Analysis ◽

Circular Rnas ◽

Rna Seq ◽

Computational Framework ◽

Tumor Tissues ◽

Colorectal Cancer Cell Lines ◽

Sequence Reconstruction ◽

Reproducible Analysis ◽

User Friendly

Recently, the increased cost-effectiveness of high-throughput technologies has made available a large number of RNA sequencing datasets for identifying circular RNAs (circRNAs). However, although many computational tools have been developed to predict circRNAs, only a limited number of workflows exist to both predict and characterize them. Moreover, to the best of our knowledge, the available workflows do not ensure computational reproducibility and require advanced bash scripting skills to be correctly installed and used. To address these critical aspects we present Docker4Circ, a new computational framework designed for a comprehensive analysis of circRNAs, comprising: circRNA prediction; classification and annotation using public databases; back-splicing sequence reconstruction; the internal alternative splicing of circularizing exons; alignment-free circRNA quantification from RNA-Seq reads; and, finally, differential expression analysis. Docker4Circ was specifically designed to make circRNA analysis easier and more accessible thanks to the following features: (i) its R interface; (ii) the encapsulation of its computational tasks into a docker image; (iii) the availability of a user-friendly Java GUI. Furthermore, Docker4Circ ensures a reproducible analysis because all its tasks are embedded into a docker image following the guidelines provided by the Reproducible Bioinformatics Project (RBP, http://reproducible-bioinformatics.org/). The effectiveness of Docker4Circ was demonstrated on a real case study whose goal was to characterize the circRNAs predicted in colorectal cancer cell lines and quantified in public RNA-Seq experiments performed on primary tumor tissues. In conclusion, we propose Docker4Circ as a framework for reproducible and comprehensive analyses of circRNAs to efficiently explore their biological role.

RNfuzzyApp: an R shiny RNA-seq data analysis app for visualisation, differential expression analysis, time-series clustering and enrichment analysis

F1000Research ◽

10.12688/f1000research.54533.1 ◽

2021 ◽

Vol 10 ◽

pp. 654

Author(s):

Margaux Haering ◽

Bianca H Habermann

Keyword(s):

Time Series ◽

Time Series Analysis ◽

Differential Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Model Organisms ◽

Rna Seq ◽

Series Analysis ◽

Shiny App ◽

R Shiny

RNA sequencing (RNA-seq) is a widely adopted, affordable method for large-scale gene expression profiling. However, user-friendly and versatile tools that let wet-lab biologists analyse RNA-seq data beyond standard analyses such as differential expression are rare. In particular, the analysis of time-series data is difficult for wet-lab biologists lacking advanced computational training. Furthermore, most meta-analysis tools are tailored for model organisms and not easily adaptable to other species. With RNfuzzyApp, we provide a user-friendly, web-based R shiny app for differential expression analysis, as well as time-series analysis of RNA-seq data. RNfuzzyApp offers several methods for normalization and differential expression analysis of RNA-seq data, providing easy-to-use toolboxes, interactive plots and downloadable results. For time-series analysis, RNfuzzyApp presents the first web-based, fully automated pipeline for soft clustering with the Mfuzz R package, including methods to aid in cluster number selection, cluster overlap analysis, Mfuzz loop computations, as well as cluster enrichments. RNfuzzyApp is an intuitive, easy-to-use and interactive R shiny app for RNA-seq differential expression and time-series analysis, offering a rich selection of interactive plots, providing a quick overview of raw data and generating rapid analysis results. Furthermore, its orthology assignment, enrichment analysis, and ID conversion functions are accessible to non-model organisms.

DEAR-O: Differential Expression Analysis based on RNA-seq data - Online

10.1101/069807 ◽

2016 ◽

Author(s):

Zong-Hong Zhang ◽

Naomi R. Wray ◽

Qiong-Yi Zhao

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Online Discussion ◽

Differential Expression Analysis ◽

Rna Seq ◽

Timely Manner ◽

Link Type ◽

Online Discussion Forum ◽

User Friendly ◽

Transcriptomic Studies

Differential expression analysis using high-throughput RNA sequencing (RNA-seq) data is widely applied in transcriptomic studies, and many software tools have been developed for this purpose. Active development of existing popular tools, together with the emergence of new tools, means that studies comparing the performance of differential expression analysis methods rapidly become out of date. To enable researchers to evaluate new and updated software in a timely manner, we developed DEAR-O, a user-friendly platform for performance evaluation of differential expression analysis based on RNA-seq data. The platform currently includes four of the most popular tools: DESeq, DESeq2, edgeR and Cuffdiff2. Using the DEAR-O platform, researchers can evaluate the performance of different tools, or of different versions of the same tool, with a customised number of biological replicates using already curated RNA-seq datasets. We also initiated an online forum for discussion of RNA-seq differential expression analysis. Through this forum, new useful tools and benchmarking datasets can be introduced. Our platform will be actively maintained to ensure new major versions of existing tools and new popular tools are included. DEAR-O will serve the community by providing timely evaluations of tools, versions and number of replicates for RNA-seq differential expression analysis. Availability and implementation: The DEAR-O platform is available at http://cnsgenomics.com/software/dear-o; the online discussion forum is https://groups.google.com/d/forum/[email protected] and [email protected]

RNfuzzyApp: an R shiny RNA-seq data analysis app for visualisation, differential expression analysis, time-series clustering and enrichment analysis

F1000Research ◽

10.12688/f1000research.54533.2 ◽

2021 ◽

Vol 10 ◽

pp. 654

Author(s):

Margaux Haering ◽

Bianca H Habermann

Keyword(s):

Time Series ◽

Time Series Analysis ◽

Differential Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Model Organisms ◽

Rna Seq ◽

Series Analysis ◽

Shiny App ◽

R Shiny

RNA sequencing (RNA-seq) is a widely adopted, affordable method for large-scale gene expression profiling. However, user-friendly and versatile tools that let wet-lab biologists analyse RNA-seq data beyond standard analyses such as differential expression are rare. In particular, the analysis of time-series data is difficult for wet-lab biologists lacking advanced computational training. Furthermore, most meta-analysis tools are tailored for model organisms and not easily adaptable to other species. With RNfuzzyApp, we provide a user-friendly, web-based R shiny app for differential expression analysis, as well as time-series analysis of RNA-seq data. RNfuzzyApp offers several methods for normalization and differential expression analysis of RNA-seq data, providing easy-to-use toolboxes, interactive plots and downloadable results. For time-series analysis, RNfuzzyApp presents the first web-based, fully automated pipeline for soft clustering with the Mfuzz R package, including methods to aid in cluster number selection, cluster overlap analysis, Mfuzz loop computations, as well as cluster enrichments. RNfuzzyApp is an intuitive, easy-to-use and interactive R shiny app for RNA-seq differential expression and time-series analysis, offering a rich selection of interactive plots, providing a quick overview of raw data and generating rapid analysis results. Furthermore, its assignment of orthologs, enrichment analysis, and ID conversion functions are accessible to non-model organisms.

From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline

F1000Research ◽

10.12688/f1000research.8987.2 ◽

2016 ◽

Vol 5 ◽

pp. 1438 ◽

Cited By ~ 42

Author(s):

Yunshun Chen ◽

Aaron T. L. Lun ◽

Gordon K. Smyth

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Mouse Mammary Gland ◽

Complete Analysis ◽

Rna Seq ◽

R Software ◽

Software Packages ◽

Bioconductor Project ◽

Computational Workflow

Best practices on the differential expression analysis of multi-species RNA-seq

Genome Biology ◽

10.1186/s13059-021-02337-8 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Matthew Chung ◽

Vincent M. Bruno ◽

David A. Rasko ◽

Christina A. Cuomo ◽

José F. Muñoz ◽

...

Keyword(s):

Best Practices ◽

Differential Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Single Species ◽

Rna Seq ◽

Species Analysis ◽

Differential Gene ◽

Multiple Species ◽

Downstream Analysis

Advances in transcriptome sequencing allow for simultaneous interrogation of differentially expressed genes from multiple species originating from a single RNA sample, termed dual or multi-species transcriptomics. Compared to single-species differential expression analysis, the design of multi-species differential expression experiments must account for the relative abundances of each organism of interest within the sample, often requiring enrichment methods and yielding differences in total read counts across samples. The analysis of multi-species transcriptomics datasets requires modifications to the alignment, quantification, and downstream analysis steps compared to the single-species analysis pipelines. We describe best practices for multi-species transcriptomics and differential gene expression.
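One way to see why relative organism abundance matters for quantification: if counts are scaled by the total reads in a sample, a low-abundance organism's genes look uniformly weakly expressed. The Python sketch below scales each gene by the library size of its own species instead. It is an illustration under stated assumptions, not the procedure recommended in the paper. The function name, the nested-dictionary input layout, and the choice of counts per million are all hypothetical.

```python
def per_species_cpm(counts, gene_species):
    """Scale each gene's count by the library size of its own species.

    counts: {sample: {gene: raw_count}}
    gene_species: {gene: species_name}
    Returns {sample: {gene: counts per million within that species}}.
    """
    result = {}
    for sample, gene_counts in counts.items():
        # Per-species library size for this sample.
        lib_size = {}
        for gene, n in gene_counts.items():
            species = gene_species[gene]
            lib_size[species] = lib_size.get(species, 0) + n
        scaled = {}
        for gene, n in gene_counts.items():
            total = lib_size[gene_species[gene]]
            # Guard against a species with no reads in this sample.
            scaled[gene] = n / total * 1e6 if total else 0.0
        result[sample] = scaled
    return result


if __name__ == "__main__":
    counts = {"s1": {"hostA": 900, "hostB": 100, "pathA": 5, "pathB": 15}}
    species = {"hostA": "host", "hostB": "host",
               "pathA": "pathogen", "pathB": "pathogen"}
    print(per_species_cpm(counts, species))
```

In this toy sample the pathogen contributes only 20 of 1,020 reads, yet within-species scaling still shows that pathA accounts for a quarter of the pathogen's reads.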

The long and the short of it: unlocking nanopore long-read RNA sequencing data with short-read differential expression analysis tools

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqab028 ◽

2021 ◽

Vol 3 (2) ◽

Author(s):

Xueyi Dong ◽

Luyi Tian ◽

Quentin Gouil ◽

Hasaru Kariyawasam ◽

Shian Su ◽

...

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Transcriptomic Analysis ◽

Statistical Testing ◽

Rna Seq ◽

Sequencing Data ◽

Short Read ◽

Sequencing Platform ◽

Long Read

Application of Oxford Nanopore Technologies’ long-read sequencing platform to transcriptomic analysis is increasing in popularity. However, such analysis can be challenging due to the high sequence error and small library sizes, which decrease quantification accuracy and reduce power for statistical testing. Here, we report the analysis of two nanopore RNA-seq datasets with the goal of obtaining gene- and isoform-level differential expression information. A dataset of synthetic, spliced, spike-in RNAs (‘sequins’) and a mouse neural stem cell dataset from samples with a null mutation of the epigenetic regulator Smchd1 were analysed using a mix of long-read specific tools for preprocessing together with established short-read RNA-seq methods for downstream analysis. We used limma-voom to perform differential gene expression analysis, and the novel FLAMES pipeline to perform isoform identification and quantification, followed by DRIMSeq and limma-diffSplice (with stageR) to perform differential transcript usage analysis. We compared results from the sequins dataset to the ground truth, and results of the mouse dataset to a previous short-read study on equivalent samples. Overall, our work shows that transcriptomic analysis of long-read nanopore data using long-read specific preprocessing methods together with short-read differential expression methods and software that are already in wide use can yield meaningful results.
