Fast and interpretable alternative splicing and differential gene-level expression analysis using transcriptome segmentation with Yanagi

Mapping Intimacies ◽

10.1101/364281 ◽

2018 ◽

Author(s):

Mohamed K Gunady ◽

Stephen M Mount ◽

Héctor Corrada Bravo

Keyword(s):

Gene Expression ◽

Alternative Splicing ◽

Expression Analysis ◽

Homo Sapiens ◽

Small Subset ◽

Rna Seq ◽

Multiple Transcripts ◽

Alignment Algorithms ◽

Gene Level ◽

Comparable Performance

Abstract Introduction: Analysis of differential alternative splicing from RNA-seq data is complicated by the fact that many RNA-seq reads map to multiple transcripts; moreover, the annotated transcripts are often a small subset of the possible transcripts of a gene. Here we describe Yanagi, a tool for segmenting a transcriptome to create a library of maximal L-disjoint segments from a complete transcriptome annotation. That segment library preserves all transcriptome substrings of length L and the structural relationships between transcripts while eliminating unnecessary sequence duplication. Contributions: In this paper, we formalize the concept of transcriptome segmentation and propose an efficient algorithm for generating segment libraries based on a length parameter that depends on the specific RNA-seq library construction. The resulting segment sequences can be used with pseudo-alignment tools to quantify expression at the segment level. We characterize the segment libraries for the reference transcriptomes of Drosophila melanogaster and Homo sapiens and provide gene-level visualization of the segments for better interpretability. We then demonstrate the use of segment-level quantification in gene expression and alternative splicing analysis. The notion of transcript segmentation as introduced here and implemented in Yanagi opens the door for the application of lightweight, ultra-fast pseudo-alignment algorithms in a wide variety of RNA-seq analyses. Conclusion: Using a segment library rather than the standard transcriptome significantly reduces ambiguous alignments, in which reads are multi-mapped to several sequences in the reference. This makes it possible to avoid the quantification step required by standard k-mer-based pipelines for gene expression analysis. Moreover, using segment counts as statistics for alternative splicing analysis achieves performance comparable to counting-based approaches (e.g. rMATS) while using fast and lightweight pseudo-alignment.


Plant buffering against the high-light stress-induced accumulation of CsGA2ox8 transcripts via alternative splicing to finely tune gibberellin levels and maintain hypocotyl elongation

Horticulture Research ◽

10.1038/s41438-020-00430-w ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Bin Liu ◽

Shuo Zhao ◽

Pengli Li ◽

Yilu Yin ◽

Qingliang Niu ◽

...

Keyword(s):

Alternative Splicing ◽

Light Intensity ◽

Intron Retention ◽

Light Stress ◽

Photosynthetic Photon Flux Density ◽

Photon Flux ◽

Photon Flux Density ◽

Rna Seq ◽

Multiple Transcripts ◽

Novel Transcript

Abstract In plants, alternative splicing (AS) is markedly induced in response to environmental stresses, but it is unclear why plants generate multiple transcripts under stress conditions. In this study, RNA-seq was performed to identify AS events in cucumber seedlings grown under different light intensities. We identified a novel transcript of the gibberellin (GA)-deactivating enzyme Gibberellin 2-beta-dioxygenase 8 (CsGA2ox8). Compared with canonical CsGA2ox8.1, the CsGA2ox8.2 isoform presented intron retention between the second and third exons. Functional analysis proved that the transcript of CsGA2ox8.1 but not CsGA2ox8.2 played a role in the deactivation of bioactive GAs. Moreover, expression analysis demonstrated that both transcripts were upregulated by increased light intensity, but the expression level of CsGA2ox8.1 increased slowly when the light intensity was >400 µmol·m−2·s−1 PPFD (photosynthetic photon flux density), while the CsGA2ox8.2 transcript levels increased rapidly when the light intensity was >200 µmol·m−2·s−1 PPFD. Our findings provide evidence that plants might finely tune their GA levels by buffering against the normal transcripts of CsGA2ox8 through AS.


Multi-species transcriptome meta-analysis of the response to retinoic acid in vertebrates and comparative analysis of the effects of retinol and retinoic acid on gene expression in LMH cells

BMC Genomics ◽

10.1186/s12864-021-07451-2 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Clemens Falker-Gieske ◽

Andrea Mott ◽

Sören Franzenburg ◽

Jens Tetens

Keyword(s):

Gene Expression ◽

Retinoic Acid ◽

Cellular Response ◽

Homo Sapiens ◽

Meta Analysis ◽

Early Response ◽

Rna Seq ◽

Transcriptomic Response ◽

Time Points ◽

Lmh Cells

Abstract Background Retinol (RO) and its active metabolite retinoic acid (RA) are major regulators of gene expression in vertebrates and influence various processes like organ development, cell differentiation, and immune response. To characterize a general transcriptomic response to RA-exposure in vertebrates, independent of species- and tissue-specific effects, four publicly available RNA-Seq datasets from Homo sapiens, Mus musculus, and Xenopus laevis were analyzed. To increase species and cell-type diversity we generated RNA-seq data with chicken hepatocellular carcinoma (LMH) cells. Additionally, we compared the response of LMH cells to RA and RO at different time points. Results By conducting a transcriptome meta-analysis, we identified three retinoic acid response core clusters (RARCCs) consisting of 27 interacting proteins, seven of which have not been associated with retinoids yet. Comparison of the transcriptional response of LMH cells to RO and RA exposure at different time points led to the identification of non-coding RNAs (ncRNAs) that are only differentially expressed (DE) during the early response. Conclusions We propose that these RARCCs stand on top of a common regulatory RA hierarchy among vertebrates. Based on the protein sets included in these clusters we were able to identify an RA-response cluster, a control center type cluster, and a cluster that directs cell proliferation. Concerning the comparison of the cellular response to RA and RO we conclude that ncRNAs play an underestimated role in retinoid-mediated gene regulation.


Genome-wide gene expression analysis of amphioxus (Branchiostoma belcheri) following lipopolysaccharide challenge using strand-specific RNA-seq

RNA Biology ◽

10.1080/15476286.2017.1367890 ◽

2017 ◽

Vol 14 (12) ◽

pp. 1799-1809 ◽

Cited By ~ 10

Author(s):

Qi-Lin Zhang ◽

Qian-Hua Zhu ◽

Zheng-Qing Xie ◽

Bin Xu ◽

Xiu-Qiang Wang ◽

...

Keyword(s):

Gene Expression ◽

Expression Analysis ◽

Gene Expression Analysis ◽

Rna Seq ◽

Genome Wide ◽

Branchiostoma Belcheri ◽

Genome Wide Gene Expression ◽

Lipopolysaccharide Challenge


Yanagi: Fast and interpretable segment-based alternative splicing and gene expression analysis

BMC Bioinformatics ◽

10.1186/s12859-019-2947-6 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 1

Author(s):

Mohamed K Gunady ◽

Stephen M Mount ◽

Héctor Corrada Bravo

Keyword(s):

Gene Expression ◽

Alternative Splicing ◽

Expression Analysis ◽

Gene Expression Analysis


RNAflow: An Effective and Simple RNA-Seq Differential Gene Expression Pipeline Using Nextflow

Genes ◽

10.3390/genes11121487 ◽

2020 ◽

Vol 11 (12) ◽

pp. 1487

Author(s):

Marie Lataretu ◽

Martin Hölzer

Keyword(s):

Gene Expression ◽

Homo Sapiens ◽

Standard Technique ◽

Common Species ◽

Rna Seq ◽

Rna Molecules ◽

Gene Filtering ◽

Differential Gene ◽

High Level ◽

Very High

RNA-Seq enables the identification and quantification of RNA molecules, often with the aim of detecting differentially expressed genes (DEGs). Although RNA-Seq has evolved into a standard technique, there is no universal gold standard for the computational analysis of these data. On top of that, previous studies proved the irreproducibility of RNA-Seq studies. Here, we present a portable, scalable, and parallelizable Nextflow RNA-Seq pipeline to detect DEGs, which assures a high level of reproducibility. The pipeline automatically takes care of common pitfalls, such as ribosomal RNA removal and low abundance gene filtering. Apart from various visualizations for the DEG results, we incorporated downstream pathway analysis for common species such as Homo sapiens and Mus musculus. We evaluated the DEG detection functionality using qRT-PCR data as a reference and observed a very high correlation of the logarithmized gene expression fold changes.
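
The evaluation mentioned at the end of this abstract, correlating logarithmized fold changes from RNA-Seq with those from qRT-PCR, can be illustrated with a minimal Python sketch. This is not the RNAflow implementation: the function names, the pseudocount of 1, and the choice of Pearson correlation are assumptions made here for illustration only.

```python
import math

def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of two expression values.

    The pseudocount (an assumption, not taken from the abstract)
    avoids division by zero for unexpressed genes.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)
```

Given per-gene log2 fold changes from both methods as two lists in the same gene order, `pearson(rnaseq_lfc, qpcr_lfc)` would return a value close to 1 when the two methods agree; both list names are hypothetical.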


Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data

Genome Biology ◽

10.1186/gb-2013-14-9-r95 ◽

2013 ◽

Vol 14 (9) ◽

pp. R95 ◽

Cited By ~ 408

Author(s):

Franck Rapaport ◽

Raya Khanin ◽

Yupu Liang ◽

Mono Pirun ◽

Azra Krek ◽

...

Keyword(s):

Gene Expression ◽

Differential Gene Expression ◽

Expression Analysis ◽

Gene Expression Analysis ◽

Comprehensive Evaluation ◽

Rna Seq ◽

Differential Gene Expression Analysis ◽

Analysis Methods ◽

Differential Gene


NGS-based targeted RNA sequencing for expression analysis of patients with triple-negative breast cancer using a modulized, 96-gene biomarker panel.

Journal of Clinical Oncology ◽

10.1200/jco.2012.30.30_suppl.56 ◽

2012 ◽

Vol 30 (30_suppl) ◽

pp. 56-56

Author(s):

Byung-In Lee ◽

Kahuku Oades ◽

Lien Vo ◽

Jerry Lee ◽

Mark Landers ◽

...

Keyword(s):

Breast Cancer ◽

Gene Expression ◽

Sequence Analysis ◽

Expression Analysis ◽

Gene Expression Analysis ◽

Triple Negative ◽

Dynamic Range ◽

Rna Seq ◽

Single Tube ◽

Qrt Pcr

56 Background: Gene expression profiling has been shown to be effective in analyzing postoperative tumor samples in various cancers. However, in analyzing small specimens such as core biopsies, the limited amount of available material makes multi-gene analyses difficult or impossible. Microarray-based analyses also provide limited dynamic range. We describe the development of targeted RNA-sequencing methodology which combines the power of a universal RNA amplification with NGS for an ultra-deep expression analysis of multiple target genes, enabling <100 ng of sample input for multi-gene analysis in a single tube format. Methods: The gene expression patterns of triple-negative breast cancer FFPE samples were analyzed using a 96-gene breast cancer biomarker panel across three different platforms: Affymetrix Human Gene ST 1.0 microarrays, a pre-developed OncoScore qRT-PCR panel, and targeted RNA-seq. For targeted RNA-seq analysis, the 96-gene panel was amplified using a universal, single-tube “XP-PCR” amplification strategy followed by sequence analysis using the Ion-Torrent Personal Genome Machine. Results: Targeted RNA-seq provided the most sensitivity in terms of detection rates with <100 ng FFPE RNA input and provides unlimited dynamic range with increased sequencing depth. Expression ratio compression issues typically associated with a high number of pre-amplification cycles in standard multiplex-primed methods were not observed here. Low expressing genes, undetectable by qRT-PCR analysis from 1,000 ng input FFPE RNA, were detected and eligible for expression analysis with a significant number of sequencing reads. Alternative transcription/splicing analysis is also possible from sequence analysis of the target transcripts using targeted RNA-seq. Conclusions: By combining universally primed pre-amplification and NGS in multi-gene expression analysis, targeted RNA-seq provides the most sensitive gene expression analysis methodology.


Evaluation of two public genome references for chinese hamster ovary cells in the context of rna-seq based gene expression analysis

Biotechnology and Bioengineering ◽

10.1002/bit.26290 ◽

2017 ◽

Vol 114 (7) ◽

pp. 1603-1613 ◽

Cited By ~ 9

Author(s):

Chun Chen ◽

Huong Le ◽

Chetan T. Goudar

Keyword(s):

Gene Expression ◽

Expression Analysis ◽

Chinese Hamster Ovary ◽

Gene Expression Analysis ◽

Chinese Hamster Ovary Cells ◽

Chinese Hamster ◽

Rna Seq ◽

Ovary Cells


Dysregulation of Splicing in Multiple Myeloma: The Splicing Factor SRSF1 Supports MM Cell Proliferation Via Splicing Control

Blood ◽

10.1182/blood-2018-99-118845 ◽

2018 ◽

Vol 132 (Supplement 1) ◽

pp. 4500-4500

Author(s):

Mariateresa Fulciniti ◽

Michael A Lopez ◽

Anil Aktas Samur ◽

Eugenio Morelli ◽

Hervé Avet-Loiseau ◽

...

Keyword(s):

Gene Expression ◽

Multiple Myeloma ◽

Alternative Splicing ◽

Board Of Directors ◽

Research Funding ◽

Splicing Factor ◽

Splicing Factors ◽

Rna Seq ◽

Advisory Committees ◽

Disease Biology

Abstract Gene expression profiling has provided interesting insights into disease biology, helped develop new risk stratification, and identified novel druggable targets in multiple myeloma (MM). However, alternative pre-mRNA splicing (AS) has a significant impact as one of the key transcriptome modifiers. Spliced variants increase transcriptomic complexity, and their misregulation affects disease behavior, impacting therapeutic considerations in various disease processes including cancer. Our large, well-annotated deep RNA sequencing data from purified MM cells from 420 newly diagnosed, homogeneously treated patients identified 1534 genes with one or more splicing events observed in at least 10% of patients. The median number of alternative splicing events per patient was 595 (range 223 - 2735). These global alternative splicing events in MM involve aberrant splicing of critical growth and survival genes, affecting disease biology as well as overall survival. Moreover, the decrease in cell viability observed in a large panel of MM cell lines after inhibition of splicing at the pre-mRNA complex and stalling at the A complex confirmed that MM cells are exquisitely sensitive to pharmacological inhibition of splicing. Based on these data, we further focused on understanding the molecular mechanisms driving aberrant alternative splicing in MM. An increasing body of evidence indicates that altered expression of regulatory splicing factors (SF) can have oncogenic properties by impacting AS of cancer-associated genes. We used our large RNA-seq dataset to create a genome-wide global alterations map of SF and identified several splicing factors significantly dysregulated in MM compared to normal plasma cells, with impact on clinical outcome.
The splicing factor Serine and Arginine Rich Splicing Factor 1 (SRSF1), which regulates initiation of spliceosome assembly, was selected for further evaluation, as its impact on clinical outcome was confirmed in two additional independent myeloma datasets. In gain-of-function (GOF) studies, enforced expression of SRSF1 in MM cells significantly increased proliferation, especially in the presence of bone marrow stromal cells; conversely, in loss-of-function (LOF) studies, downregulation of SRSF1 using stable or doxy-inducible shRNA systems significantly inhibited MM cell proliferation and survival over time. We utilized SRSF1 mutants to dissect the mechanisms involved in SRSF1-mediated MM growth induction, and observed that the growth-promoting effect of SRSF1 in MM cells was mainly due to its splicing activity. We next investigated the impact of SRSF1 on allelic isoforms of specific gene targets by RNA-seq in LOF studies and confirmed the results in GOF studies. Splicing profiles showed widespread changes in AS induced by SRSF1 knockdown. The most recurrent splicing events, as compared to control cells, were skipped exon (SE) and alternative first (AF) exon splicing. SE splice events were primarily upregulated, and AF splice events were evenly upregulated and downregulated. Genes in which splicing events in these categories occurred mostly did not show a significant difference in overall gene expression level when compared to control, following SRSF1 depletion. When analyzing cellular functions of SRSF1-regulated splicing events, we found that SRSF1 knockdown affects genes in the RNA processing pathway as well as genes involved in cancer-related functions such as mTOR and MYC-related pathways. Splicing analysis was corroborated with immunoprecipitation (IP) followed by mass spectrometry (MS) analysis of T7-tagged SRSF1 MM cells.
We have observed increased levels of SRSF phosphorylation, which regulates its subcellular localization and activity, in MM cell lines and primary patient MM cells compared to normal donor PBMCs. Moreover, we evaluated the chemical compound TG003, an inhibitor of Cdc2-like kinase (CLK) 1 and 4, which regulate splicing by fine-tuning the phosphorylation of SR proteins. Treatment with TG003 decreased SRSF1 phosphorylation, preventing spliceosome assembly and inducing a dose-dependent inhibition of MM cell viability. In conclusion, we provide mechanistic insights into myeloma-related splicing dysregulation and establish SRSF1 as a tumor-promoting gene with therapeutic potential. Disclosures Avet-Loiseau: Janssen: Consultancy, Membership on an entity's Board of Directors or advisory committees; Celgene: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Sanofi: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Abbvie: Membership on an entity's Board of Directors or advisory committees; Amgen: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Takeda: Membership on an entity's Board of Directors or advisory committees, Research Funding. Munshi: OncoPep: Other: Board of director.


ABioTrans: A Biostatistical tool for Transcriptomics Analysis

10.1101/616300 ◽

2019 ◽

Author(s):

Zou Yutong ◽

Bui Thuy Tien ◽

Kumar Selvarajoo

Keyword(s):

Gene Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Gene Expression Omnibus ◽

Rna Seq ◽

Distribution Fitting ◽

Web Browser ◽

Genome Wide ◽

Data Files ◽

R Packages

Abstract Here we report a bio-statistical/informatics tool, ABioTrans, developed in R for gene expression analysis. The tool allows the user to directly read RNA-Seq data files deposited in the Gene Expression Omnibus (GEO) database. Operated using any web browser application, ABioTrans provides easy options for multiple statistical distribution fitting, Pearson and Spearman rank correlations, PCA, k-means and hierarchical clustering, differential expression analysis, Shannon entropy and noise (square of coefficient of variation) analyses, as well as Gene Ontology classifications. Availability and implementation: ABioTrans is available at https://github.com/buithuytien/ABioTrans. Operating system(s): Platform independent (web browser). Programming language: R (R studio). Other requirements: Bioconductor genome wide annotation databases, R-packages (shiny, LSD, fitdistrplus, actuar, entropy, moments, RUVSeq, edgeR, DESeq2, NOISeq, AnnotationDbi, ComplexHeatmap, circlize, clusterProfiler, reshape2, DT, plotly, shinycssloaders, dplyr, ggplot2). These packages will automatically be installed when ABioTrans.R is executed in R studio. No restrictions on non-academic use.
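
Two of the statistics named in this abstract, Shannon entropy and noise defined as the squared coefficient of variation, can be sketched in a few lines. The sketch below is in Python rather than R and is not ABioTrans code: the function names, the equal-width histogram binning, the default of 10 bins, and the use of the population variance are all assumptions made for illustration.

```python
import math
from statistics import mean, pvariance

def shannon_entropy(values, bins=10):
    """Shannon entropy in bits of an equal-width histogram of the values.

    The binning scheme is an assumption; the abstract does not say how
    expression values are discretized.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # all values identical: a single occupied bin
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)

def noise(values):
    """Squared coefficient of variation: variance / mean**2.

    Population variance is used here as an assumption.
    """
    m = mean(values)
    return pvariance(values) / (m * m)
```

For example, `noise([2, 4, 4, 4, 5, 5, 7, 9])` has mean 5 and population variance 4, so the noise is 4 / 25 = 0.16.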
