SPORTS1.0: A Tool for Annotating and Profiling Non-coding RNAs Optimized for rRNA- and tRNA-derived Small RNAs

Mapping Intimacies ◽

10.1101/296970 ◽

2018 ◽

Author(s):

Junchao Shi ◽

Eun-A Ko ◽

Kenton M. Sanders ◽

Qi Chen ◽

Tong Zhou

Keyword(s):

Cell Types ◽

Mouse Cell ◽

Rapid Expansion ◽

Rna Modification ◽

Rna Seq ◽

Additional Species ◽

Wide Range ◽

Non Coding Rnas ◽

Nucleolar Rna ◽

Mismatch Rate

Abstract High-throughput RNA-seq has revolutionized the process of small RNA (sRNA) discovery, leading to a rapid expansion of sRNA categories. In addition to the previously well-characterized sRNAs such as microRNAs (miRNAs), Piwi-interacting RNAs (piRNAs), and small nucleolar RNAs (snoRNAs), recent studies have spotlighted tRNA-derived sRNAs (tsRNAs) and rRNA-derived sRNAs (rsRNAs) as new categories of sRNAs that bear versatile functions. Since existing software and pipelines for sRNA annotation are mostly focused on analyzing miRNAs or piRNAs, here we developed the sRNA annotation pipeline optimized for rRNA- and tRNA-derived sRNAs (SPORTS1.0). SPORTS1.0 is optimized for analyzing tsRNAs and rsRNAs from sRNA-seq data, in addition to its capacity to annotate canonical sRNAs such as miRNAs and piRNAs. Moreover, SPORTS1.0 can predict potential RNA modification sites based on nucleotide mismatches within sRNAs. SPORTS1.0 is precompiled to annotate sRNAs for a wide range of 68 species across bacteria, yeast, plant, and animal kingdoms, and additional species can be readily added based on end users' input. For demonstration, by analyzing sRNA datasets using SPORTS1.0, we reveal that distinct signatures are present in tsRNAs and rsRNAs from different mouse cell types. We also find that, compared to other sRNA species, tsRNAs bear the highest mismatch rate, which is consistent with their highly modified nature. SPORTS1.0 is open-source software and can be publicly accessed at https://github.com/junchaoshi/sports1.0.
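The idea of flagging potential modification sites from nucleotide mismatches can be illustrated with a minimal sketch. This is not SPORTS1.0's actual implementation: the function names, the ungapped-alignment input format, and the coverage and fraction thresholds are all assumptions made for illustration.

```python
from collections import defaultdict

def mismatch_rates(reference, alignments):
    """Return {position: (mismatches, coverage)} for ungapped alignments.

    `alignments` is a list of (start, read) pairs, where `start` is the
    0-based offset of the read on `reference` (a hypothetical input format).
    """
    counts = defaultdict(lambda: [0, 0])  # position -> [mismatches, coverage]
    for start, read in alignments:
        for offset, base in enumerate(read):
            pos = start + offset
            if pos >= len(reference):
                break  # ignore bases hanging past the reference end
            counts[pos][1] += 1
            if base != reference[pos]:
                counts[pos][0] += 1
    return {pos: tuple(c) for pos, c in counts.items()}

def candidate_sites(rates, min_coverage=3, min_fraction=0.5):
    """Positions whose mismatch fraction meets the (assumed) thresholds."""
    return sorted(
        pos for pos, (mm, cov) in rates.items()
        if cov >= min_coverage and mm / cov >= min_fraction
    )

# Toy data: three of the four reads carry a C instead of T at position 3.
reference = "GCATTGGTGGTTCAGTGG"
alignments = [
    (0, "GCATTGGT"),
    (0, "GCACTGGT"),
    (0, "GCACTGGTGG"),
    (2, "ACTGGTGG"),
]
rates = mismatch_rates(reference, alignments)
sites = candidate_sites(rates)
```

On this toy input, position 3 is covered by four reads, three of which mismatch the reference, so it is the only position reported in `sites`. A real pipeline would also need to separate modification-induced mismatches from sequencing errors and genomic variants.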

RNA-Seq Data-Mining Allows the Discovery of Two Long Non-Coding RNA Biomarkers of Viral Infection in Humans

International Journal of Molecular Sciences ◽

10.3390/ijms21082748 ◽

2020 ◽

Vol 21 (8) ◽

pp. 2748 ◽

Cited By ~ 1

Author(s):

Ruth Barral-Arca ◽

Alberto Gómez-Carballa ◽

Miriam Cebey-López ◽

María José Currás-Tuala ◽

Sara Pischedda ◽

...

Keyword(s):

Gene Expression ◽

Viral Infections ◽

Umbilical Vein ◽

Cell Types ◽

Dermal Fibroblasts ◽

Learning Approaches ◽

Rna Seq ◽

Wide Range ◽

Healthy Control ◽

Umbilical Vein Endothelial Cells

There is a growing interest in unraveling gene expression mechanisms leading to viral host invasion and infection progression. Current findings reveal that long non-coding RNAs (lncRNAs) are implicated in the regulation of the immune system by influencing gene expression through a wide range of mechanisms. By mining whole-transcriptome shotgun sequencing (RNA-seq) data using machine learning approaches, we detected two lncRNAs (ENSG00000254680 and ENSG00000273149) that are downregulated in a wide range of viral infections and different cell types, including blood mononuclear cells, umbilical vein endothelial cells, and dermal fibroblasts. The efficiency of these two lncRNAs was positively validated in different viral phenotypic scenarios. These two lncRNAs showed a strong downregulation in virus-infected patients when compared to healthy control transcriptomes, indicating that these biomarkers are promising targets for infection diagnosis. To the best of our knowledge, this is the very first study using host lncRNA biomarkers for the diagnosis of human viral infections.

Comparison of Poly-A+ Selection and rRNA Depletion in Detection of lncRNA in Two Equine Tissues Using RNA-seq

Non-Coding RNA ◽

10.3390/ncrna6030032 ◽

2020 ◽

Vol 6 (3) ◽

pp. 32 ◽

Cited By ~ 1

Author(s):

Anna R. Dahlgren ◽

Erica Y. Scott ◽

Tamer Mansour ◽

Erin N. Hales ◽

Pablo J. Ross ◽

...

Keyword(s):

Ribosomal Rna ◽

Preparation Method ◽

Parietal Lobe ◽

Library Preparation ◽

Rna Seq ◽

The Past ◽

Rrna Depletion ◽

Non Coding Rnas ◽

Nucleolar Rna ◽

Library Preparation Method

Long non-coding RNAs (lncRNAs) are untranslated regulatory transcripts longer than 200 nucleotides that can play a role in transcriptional, post-translational, and epigenetic regulation. Traditionally, RNA-sequencing (RNA-seq) libraries have been created by isolating transcriptomic RNA via poly-A+ selection. In the past 10 years, methods to perform ribosomal RNA (rRNA) depletion of total RNA have been developed as an alternative, aiming for better coverage of whole transcriptomic RNA, both polyadenylated and non-polyadenylated transcripts. The purpose of this study was to determine which library preparation method is optimal for lncRNA investigations in the horse. Using liver and cerebral parietal lobe tissues from two healthy Thoroughbred mares, RNA-seq libraries were prepared using standard poly-A+ selection and rRNA-depletion methods. Averaging the two biologic replicates, poly-A+ selection yielded 327 and 773 more unique lncRNA transcripts for liver and parietal lobe, respectively. More lncRNA were found to be unique to poly-A+ selected libraries, and rRNA-depletion identified small nucleolar RNA (snoRNA) to have a higher relative expression than in the poly-A+ selected libraries. Overall, poly-A+ selection provides a more thorough identification of total lncRNA in equine tissues while rRNA-depletion may allow for easier detection of snoRNAs.

A Review on the Role of Small Nucleolar RNA Host Gene 6 Long Non-coding RNAs in the Carcinogenic Processes

Frontiers in Cell and Developmental Biology ◽

10.3389/fcell.2021.741684 ◽

2021 ◽

Vol 9 ◽

Author(s):

Soudeh Ghafouri-Fard ◽

Tayyebeh Khoshbakht ◽

Mohammad Taheri ◽

Seyedpouzhia Shojaei

Keyword(s):

Splice Variants ◽

Cell Types ◽

Host Gene ◽

Small Nucleolar Rna ◽

Non Coding Rnas ◽

Almost All ◽

Nucleolar Rna ◽

Tgf Β1

Located on 17q25.1, small nucleolar RNA host gene 16 (SNHG16) is a member of the SNHG family of long non-coding RNAs (lncRNAs) with 4 exons and 13 splice variants. This lncRNA serves as a sponge for a variety of miRNAs, namely miR-520a-3p, miR-4500, miR-146a, miR-16-5p, miR-98, let-7a-5p, hsa-miR-93, miR-17-5p, miR-186, miR-302a-3p, miR-605-3p, miR-140-5p, miR-195, let-7b-5p, miR-16, miR-340, miR-1301, miR-205, miR-488, miR-1285-3p, miR-146a-5p, and miR-124-3p. This lncRNA can affect the activity of the TGF-β1/SMAD5, mTOR, NF-κB, Wnt, RAS/RAF/MEK/ERK and PI3K/AKT pathways. Almost all studies have reported an oncogenic effect of SNHG16 in diverse cell types. Here, we explain the results of studies about the oncogenic role of SNHG16 according to three distinct sets of evidence, i.e., in vitro, animal, and clinical evidence.

ASAP: A web-based platform for the analysis and interactive visualization of single-cell RNA-seq data

10.1101/096222 ◽

2016 ◽

Cited By ~ 5

Author(s):

Vincent Gardeux ◽

Fabrice David ◽

Adrian Shajkofci ◽

Petra C Schwalie ◽

Bart Deplancke

Keyword(s):

Single Cell ◽

Single Cell Analysis ◽

Transcriptome Profiling ◽

Cell Types ◽

Complete Analysis ◽

Marker Genes ◽

Specific Marker ◽

Rna Seq ◽

Web Based ◽

Wide Range

Abstract Motivation Single-cell RNA-sequencing (scRNA-seq) allows whole transcriptome profiling of thousands of individual cells, enabling the molecular exploration of tissues at the cellular level. Such analytical capacity is of great interest to many research groups in the world, yet, these groups often lack the expertise to handle complex scRNA-seq data sets. Results We developed a fully integrated, web-based platform aimed at the complete analysis of scRNA-seq data post genome alignment: from the parsing, filtering, and normalization of the input count data files, to the visual representation of the data, identification of cell clusters, differentially expressed genes (including cluster-specific marker genes), and functional gene set enrichment. This Automated Single-cell Analysis Pipeline (ASAP) combines a wide range of commonly used algorithms with sophisticated visualization tools. Compared with existing scRNA-seq analysis platforms, researchers (including those lacking computational expertise) are able to interact with the data in a straightforward fashion and in real time. Furthermore, given the overlap between scRNA-seq and bulk RNA-seq analysis workflows, ASAP should conceptually be broadly applicable to any RNA-seq dataset. As a validation, we demonstrate how we can use ASAP to simply reproduce the results from a single-cell study of 91 mouse cells involving five distinct cell types. Availability The tool is freely available at http://[email protected]
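The filtering and normalization step named in the abstract can be sketched generically. This is not ASAP's code: the function name, the dictionary input format, the `min_total` cutoff, and the choice of log2(CPM + 1) are assumptions picked as one common convention among several.

```python
import math

def filter_and_normalize(counts, min_total=5):
    """counts: {gene: [count per cell]}; returns log2(CPM + 1) per kept gene.

    Genes whose summed count is below `min_total` (an assumed cutoff)
    are dropped before normalization.
    """
    kept = {g: row for g, row in counts.items() if sum(row) >= min_total}
    n_cells = len(next(iter(kept.values()))) if kept else 0
    # Library size per cell, computed on the retained genes only.
    lib = [sum(row[c] for row in kept.values()) for c in range(n_cells)]
    return {
        g: [math.log2(row[c] / lib[c] * 1e6 + 1) if lib[c] else 0.0
            for c in range(n_cells)]
        for g, row in kept.items()
    }

# Toy count matrix: three genes measured in three cells.
counts = {
    "GeneA": [10, 0, 30],
    "GeneB": [30, 20, 10],
    "GeneC": [1, 0, 0],
}
normalized = filter_and_normalize(counts)
```

Here `GeneC` is removed by the filter, and the remaining counts are rescaled per cell so that cells with different sequencing depth become comparable before clustering or differential expression.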

scTIM: seeking cell-type-indicative marker from single cell RNA-seq data by consensus optimization

Bioinformatics ◽

10.1093/bioinformatics/btz936 ◽

2019 ◽

Vol 36 (8) ◽

pp. 2474-2485 ◽

Cited By ~ 2

Author(s):

Zhanying Feng ◽

Xianwen Ren ◽

Yuan Fang ◽

Yining Yin ◽

Chutian Huang ◽

...

Keyword(s):

Single Cell ◽

Large Scale ◽

Cell Types ◽

Mouse Cell ◽

Supplementary Information ◽

Rna Seq ◽

Cell Type ◽

Robust Solution ◽

Development Trajectory ◽

Consensus Optimization

Abstract Motivation Single cell RNA-seq data offers us new resource and resolution to study cell type identity and its conversion. However, data analyses are challenging in dealing with noise, sparsity and poor annotation at single cell resolution. Detecting cell-type-indicative markers is promising to help denoising, clustering and cell type annotation. Results We developed a new method, scTIM, to reveal cell-type-indicative markers. scTIM is based on a multi-objective optimization framework to simultaneously maximize gene specificity by considering gene-cell relationship, maximize gene’s ability to reconstruct cell–cell relationship and minimize gene redundancy by considering gene–gene relationship. Furthermore, consensus optimization is introduced for robust solution. Experimental results on three diverse single cell RNA-seq datasets show scTIM’s advantages in identifying cell types (clustering), annotating cell types and reconstructing cell development trajectory. Applying scTIM to the large-scale mouse cell atlas data identifies critical markers for 15 tissues as ‘mouse cell marker atlas’, which allows us to investigate identities of different tissues and subtle cell types within a tissue. scTIM will serve as a useful method for single cell RNA-seq data mining. Availability and implementation scTIM is freely available at https://github.com/Frank-Orwell/scTIM. Supplementary information Supplementary data are available at Bioinformatics online.

SAME-clustering: Single-cell Aggregated Clustering via Mixture Model Ensemble

10.1101/645820 ◽

2019 ◽

Cited By ~ 1

Author(s):

Ruth Huh ◽

Yuchen Yang ◽

Yuchao Jiang ◽

Yin Shen ◽

Yun Li

Keyword(s):

Single Cell ◽

Mixture Model ◽

Single Cells ◽

Cell Types ◽

Rna Seq ◽

Model Ensemble ◽

Clustering Ensemble ◽

Number Of Clusters ◽

Wide Range ◽

Level Cluster

Abstract Clustering is an essential step in the analysis of single cell RNA-seq (scRNA-seq) data to shed light on tissue complexity including the number of cell types and transcriptomic signatures of each cell type. Due to its importance, novel methods have been developed recently for this purpose. However, different approaches generate varying estimates regarding the number of clusters and the single-cell level cluster assignments. This type of unsupervised clustering is challenging and it is often times hard to gauge which method to use because none of the existing methods outperform others across all scenarios. We present SAME-clustering, a mixture model-based approach that takes clustering solutions from multiple methods and selects a maximally diverse subset to produce an improved ensemble solution. We tested SAME-clustering across 15 scRNA-seq datasets generated by different platforms, with number of clusters varying from 3 to 15, and number of single cells from 49 to 32,695. Results show that our SAME-clustering ensemble method yields enhanced clustering, in terms of both cluster assignments and number of clusters. The mixture model ensemble clustering is not limited to clustering scRNA-seq data and may be useful to a wide range of clustering applications.
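To make the notion of an ensemble solution concrete, the sketch below combines several labelings of the same cells. It deliberately swaps in a much simpler technique than the one the abstract describes: pairwise co-association voting followed by linking, not SAME-clustering's mixture model or its diverse-subset selection. The function name and the 0.5 agreement threshold are assumptions.

```python
from itertools import combinations

def consensus_clusters(labelings, threshold=0.5):
    """Merge cells that co-cluster in more than `threshold` of the labelings.

    `labelings` is a list of equal-length label lists, one per method.
    Label values need not match across methods; only co-membership counts.
    """
    n = len(labelings[0])
    parent = list(range(n))  # union-find forest over cells

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        agree = sum(lab[i] == lab[j] for lab in labelings) / len(labelings)
        if agree > threshold:
            parent[find(i)] = find(j)

    # Relabel the union-find roots as 0, 1, 2, ... in order of appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

# Three hypothetical methods labeling six cells; method B disagrees on cell 2.
labelings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [2, 2, 2, 0, 0, 0],
]
consensus = consensus_clusters(labelings)
```

On this toy input the majority vote overrides method B's placement of cell 2, giving two clusters of three cells each. Simple linking like this can chain unrelated clusters together on noisy data, which is one reason model-based ensembles are attractive.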

SAME-clustering: Single-cell Aggregated Clustering via Mixture Model Ensemble

Nucleic Acids Research ◽

10.1093/nar/gkz959 ◽

2019 ◽

Vol 48 (1) ◽

pp. 86-95 ◽

Cited By ~ 5

Author(s):

Ruth Huh ◽

Yuchen Yang ◽

Yuchao Jiang ◽

Yin Shen ◽

Yun Li

Keyword(s):

Single Cell ◽

Mixture Model ◽

Single Cells ◽

Cell Types ◽

Rna Seq ◽

Model Ensemble ◽

Clustering Ensemble ◽

Number Of Clusters ◽

Wide Range ◽

Level Cluster

Abstract Clustering is an essential step in the analysis of single cell RNA-seq (scRNA-seq) data to shed light on tissue complexity including the number of cell types and transcriptomic signatures of each cell type. Due to its importance, novel methods have been developed recently for this purpose. However, different approaches generate varying estimates regarding the number of clusters and the single-cell level cluster assignments. This type of unsupervised clustering is challenging and it is often times hard to gauge which method to use because none of the existing methods outperform others across all scenarios. We present SAME-clustering, a mixture model-based approach that takes clustering solutions from multiple methods and selects a maximally diverse subset to produce an improved ensemble solution. We tested SAME-clustering across 15 scRNA-seq datasets generated by different platforms, with number of clusters varying from 3 to 15, and number of single cells from 49 to 32 695. Results show that our SAME-clustering ensemble method yields enhanced clustering, in terms of both cluster assignments and number of clusters. The mixture model ensemble clustering is not limited to clustering scRNA-seq data and may be useful to a wide range of clustering applications.

A versatile system to record cell-cell interactions

eLife ◽

10.7554/elife.61080 ◽

2020 ◽

Vol 9 ◽

Author(s):

Rui Tang ◽

Christopher W Murray ◽

Ian L Linde ◽

Nicholas J Kramer ◽

Zhonglin Lyu ◽

...

Keyword(s):

Fluorescent Protein ◽

Cell Types ◽

Mouse Cell ◽

Cell Interactions ◽

Physical Contact ◽

Cell Labeling ◽

Wide Range ◽

Human And Mouse ◽

Cell Cell

Cell-cell interactions influence all aspects of development, homeostasis, and disease. In cancer, interactions between cancer cells and stromal cells play a major role in nearly every step of carcinogenesis. Thus, the ability to record cell-cell interactions would facilitate mechanistic delineation of the role of the cancer microenvironment. Here, we describe GFP-based Touching Nexus (G-baToN) which relies upon nanobody-directed fluorescent protein transfer to enable sensitive and specific labeling of cells after cell-cell interactions. G-baToN is a generalizable system that enables physical contact-based labeling between various human and mouse cell types, including endothelial cell-pericyte, neuron-astrocyte, and diverse cancer-stromal cell pairs. A suite of orthogonal baToN tools enables reciprocal cell-cell labeling, interaction-dependent cargo transfer, and the identification of higher order cell-cell interactions across a wide range of cell types. The ability to track physically interacting cells with these simple and sensitive systems will greatly accelerate our understanding of the outputs of cell-cell interactions in cancer as well as across many biological processes.

Massively parallel RNA device engineering in mammalian cells with RNA-Seq

Nature Communications ◽

10.1038/s41467-019-12334-y ◽

2019 ◽

Vol 10 (1) ◽

Cited By ~ 10

Author(s):

Joy S. Xiang ◽

Matias Kaplan ◽

Peter Dykstra ◽

Michaela Hinks ◽

Maureen McKeague ◽

...

Keyword(s):

Mammalian Cells ◽

Cell Types ◽

Massively Parallel ◽

Rna Seq ◽

High Activation ◽

Conserved Sequence ◽

Regulatory Processes ◽

Basal Expression ◽

Wide Range ◽

Gene Regulatory

Abstract Synthetic RNA-based genetic devices dynamically control a wide range of gene-regulatory processes across diverse cell types. However, the limited throughput of quantitative assays in mammalian cells has hindered fast iteration and interrogation of sequence space needed to identify new RNA devices. Here we report developing a quantitative, rapid and high-throughput mammalian cell-based RNA-Seq assay to efficiently engineer RNA devices. We identify new ribozyme-based RNA devices that respond to theophylline, hypoxanthine, cyclic-di-GMP, and folinic acid from libraries of ~22,700 sequences in total. The small molecule responsive devices exhibit low basal expression and high activation ratios, significantly expanding our toolset of highly functional ribozyme switches. The large datasets obtained further provide conserved sequence and structure motifs that may be used for rationally guided design. The RNA-Seq approach offers a generally applicable strategy for developing broad classes of RNA devices, thereby advancing the engineering of genetic devices for mammalian systems.

PRMdb: A Repository of Predicted RNA Modifications in Plants

Plant and Cell Physiology ◽

10.1093/pcp/pcaa042 ◽

2020 ◽

Vol 61 (6) ◽

pp. 1213-1222

Author(s):

Xuan Ma ◽

Fuyan Si ◽

Xiaonan Liu ◽

Weijiang Luan

Keyword(s):

Plant Species ◽

Posttranscriptional Regulation ◽

Regulation Of Gene Expression ◽

Rna Modification ◽

Rna Seq ◽

Rna Modifications ◽

High Throughput Analysis ◽

Functional Studies ◽

Web Resource ◽

Wide Range

Abstract Evidence is mounting that RNA modifications play essential roles in posttranscriptional regulation of gene expression. So far, over 150 RNA modifications catalyzed by distinct enzymes have been documented. In plants, genome-wide identification of RNA modifications is largely limited to the model species Arabidopsis thaliana, while lacking in diverse non-model plants. Here, we present PRMdb, a plant RNA modification database, based on the analysis of thousands of RNA-seq, degradome-seq and small RNA-seq datasets from a wide range of plant species using the well-documented tool HAMR (high-throughput analysis of modified ribonucleotide). PRMdb provides a user-friendly interface that enables easy browsing and searching of the tRNA and mRNA modification data. We show that PRMdb collects high-confidence RNA modifications including novel RNA modification sites that can be validated by genomic PCR and reverse transcription PCR. In summary, PRMdb provides a valuable web resource for deciphering the epitranscriptomes in diverse plant species and will facilitate functional studies of RNA modifications in plants. PRMdb is available via http://www.biosequencing.cn/PRMdb/.
