PRAM: a novel pooling approach for discovering intergenic transcripts from large-scale RNA sequencing experiments

2019 ◽  
Author(s):  
Peng Liu ◽  
Alexandra A. Soukup ◽  
Emery H. Bresnick ◽  
Colin N. Dewey ◽  
Sündüz Keleş

Abstract: Publicly available RNA-seq data is routinely used for retrospective analysis to elucidate new biology. Novel transcript discovery enabled by joint examination of large collections of RNA-seq datasets has emerged as one such analysis. Current methods for transcript discovery rely on a ‘2-Step’ approach where the first step encompasses building transcripts from individual datasets, followed by the second step that merges predicted transcripts across datasets. To increase the power of transcript discovery from large collections of RNA-seq datasets, we developed a novel ‘1-Step’ approach named Pooling RNA-seq and Assembling Models (PRAM) that builds transcript models from pooled RNA-seq datasets. We demonstrate in a computational benchmark that ‘1-Step’ outperforms ‘2-Step’ approaches in predicting overall transcript structures and individual splice junctions, while performing competitively in detecting exonic nucleotides. Applying PRAM to 30 human ENCODE RNA-seq datasets identified unannotated transcripts with epigenetic and RAMPAGE signatures similar to those of recently annotated transcripts. In a case study, we discovered and experimentally validated new transcripts through the application of PRAM to mouse hematopoietic RNA-seq datasets. Notably, we uncovered new transcripts that share a differential expression pattern with a neighboring gene, Pik3cg, implicated in human hematopoietic phenotypes, and we provided evidence for the conservation of this relationship in human. PRAM is implemented as an R/Bioconductor package and is available at https://bioconductor.org/packages/pram.
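The contrast between the ‘2-Step’ and ‘1-Step’ strategies can be illustrated with a toy sketch. This is not PRAM's algorithm or API: the coverage values, the `MIN_COVERAGE` threshold, and the function names are all hypothetical, and real transcript assembly works on spliced alignments rather than a single per-base cutoff. It only shows, under these assumptions, how a weakly expressed region can fall below a calling threshold in every individual dataset yet clear it once the data are pooled.

```python
from typing import Dict, List

# Hypothetical per-base read coverage over one intergenic region
# in three RNA-seq datasets (illustrative numbers only).
coverage_by_dataset: Dict[str, List[int]] = {
    "dataset_a": [1, 2, 2, 1, 0, 0],
    "dataset_b": [2, 1, 2, 2, 0, 0],
    "dataset_c": [1, 2, 1, 2, 0, 1],
}

# Assumed minimum coverage for calling a base "transcribed".
MIN_COVERAGE = 4


def called_bases(coverage: List[int], min_coverage: int) -> List[int]:
    """Return the positions whose coverage meets the threshold."""
    return [i for i, depth in enumerate(coverage) if depth >= min_coverage]


def two_step(datasets: Dict[str, List[int]], min_coverage: int) -> List[int]:
    """Call bases in each dataset separately, then merge (union) the calls."""
    merged = set()
    for coverage in datasets.values():
        merged.update(called_bases(coverage, min_coverage))
    return sorted(merged)


def one_step(datasets: Dict[str, List[int]], min_coverage: int) -> List[int]:
    """Pool coverage across datasets first, then call bases once."""
    pooled = [sum(depths) for depths in zip(*datasets.values())]
    return called_bases(pooled, min_coverage)


print("2-Step calls:", two_step(coverage_by_dataset, MIN_COVERAGE))
print("1-Step calls:", one_step(coverage_by_dataset, MIN_COVERAGE))
```

In this toy setting no single dataset reaches the threshold, so merging per-dataset calls yields nothing, while the pooled coverage supports calls at the first four positions. Applying the same threshold before and after pooling is a deliberate simplification for illustration.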

2017 ◽  
Vol 14 (4) ◽  
Author(s):  
Gökhan Karakülah

Abstract: Novel transcript discovery through RNA sequencing has substantially improved our understanding of the transcriptome dynamics of biological systems. Endogenous target mimicry (eTM) transcripts, a novel class of regulatory molecules, bind to their target microRNAs (miRNAs) by base pairing and block their biological activity. The objective of this study was to provide a computational analysis framework for the prediction of putative eTM sequences in plants and, as an example, to discover previously unannotated eTMs in the Prunus persica (peach) transcriptome. To this end, two public peach transcriptome libraries downloaded from the Sequence Read Archive (SRA) and a previously published set of long non-coding RNAs (lncRNAs) were investigated with a multi-step analysis pipeline, and 44 putative eTMs were found. Additionally, an eTM-miRNA-mRNA regulatory network module associated with peach fruit organ development was built by integrating the miRNA target information with the predicted eTM-miRNA interactions. My findings suggest that miR156, one of the most widely expressed miRNA families among diverse plant species, might be sponged by seven putative eTMs. The study also indicates that eTMs potentially play roles in regulating developmental processes in peach fruit by targeting specific miRNAs. In conclusion, by following the step-by-step instructions provided in this study, novel eTMs can be identified and annotated effectively in public plant transcriptome libraries.


2016 ◽  
Author(s):  
Leonardo Collado-Torres ◽  
Abhinav Nellore ◽  
Kai Kammers ◽  
Shannon E. Ellis ◽  
Margaret A. Taub ◽  
...  

Abstract: recount is a resource of processed and summarized expression data spanning nearly 60,000 human RNA-seq samples from the Sequence Read Archive (SRA). The associated recount Bioconductor package provides a convenient API for querying, downloading, and analyzing the data. Each processed study consists of meta/phenotype data, the expression levels of genes and their underlying exons and splice junctions, and corresponding genomic annotation. We also provide data summarization types for quantifying novel transcribed sequence including base-resolution coverage and potentially unannotated splice junctions. We present workflows illustrating how to use recount to perform differential expression analysis including meta-analysis, annotation-free base-level analysis, and replication of smaller studies using data from larger studies. recount provides a valuable and user-friendly resource of processed RNA-seq datasets to draw additional biological insights from existing public data. The resource is available at https://jhubiostatistics.shinyapps.io/recount/.


1996 ◽  
Vol 5 (1) ◽  
pp. 23-32 ◽  
Author(s):  
Chris Halpin ◽  
Barbara Herrmann ◽  
Margaret Whearty

The family described in this article provides an unusual opportunity to relate findings from genetic, histological, electrophysiological, psychophysical, and rehabilitative investigation. Although the total number evaluated is large (49), the known, living affected population is smaller (14), and these are spread from age 20 to age 59. As a result, the findings described above are those of a large-scale case study. Clearly, more data will be available through longitudinal study of the individuals documented in the course of this investigation but, given the slow nature of the progression in this disease, such studies will be undertaken after an interval of several years. The general picture presented to the audiologist who must rehabilitate these cases is that of a progressive cochlear degeneration that affects only thresholds at first, and then rapidly diminishes speech intelligibility. The expected result is that, after normal language development, the patient may accept hearing aids well, encouraged by the support of the family. Performance and satisfaction with the hearing aids are good until the onset of the speech intelligibility loss, at which time the patient will encounter serious difficulties and may reject hearing aids as unhelpful. As the histological and electrophysiological results indicate, however, the eighth nerve remains viable, especially in the younger affected members, and success with cochlear implantation may be expected. Audiologic counseling efforts are aided by the presence of role models and support from the other affected members of the family. Speech-language pathology services were not considered important by the members of this family since their speech production developed normally and has remained very good. Self-correction of speech was supported by hearing aids and cochlear implants (Case 5’s speech production was documented in Perkell, Lane, Svirsky, & Webster, 1992).
These patients received genetic counseling and, due to the high penetrance of the disease, exhibited serious concerns regarding future generations and the hope of a cure.


2008 ◽  
Author(s):  
D. L. McMullin ◽  
A. R. Jacobsen ◽  
D. C. Carvan ◽  
R. J. Gardner ◽  
J. A. Goegan ◽  
...  

Author(s):  
Lori Stahlbrand

This paper traces the partnership between the University of Toronto and the non-profit Local Food Plus (LFP) to bring local sustainable food to its St. George campus. At its launch, the partnership represented the largest purchase of local sustainable food at a Canadian university, as well as LFP’s first foray into supporting institutional procurement of local sustainable food. LFP was founded in 2005 with a vision to foster sustainable local food economies. To this end, LFP developed a certification system and a marketing program that matched certified farmers and processors to buyers. LFP emphasized large-scale purchases by public institutions. Using information from in-depth semi-structured key informant interviews, this paper argues that the LFP project was a disruptive innovation that posed a challenge to many dimensions of the established food system. The LFP case study reveals structural obstacles to operationalizing a local and sustainable food system. These include a lack of mid-sized infrastructure serving local farmers, the domination of a rebate system of purchasing controlled by an oligopolistic foodservice sector, and embedded government support of export agriculture. This case study is an example of praxis, as the author was the founder of LFP, as well as an academic researcher and analyst.


2020 ◽  
Vol 86 (7) ◽  
pp. 12-19
Author(s):  
I. V. Plyushchenko ◽  
D. G. Shakhmatov ◽  
I. A. Rodin

The rapid development of statistical data processing, computing capabilities, chromatography-mass spectrometry, and omics technologies (technologies based on the achievements of genomics, transcriptomics, proteomics, and metabolomics) in recent decades has not led to the formation of a unified protocol for untargeted profiling. Systematic errors reduce the reproducibility and reliability of the results obtained, and at the same time hinder the consolidation and analysis of data gathered in large-scale multi-day experiments. We propose an algorithm for conducting omics profiling to identify potential markers in samples of complex composition and present a case study of urine samples obtained from different clinical groups of patients. Profiling was carried out by liquid chromatography-mass spectrometry. The markers were selected using multivariate analysis methods, including machine learning and feature selection. The approach was tested on an independent dataset by clustering and projection onto principal components.

