Comprehensive analysis of RNA-sequencing to find the source of 1 trillion reads across diverse adult human tissues

2016 ◽  
Author(s):  
Serghei Mangul ◽  
Harry Taegyun Yang ◽  
Nicolas Strauli ◽  
Franziska Gruhl ◽  
Hagit T. Porath ◽  
...  

Abstract High-throughput RNA sequencing technologies have provided invaluable research opportunities across distinct scientific domains by producing quantitative readouts of the transcriptional activity of both entire cellular populations and single cells. The majority of RNA-Seq analyses begin by mapping each experimentally produced sequence (i.e., read) to a set of annotated reference sequences for the organism of interest. For both biological and technical reasons, a significant fraction of reads remains unmapped. In this work, we develop Read Origin Protocol (ROP) to discover the source of all reads originating from complex RNA molecules, recombinant T and B cell receptors, and microbial communities. We applied ROP to 8,641 samples across 630 individuals from 54 tissues. A fraction of RNA-Seq data (n=86) was obtained in-house; the remaining data was obtained from the Genotype-Tissue Expression (GTEx v6) project. To generalize the reported number of accounted reads, we also performed ROP analysis on thousands of different, randomly selected, and publicly available RNA-Seq samples in the Sequence Read Archive (SRA). Our approach can account for 99.9% of 1 trillion reads of various read lengths across the merged dataset (n=10641). Using in-house RNA-Seq data, we show that immune profiles of asthmatic individuals are significantly different from the profiles of control individuals, with decreased average per sample T and B cell receptor diversity. We also show that immune diversity is inversely correlated with microbial load. Our results demonstrate the potential of ROP to exploit unmapped reads in order to better understand the functional mechanisms underlying connections between the immune system, microbiome, human gene expression, and disease etiology. ROP is freely available at https://github.com/smangul1/rop and currently supports human and mouse RNA-Seq reads.

2019 ◽  
Vol 2 (4) ◽  
pp. e201900371 ◽  
Author(s):  
Shaked Afik ◽  
Gabriel Raulet ◽  
Nir Yosef

RNA sequencing of single B cells provides simultaneous measurements of the cell state and its antigen specificity as determined by the B-cell receptor (BCR). However, to uncover the latter, further reconstruction of the BCR sequence is needed. We present BRAPeS (“BCR Reconstruction Algorithm for Paired-end Single cells”), an algorithm for reconstructing BCRs from short-read paired-end single-cell RNA sequencing. BRAPeS is accurate and achieves a high success rate even at very short (25 bp) read length, which can decrease the cost and increase the number of cells that can be analyzed compared with long reads. BRAPeS is publicly available at the following link: https://github.com/YosefLab/BRAPeS.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Shadi Darvish Shafighi ◽  
Szymon M. Kiełbasa ◽  
Julieta Sepúlveda-Yáñez ◽  
Ramin Monajemi ◽  
Davy Cats ◽  
...  

Abstract Background Drawing genotype-to-phenotype maps in tumors is of paramount importance for understanding tumor heterogeneity. Assignment of single cells to their tumor clones of origin can be approached by matching the genotypes of the clones to the mutations found in RNA sequencing of the cells. The confidence of the cell-to-clone mapping can be increased by accounting for additional measurements. Follicular lymphoma, a malignancy of mature B cells that continuously acquire mutations in parallel in the exome and in B cell receptor loci, presents a unique opportunity to join exome-derived mutations with B cell receptor sequences as independent sources of evidence for clonal evolution. Methods Here, we propose CACTUS, a probabilistic model that leverages the information from an independent genomic clustering of cells and exploits the scarce single cell RNA sequencing data to map single cells to given imperfect genotypes of tumor clones. Results We apply CACTUS to two follicular lymphoma patient samples, integrating three measurements: whole exome, single-cell RNA, and B cell receptor sequencing. CACTUS outperforms a predecessor model by confidently assigning cells and B cell receptor-based clusters to the tumor clones. Conclusions The integration of independent measurements increases model certainty and is the key to improving model performance in the challenging task of charting the genotype-to-phenotype maps in tumors. CACTUS opens the avenue to study the functional implications of tumor heterogeneity, and origins of resistance to targeted therapies. CACTUS is written in R, and the source code, along with all supporting files, is available on GitHub (https://github.com/LUMC/CACTUS).


2018 ◽  
Author(s):  
Shaked Afik ◽  
Gabriel Raulet ◽  
Nir Yosef

ABSTRACT RNA-sequencing of single B cells provides simultaneous measurements of the cell state and its binding specificity. However, in order to uncover the latter, further reconstruction of the B cell receptor (BCR) sequence is needed. We present BRAPeS, an algorithm for reconstructing BCRs from short-read paired-end single cell RNA-sequencing. BRAPeS is accurate and achieves a high success rate even at very short (25 bp) read length, which can decrease the cost and increase the number of cells that can be analyzed compared to long reads. BRAPeS is publicly available at the following link: https://github.com/YosefLab/BRAPeS.


2020 ◽  
Author(s):  
Shadi Darvish Shafighi ◽  
Szymon M Kiełbasa ◽  
Julieta Sepúlveda-Yáñez ◽  
Ramin Monajemi ◽  
Davy Cats ◽  
...  

ABSTRACT Background Drawing genotype-to-phenotype maps in tumors is of paramount importance for understanding tumor heterogeneity. Assignment of single cells to their tumor clones of origin can be approached by matching the genotypes of the clones to the mutations found in RNA sequencing of the cells. The confidence of the cell-to-clone mapping can be increased by accounting for additional measurements. Follicular lymphoma, a malignancy of mature B cells that continuously acquire mutations in parallel in the exome and in B-cell receptor loci, presents a unique opportunity to align exome-derived mutations with B-cell receptor clonotypes as an independent measure for clonal evolution. Results Here, we propose CACTUS, a probabilistic model that leverages the information from an independent genomic clustering of cells and exploits the scarce single cell RNA sequencing data to map single cells to given imperfect genotypes of tumor clones. We apply CACTUS to two follicular lymphoma patient samples, integrating three measurements: whole exome sequencing, single cell RNA sequencing, and B-cell receptor sequencing. CACTUS outperforms a predecessor model by confidently assigning cells and B-cell receptor clonotypes to the tumor clones. Conclusions The integration of independent measurements increases model certainty and is the key to improving model performance in the challenging task of charting the genotype-to-phenotype maps in tumors. CACTUS opens the avenue to study the functional implications of tumor heterogeneity, and origins of resistance to targeted therapies.


2021 ◽  
Author(s):  
Ram Ayyala ◽  
Junghyun Jung ◽  
Sergey Knyazev ◽  
Serghei Mangul

Although precise identification of the human leukocyte antigen (HLA) allele is crucial for various clinical and research applications, HLA typing remains challenging due to the high polymorphism of the HLA loci. With Next-Generation Sequencing (NGS) data becoming widely accessible, many computational tools have been developed to predict HLA types from RNA sequencing (RNA-seq) data. However, there is a lack of comprehensive and systematic benchmarking of RNA-seq HLA callers using large-scale and realistic gold standards. To address this limitation, we rigorously compared the performance of 12 HLA callers over 50,000 HLA tasks, including searching 30 pairwise combinations of HLA callers and reference in over 1,500 samples. In each case, we measured accuracy as the percentage of correctly predicted alleles (at two- and four-digit resolution) based on six gold standard datasets spanning 650 RNA-seq samples. To determine the influence of read length over the HLA region on the prediction quality of each tool, we considered read lengths in the range 37-126 bp, which were available in our gold standard datasets. Moreover, using the Genotype-Tissue Expression (GTEx) v8 data, we calculated the concordance of the same HLA type across different tissues from the same individual to evaluate how well the HLA callers maintain consistent results across tissues. This study offers crucial information for researchers regarding appropriate choices of methods for an HLA analysis.
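
The accuracy metric described in this abstract (percentage of correctly predicted alleles at two- and four-digit resolution) can be illustrated with a minimal sketch. This is not the authors' benchmarking code: the function names `truncate` and `accuracy`, the dictionary input layout, and the choice to compare alleles as unordered multisets are all assumptions made for illustration. The only nomenclature fact relied on is that a two-digit type corresponds to the first colon-separated field of an allele name and a four-digit type to the first two fields.

```python
from collections import Counter


def truncate(allele: str, digits: int) -> str:
    """Trim an allele name such as 'A*02:01:01' to two- or four-digit resolution."""
    gene, _, fields = allele.partition("*")
    keep = digits // 2  # 2 digits -> 1 field, 4 digits -> 2 fields
    return gene + "*" + ":".join(fields.split(":")[:keep])


def accuracy(predicted: dict, truth: dict, digits: int = 4) -> float:
    """Percentage of gold-standard alleles matched by the prediction.

    predicted / truth: hypothetical layout, sample id -> list of allele names.
    Alleles are compared as multisets, so order and phase are ignored.
    """
    correct = total = 0
    for sample, gold in truth.items():
        gold_counts = Counter(truncate(a, digits) for a in gold)
        pred_counts = Counter(truncate(a, digits) for a in predicted.get(sample, []))
        # Multiset intersection counts each gold allele at most once.
        correct += sum((gold_counts & pred_counts).values())
        total += sum(gold_counts.values())
    return 100.0 * correct / total if total else 0.0


# Illustrative (made-up) calls for one sample at the HLA-A locus.
truth = {"s1": ["A*02:01", "A*24:02"]}
predicted = {"s1": ["A*02:05", "A*24:02:01"]}
four_digit = accuracy(predicted, truth, digits=4)
two_digit = accuracy(predicted, truth, digits=2)
```

In this toy case one of the two alleles agrees at four-digit resolution while both agree at two-digit resolution, which shows why the two resolutions are reported separately.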


Diagnostics ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 964
Author(s):  
Sarka Benesova ◽  
Mikael Kubista ◽  
Lukas Valihrach

MicroRNAs (miRNAs) are a class of small RNA molecules that have an important regulatory role in multiple physiological and pathological processes. Their disease-specific profiles and presence in biofluids are properties that enable miRNAs to be employed as non-invasive biomarkers. In the past decades, several methods have been developed for miRNA analysis, including small RNA sequencing (RNA-seq). Small RNA-seq enables genome-wide profiling and analysis of known, as well as novel, miRNA variants. Moreover, its high sensitivity allows for profiling of low input samples such as liquid biopsies, which have now found applications in diagnostics and prognostics. Still, due to technical bias and the limited ability to capture the true miRNA representation, its potential remains unfulfilled. The introduction of many new small RNA-seq approaches that tried to minimize this bias has led to the many small RNA-seq protocols seen today. Here, we review all current approaches to cDNA library construction used during the small RNA-seq workflow, with particular focus on their implementation in commercially available protocols. We provide an overview of each protocol and discuss their applicability. We also review recent benchmarking studies comparing each protocol’s performance and summarize the major conclusions that can be gathered from their usage. The result documents variable performance of the protocols and highlights their different applications in miRNA research. Taken together, our review provides a comprehensive overview of all the current small RNA-seq approaches, summarizes their strengths and weaknesses, and provides guidelines for their applications in miRNA research.


2017 ◽  
Vol 64 (4) ◽  
pp. 476-481 ◽  
Author(s):  
Jerome Bouquet ◽  
Jennifer L. Gardy ◽  
Scott Brown ◽  
Jacob Pfeil ◽  
Ruth R. Miller ◽  
...  

2018 ◽  
Vol 15 (8) ◽  
pp. 563-565 ◽  
Author(s):  
Ida Lindeman ◽  
Guy Emerton ◽  
Lira Mamanova ◽  
Omri Snir ◽  
Krzysztof Polanski ◽  
...  

2020 ◽  
Vol 21 (10) ◽  
pp. 3711
Author(s):  
Melina J. Sedano ◽  
Alana L. Harrison ◽  
Mina Zilaie ◽  
Chandrima Das ◽  
Ramesh Choudhari ◽  
...  

Genome-wide RNA sequencing has shown that only a small fraction of the human genome is transcribed into protein-coding mRNAs. While once thought to be “junk” DNA, recent findings indicate that the rest of the genome encodes many types of non-coding RNA molecules with a myriad of functions still being determined. Among the non-coding RNAs, long non-coding RNAs (lncRNA) and enhancer RNAs (eRNA) are found to be most copious. While their exact biological functions and mechanisms of action are currently unknown, technologies such as next-generation RNA sequencing (RNA-seq) and global nuclear run-on sequencing (GRO-seq) have begun deciphering their expression patterns and biological significance. In addition to their identification, it has been shown that the expression of long non-coding RNAs and enhancer RNAs can vary due to spatial, temporal, developmental, or hormonal variations. In this review, we explore newly reported information on estrogen-regulated eRNAs and lncRNAs and their associated biological functions to help outline their markedly prominent roles in estrogen-dependent signaling.


2016 ◽  
Vol 32 (24) ◽  
pp. 3729-3734 ◽  
Author(s):  
Lisle E. Mose ◽  
Sara R. Selitsky ◽  
Lisa M. Bixby ◽  
David L. Marron ◽  
Michael D. Iglesia ◽  
...  
