Isoform-Level Interpretation of High-Throughput Proteomics Data Enabled by Deep Integration with RNA-seq

Becky C. Carlyle; Robert R. Kitchen; Jing Zhang; Rashaun S. Wilson; Tukiet T. Lam; Joel S. Rozowsky; Kenneth R. Williams; Nenad Sestan; Mark B. Gerstein; Angus C. Nairn

doi:10.1021/acs.jproteome.8b00310

Advancing clinical genomics and precision medicine with GVViZ: FAIR bioinformatics platform for variable gene-disease annotation, visualization, and expression analysis

Human Genomics, 2021, Vol 15 (1)

doi:10.1186/s40246-021-00336-1

Author(s): Zeeshan Ahmed; Eduard Gibert Renart; Saman Zeeshan; XinQi Dong

Keyword(s): Data Analysis; Patient Care; Expression Analysis; High Throughput; Gene Annotation; Next Generation Sequencing Data; RNA-seq; Sequencing Data; Complex Disorders; Transcriptomics Data

Background: Genetic disposition is considered critical for identifying subjects at high risk for disease development. Investigating disease-causing and high and low expressed genes can support finding the root causes of uncertainties in patient care. However, independent and timely high-throughput next-generation sequencing data analysis is still a challenge for non-computational biologists and geneticists.

Results: In this manuscript, we present a findable, accessible, interactive, and reusable (FAIR) bioinformatics platform, i.e., GVViZ (visualizing genes with disease-causing variants). GVViZ is a user-friendly, cross-platform, and database application for RNA-seq-driven variable and complex gene-disease data annotation and expression analysis with a dynamic heat map visualization. GVViZ has the potential to find patterns across millions of features and extract actionable information, which can support the early detection of complex disorders and the development of new therapies for personalized patient care. The execution of GVViZ is based on a set of simple instructions that users without a computational background can follow to design and perform customized data analysis. It can assimilate patients’ transcriptomics data with the public, proprietary, and our in-house developed gene-disease databases to query, easily explore, and access information on gene annotation and classified disease phenotypes with greater visibility and customization. To test its performance and understand the clinical and scientific impact of GVViZ, we present GVViZ analysis for different chronic diseases and conditions, including Alzheimer’s disease, arthritis, asthma, diabetes mellitus, heart failure, hypertension, obesity, osteoporosis, and multiple cancer disorders. The results are visualized using GVViZ and can be exported as image (PNG/TIFF) and text (CSV) files that include gene names, Ensembl (ENSG) IDs, quantified abundances, expressed transcript lengths, and annotated oncology and non-oncology diseases.

Conclusions: We emphasize that automated and interactive visualization should be an indispensable component of modern RNA-seq analysis, which is currently not the case. However, experts in clinics and researchers in life sciences can use GVViZ to visualize and interpret the transcriptomics data, making it a powerful tool to study the dynamics of gene expression and regulation. Furthermore, with successful deployment in clinical settings, GVViZ has the potential to enable high-throughput correlations between patient diagnoses based on clinical and transcriptomics data.

Systematic comparison of high-throughput single-cell RNA-seq methods for immune cell profiling

BMC Genomics, 2021, Vol 22 (1)

doi:10.1186/s12864-020-07358-4

Author(s): Tracy M. Yamawaki; Daniel R. Lu; Daniel C. Ellwanger; Dev Bhatt; Paolo Manzanillo; ...

Keyword(s): Single Cell; High Throughput; Immune Cell; Cell Types; Data Interpretation; Detection Sensitivity; RNA-seq; Cell Recovery

Background: Elucidation of immune populations with single-cell RNA-seq has greatly benefited the field of immunology by deepening the characterization of immune heterogeneity and leading to the discovery of new subtypes. However, single-cell methods inherently suffer from limitations in the recovery of complete transcriptomes due to the prevalence of cellular and transcriptional dropout events. This issue is often compounded by limited sample availability and limited prior knowledge of heterogeneity, which can confound data interpretation.

Results: Here, we systematically benchmarked seven high-throughput single-cell RNA-seq methods. We prepared 21 libraries under identical conditions of a defined mixture of two human and two murine lymphocyte cell lines, simulating heterogeneity across immune-cell types and cell sizes. We evaluated methods by their cell recovery rate, library efficiency, sensitivity, and ability to recover expression signatures for each cell type. We observed higher mRNA detection sensitivity with the 10x Genomics 5′ v1 and 3′ v3 methods. We demonstrate that these methods have fewer dropout events, which facilitates the identification of differentially-expressed genes and improves the concordance of single-cell profiles to immune bulk RNA-seq signatures.

Conclusion: Overall, our characterization of immune cell mixtures provides useful metrics, which can guide selection of a high-throughput single-cell RNA-seq method for profiling more complex immune-cell heterogeneity usually found in vivo.

High-Throughput Single-Cell RNA-Seq of Large Cells and Nuclei

Genetic Engineering & Biotechnology News, 2017, Vol 37 (17), pp. 12-13

doi:10.1089/gen.37.17.06

Author(s): Jennifer Chew; Adam Bemis; Ronald Lebofsky; Anna Quinlan; Kelly Kaihara

Keyword(s): Single Cell; High Throughput; RNA-seq

Strand-specific libraries for high throughput RNA sequencing (RNA-Seq) prepared without poly(A) selection

Silence, 2012, Vol 3 (1), pp. 9

doi:10.1186/1758-907x-3-9

Cited By ~ 89

Author(s): Zhao Zhang; William E Theurkauf; Zhiping Weng; Phillip D Zamore

Keyword(s): RNA Sequencing; High Throughput; RNA-seq

Transcriptomic analysis of Camellia oleifera in response to drought stress using high throughput RNA-seq

Russian Journal of Plant Physiology, 2017, Vol 64 (5), pp. 728-737

doi:10.1134/s1021443717050168

Cited By ~ 1

Author(s): H. Yang; H. Y. Zhou; X. N. Yang; J. J. Zhan; H. Zhou; ...

Keyword(s): Drought Stress; High Throughput; Transcriptomic Analysis; Camellia Oleifera; RNA-seq

iSEE: Interactive SummarizedExperiment Explorer

F1000Research, 2018, Vol 7, pp. 741

doi:10.12688/f1000research.14966.1

Cited By ~ 26

Author(s): Kevin Rue-Albrecht; Federico Marini; Charlotte Soneson; Aaron T.L. Lun

Keyword(s): High Throughput; Software Package; Biological Data; Data Exploration; Data Sets; Proteomics Data; Code Tracking; Dynamic Linking; Interactive Visualisation; Visual Interface

Data exploration is critical to the comprehension of large biological data sets generated by high-throughput assays such as sequencing. However, most existing tools for interactive visualisation are limited to specific assays or analyses. Here, we present the iSEE (Interactive SummarizedExperiment Explorer) software package, which provides a general visual interface for exploring data in a SummarizedExperiment object. iSEE is directly compatible with many existing R/Bioconductor packages for analysing high-throughput biological data, and provides useful features such as simultaneous examination of (meta)data and analysis results, dynamic linking between plots and code tracking for reproducibility. We demonstrate the utility and flexibility of iSEE by applying it to explore a range of real transcriptomics and proteomics data sets.

Learning from heterogeneous data sources: an application in spatial proteomics

2015

doi:10.1101/022152

Cited By ~ 1

Author(s): Lisa M. Breckels; Sean Holden; David Wojnar; Claire M. Mulvey; Andy Christoforou; ...

Keyword(s): Mass Spectrometry; Support Vector Machine; Transfer Learning; High Throughput; Cell Biology; Heterogeneous Data; Data Sources; Support Vector; Proteomics Data; Heterogeneous Data Sources

Sub-cellular localisation of proteins is an essential post-translational regulatory mechanism that can be assayed using high-throughput mass spectrometry (MS). These MS-based spatial proteomics experiments enable us to pinpoint the sub-cellular distribution of thousands of proteins in a specific system under controlled conditions. Recent advances in high-throughput MS methods have yielded a plethora of experimental spatial proteomics data for the cell biology community. Yet, there are many third-party data sources, such as immunofluorescence microscopy or protein annotations and sequences, which represent a rich and vast source of complementary information. We present a unique transfer learning classification framework that utilises a nearest-neighbour or support vector machine system, to integrate heterogeneous data sources to considerably improve on the quantity and quality of sub-cellular protein assignment. We demonstrate the utility of our algorithms through evaluation of five experimental datasets, from four different species in conjunction with four different auxiliary data sources to classify proteins to tens of sub-cellular compartments with high generalisation accuracy. We further apply the method to an experiment on pluripotent mouse embryonic stem cells to classify a set of previously unknown proteins, and validate our findings against a recent high resolution map of the mouse stem cell proteome. The methodology is distributed as part of the open-source Bioconductor pRoloc suite for spatial proteomics data analysis.

Abbreviations: LOPIT, Localisation of Organelle Proteins by Isotope Tagging; PCP, Protein Correlation Profiling; ML, Machine learning; TL, Transfer learning; SVM, Support vector machine; PCA, Principal component analysis; GO, Gene Ontology; CC, Cellular compartment; iTRAQ, Isobaric tags for relative and absolute quantitation; TMT, Tandem mass tags; MS, Mass spectrometry
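To make the idea of combining a primary data source with an auxiliary one concrete, the following is a minimal sketch of a nearest-neighbour classifier that mixes class votes from two feature spaces. It is not the pRoloc implementation and is not taken from the paper: the function names (`knn_votes`, `classify_with_auxiliary`), the single global mixing weight `theta`, the Euclidean distance, and the toy data are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def knn_votes(query, points, labels, k):
    """Return, per class, the fraction of the k nearest labelled points."""
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    nearest = [labels[i] for i in order[:k]]
    tally = Counter(nearest)
    return {cls: n / len(nearest) for cls, n in tally.items()}


def classify_with_auxiliary(primary_query, aux_query,
                            primary_train, aux_train, labels,
                            k=3, theta=0.5):
    """Combine k-NN votes from primary and auxiliary feature spaces.

    theta (hypothetical parameter) is the weight given to the primary
    data; 1 - theta goes to the auxiliary data.
    """
    primary = knn_votes(primary_query, primary_train, labels, k)
    auxiliary = knn_votes(aux_query, aux_train, labels, k)
    classes = set(primary) | set(auxiliary)
    scores = {
        c: theta * primary.get(c, 0.0) + (1 - theta) * auxiliary.get(c, 0.0)
        for c in classes
    }
    # Sort first so that ties are broken alphabetically and deterministically.
    return max(sorted(scores), key=scores.get)


# Toy training data: four labelled proteins seen in both data sources.
labels = ["mito", "mito", "er", "er"]
primary_train = [(0.9, 0.1), (0.7, 0.3), (0.1, 0.9), (0.3, 0.7)]
aux_train = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]

# An unlabelled protein whose primary profile leans "er"
# while its auxiliary profile leans "mito".
trust_auxiliary = classify_with_auxiliary(
    (0.45, 0.55), (0.8, 0.2), primary_train, aux_train, labels, theta=0.3)
trust_primary = classify_with_auxiliary(
    (0.45, 0.55), (0.8, 0.2), primary_train, aux_train, labels, theta=0.8)
```

In this toy case the predicted compartment changes with the weight, which is why the weighting between sources would normally be tuned on labelled data rather than fixed by hand.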

Study of the root transcriptome of bread wheat using high-throughput RNA sequencing (RNA-SEQ)

Bioinformatics of Genome Regulation and Structure/ Systems Biology, 2020

doi:10.18699/bgrs/sb-2020-226

Keyword(s): RNA Sequencing; Bread Wheat; High Throughput; RNA-seq; Root Transcriptome

HTSeq - A Python framework to work with high-throughput sequencing data

2014

doi:10.1101/002824

Cited By ~ 242

Author(s): Simon Anders; Paul Theodor Pyl; Wolfgang Huber

Keyword(s): High Throughput; High Throughput Sequencing; Rapid Development; Differential Expression Analysis; RNA-seq; Sequencing Data; Standard Work; Data Formats; High Throughput Sequencing Data; Python Package

Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard work flows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data such as genomic coordinates, sequences, sequencing reads, alignments, gene model information, variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. Availability: HTSeq is released as open-source software under the GNU General Public Licence and available from http://www-huber.embl.de/HTSeq or from the Python Package Index, https://pypi.python.org/pypi/HTSeq
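The read-counting step described for htseq-count can be illustrated with a small, library-free sketch. It deliberately does not use the HTSeq API; it only shows the general idea of assigning each aligned read to the gene it overlaps. The function name `count_reads_per_gene`, the tuple-based interval representation (half-open coordinates), and the example coordinates are assumptions for illustration, and strand, spliced alignments, and mapping quality are ignored.

```python
from collections import Counter


def count_reads_per_gene(genes, reads):
    """Count reads overlapping each gene.

    genes: dict mapping gene_id -> (chrom, start, end), half-open intervals.
    reads: iterable of (chrom, start, end) aligned read intervals.
    Reads overlapping no gene or more than one gene are tallied under
    the special keys "__no_feature" and "__ambiguous".
    """
    counts = Counter({gene_id: 0 for gene_id in genes})
    for chrom, start, end in reads:
        # Linear scan over genes; a real tool would use an interval index.
        hits = {
            gene_id
            for gene_id, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start < g_end and g_start < end
        }
        if not hits:
            counts["__no_feature"] += 1
        elif len(hits) > 1:
            counts["__ambiguous"] += 1
        else:
            counts[hits.pop()] += 1
    return counts


# Two overlapping genes on chr1 and four hypothetical reads.
genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 180, 300)}
reads = [
    ("chr1", 110, 160),  # inside geneA only
    ("chr1", 190, 240),  # overlaps both genes
    ("chr1", 250, 290),  # inside geneB only
    ("chr2", 10, 60),    # no annotated gene
]
counts = count_reads_per_gene(genes, reads)
```

The resulting per-gene table is the kind of input a differential expression analysis expects; keeping ambiguous and unassigned reads in separate tallies avoids inflating the count of either overlapping gene.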

Lasy-Seq: a high-throughput library preparation method for RNA-Seq and its application in the analysis of plant responses to fluctuating temperatures

Scientific Reports, 2019, Vol 9 (1)

doi:10.1038/s41598-019-43600-0

Cited By ~ 9

Author(s): Mari Kamitani; Makoto Kashima; Ayumi Tezuka; Atsushi J. Nagano

Keyword(s): High Throughput; Preparation Method; Plant Responses; Library Preparation; RNA-seq; Fluctuating Temperatures; Library Preparation Method
