Multi-omics Visualization Platform: An extensible Galaxy plug-in for multi-omics data visualization and exploration

GigaScience ◽

10.1093/gigascience/giaa025 ◽

2020 ◽

Vol 9 (4) ◽

Cited By ~ 2

Author(s):

Thomas McGowan ◽

James E Johnson ◽

Praveen Kumar ◽

Ray Sajulga ◽

Subina Mehta ◽

...

Keyword(s):

Data Analysis ◽

Data Visualization ◽

Interactive Visualization ◽

Peptide Identification ◽

Proteomics Data ◽

The Galaxy ◽

Level Information ◽

Galaxy Environment ◽

Data Quality Metrics ◽

Integrated Genomics

Background: Proteogenomics integrates genomics, transcriptomics, and mass spectrometry (MS)-based proteomics data to identify novel protein sequences arising from gene and transcript sequence variants. Proteogenomic data analysis requires integration of disparate ’omic software tools, as well as customized tools to view and interpret results. The flexible Galaxy platform has proven valuable for proteogenomic data analysis. Here, we describe a novel Multi-omics Visualization Platform (MVP) for organizing, visualizing, and exploring proteogenomic results, adding a critically needed tool for data exploration and interpretation. Findings: MVP is built as an HTML Galaxy plug-in, primarily based on JavaScript. Via the Galaxy API, MVP uses SQLite databases as input: a custom data type (mzSQLite) containing MS-based peptide identification information, a variant annotation table, and a coding sequence table. Users can interactively filter identified peptides based on sequence and data quality metrics, view annotated peptide MS data, and visualize protein-level information, along with genomic coordinates. Peptides that pass the user-defined thresholds can be sent back to Galaxy via the API for further analysis; processed data and visualizations can also be saved and shared. MVP leverages the Integrated Genomics Viewer JavaScript framework, enabling interactive visualization of peptides and corresponding transcript and genomic coding information within the MVP interface. Conclusions: MVP provides a powerful, extensible platform for automated, interactive visualization of proteogenomic results within the Galaxy environment, adding a unique and critically needed tool for empowering exploration and interpretation of results. The platform is extensible, providing a basis for further development of new functionalities for proteogenomic data visualization.
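
The filtering step described in the abstract (selecting identified peptides by sequence and by a data quality threshold from a SQLite database) can be illustrated with a minimal Python sketch. This is not MVP's code, which is JavaScript-based, and it does not use the real mzSQLite schema: the table name `peptide_hits`, its columns `sequence` and `score`, and the helper names `build_demo_db` and `filter_peptides` are all hypothetical and chosen only for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical peptide table.

    The schema is an assumption for illustration; it is NOT the mzSQLite schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE peptide_hits (sequence TEXT, score REAL)")
    conn.executemany(
        "INSERT INTO peptide_hits VALUES (?, ?)",
        [("ELVISLIVESK", 42.5), ("PEPTIDEK", 12.0), ("SAMPLER", 30.1)],
    )
    conn.commit()
    return conn


def filter_peptides(conn, min_score, motif=""):
    """Return (sequence, score) rows at or above min_score that contain motif.

    Rows are ordered by descending score. SQLite's LIKE is case-insensitive
    for ASCII by default, and '%' or '_' inside motif act as wildcards.
    """
    query = (
        "SELECT sequence, score FROM peptide_hits "
        "WHERE score >= ? AND sequence LIKE ? "
        "ORDER BY score DESC"
    )
    return conn.execute(query, (min_score, f"%{motif}%")).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    # Peptides passing a user-defined score threshold
    for sequence, score in filter_peptides(conn, min_score=20.0):
        print(sequence, score)
```

Parameterized queries keep the user-supplied threshold and sequence fragment out of the SQL string itself, which matters when filter values come from an interactive interface.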

79 Statistical Graphics and Interactive Visualization in Animal Science

Journal of Animal Science ◽

10.1093/jas/skab235.079 ◽

2021 ◽

Vol 99 (Supplement_3) ◽

pp. 45-45

Author(s):

Gota Morota

Keyword(s):

Data Analysis ◽

Data Visualization ◽

Interactive Visualization ◽

Agricultural Science ◽

Fundamental Aspect ◽

Complex Data ◽

Animal Science ◽

Statistical Graphics ◽

R Shiny ◽

Interactive Data

Statistical graphics has advanced significantly in recent years with the development of statistical computing tools that allow us to create reports dynamically, facilitate reproducible research, and explore data interactively. In particular, data visualization is a fundamental aspect of big data analysis in animal science. However, the static nature of standard visualization limits the information that can be displayed and extracted. The objectives of this hands-on workshop are to learn how to utilize interactive visualization and investigate both global and local structures of graphs with useful zoom-in and zoom-out capabilities. We will use the Shiny R package, which is a web application framework for R. A Shiny application has great potential to deliver interactive data analysis and visualization in a web browser. Yet there is limited application of this type of tool in agricultural science. We will learn the capabilities of R Shiny and its use with example applications in animal science and how to aid scientific discoveries and decision-making processes using interactive data exploration tools. After taking this workshop, the participants will be able to understand the concept of R Shiny and develop a web-based interactive visualization tool. The interactive and integrative data visualization features embedded in Shiny applications offer a new resource for users to readily extract extensive information from complex data.

312 Statistical graphics and interactive visualization in animal science

Journal of Animal Science ◽

10.1093/jas/skaa278.083 ◽

2020 ◽

Vol 98 (Supplement_4) ◽

pp. 45-46

Author(s):

Gota Morota

Keyword(s):

Data Analysis ◽

Data Visualization ◽

Interactive Visualization ◽

Agricultural Science ◽

Fundamental Aspect ◽

Complex Data ◽

Animal Science ◽

Statistical Graphics ◽

R Shiny ◽

Interactive Data

Statistical graphics has advanced significantly in recent years with the development of statistical computing tools that allow us to create reports dynamically, facilitate reproducible research, and explore data interactively. In particular, data visualization is a fundamental aspect of big data analysis in animal science. However, the static nature of standard visualization limits the information that can be displayed and extracted. The objectives of this hands-on workshop are to learn how to utilize interactive visualization and investigate both global and local structures of graphs with useful zoom-in and zoom-out capabilities. We will use the Shiny R package, which is a web application framework for R. A Shiny application has great potential to deliver interactive data analysis and visualization in a web browser. Yet there is limited application of this type of tool in agricultural science. We will learn the capabilities of R Shiny and its use with example applications in animal science and how to aid scientific discoveries and decision-making processes using interactive data exploration tools. After taking this workshop, the participants will be able to understand the concept of R Shiny and develop a web-based interactive visualization tool. The interactive and integrative data visualization features embedded in Shiny applications offer a new resource for users to readily extract extensive information from complex data.

XRD Data Visualization, Processing and Analysis with d1Dplot and d2Dplot Software Packages

Proceedings ◽

10.3390/proceedings2020062009 ◽

2020 ◽

Vol 62 (1) ◽

pp. 9

Author(s):

Oriol Vallcorba ◽

Jordi Rius

Keyword(s):

Data Analysis ◽

Data Visualization ◽

X Ray Diffraction ◽

Specific Data ◽

X Ray ◽

Software Packages ◽

Compound Database ◽

Synchrotron Light ◽

User Friendly ◽

Transformation Processes

The d1Dplot and d2Dplot computer programs have been developed as user-friendly tools for the inspection and processing of 1D and 2D X-ray diffraction (XRD) data, respectively. d1Dplot provides general tools for data processing and includes the ability to generate comprehensive 2D plots of multiple patterns to easily follow transformation processes. d2Dplot is a full package for 2D XRD data. Besides general processing tools, it includes specific data analysis routines for the application of the through-the-substrate methodology [Rius et al. IUCrJ 2015, 2, 452–463]. Both programs allow the creation of a user compound database for the identification of crystalline phases. The software can be downloaded from the ALBA Synchrotron Light Source website and can be used free of charge for non-commercial and academic purposes.

Design of Survey Analysis System Based on Data Visualization

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.403-408.1491 ◽

2011 ◽

Vol 403-408 ◽

pp. 1491-1494

Author(s):

Wei Yu

Keyword(s):

Data Analysis ◽

Data Collection ◽

Social Science ◽

Data Visualization ◽

The Internet ◽

Survey Analysis ◽

Science Study ◽

Social Science Study ◽

Analysis System ◽

Three Phases

Questionnaire surveys are one of the most popular methods in social science research; the process consists of three phases: data collection, analysis, and representation. At present, the practical tools for these three phases remain underdeveloped, and the methods used for data analysis and statistics are still simple. This paper designs a Survey Analysis System based on data visualization, which not only supports conducting surveys over the internet but also brings data visualization into the analysis and representation phases, helping users obtain and manipulate data visually and conveniently.

Data Visualization and Analysis of Playing Styles in Tennis

Electronic Imaging ◽

10.2352/issn.2470-1173.2021.1.vda-319 ◽

2021 ◽

Keyword(s):

Data Analysis ◽

International Symposium ◽

Data Visualization ◽

Fast Track ◽

Electronic Imaging

Fast track article for IS&T International Symposium on Electronic Imaging 2021: Visualization and Data Analysis proceedings.

Bottom up proteomics data analysis strategies to explore protein modifications and genomic variants

PROTEOMICS ◽

10.1002/pmic.201400186 ◽

2015 ◽

Vol 15 (11) ◽

pp. 1789-1792 ◽

Cited By ~ 3

Author(s):

Ana Sofia Carvalho ◽

Deborah Penque ◽

Rune Matthiesen

Keyword(s):

Data Analysis ◽

Proteomics Data ◽

Bottom Up ◽

Protein Modifications ◽

Genomic Variants ◽

Analysis Strategies ◽

Proteomics Data Analysis

Performance Data Visualization of Linux Events on Multicores

10.5753/wscad.2021.18516 ◽

2021 ◽

Author(s):

Claudio Scheer ◽

Renato B. Hoffmann ◽

Dalvan Griebler ◽

Isabel H. Manssour ◽

Luiz G. Fernandes

Keyword(s):

Data Visualization ◽

Large Volume ◽

Real World ◽

Interactive Visualization ◽

Performance Data ◽

Optimization Process ◽

Parallel Applications ◽

Storage Space ◽

Tool Chain ◽

Real World Application

Profiling tools are essential to understand the behavior of parallel applications and assist in the optimization process. However, tools such as Perf generate a large amount of data, which requires significant storage space and makes the results harder to reason about. Therefore, we propose VisPerf: a tool-chain and an interactive visualization dashboard for Perf data. The VisPerf tool-chain profiles the application and pre-processes the data, reducing the storage space required by about 50 times. Moreover, we used the visualization dashboard to quickly understand the performance of different events and visualize specific threads and functions of a real-world application.

Bio-Docklets: Virtualization Containers for Single-Step Execution of NGS Pipelines

10.1101/116962 ◽

2017 ◽

Cited By ~ 3

Author(s):

Baekdoo Kim ◽

Thahmina Ali ◽

Carlos Lijeron ◽

Enis Afgan ◽

Konstantinos Krampis

Keyword(s):

Data Analysis ◽

Cloud Service ◽

Application Programming Interface ◽

Single Step ◽

Easy Access ◽

Complex Data ◽

The Galaxy ◽

Ngs Data Analysis ◽

Single Data ◽

Ngs Data

Background: Processing of Next-Generation Sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized post-analysis visualization and data mining software. In order to address some of these challenges, developers have leveraged virtualization containers, towards seamless deployment of preconfigured bioinformatics software and pipelines on any computational platform. Findings: We present an approach for abstracting the complex data operations of multi-step bioinformatics pipelines for NGS data analysis. As examples, we have deployed two pipelines for RNAseq and CHIPseq, pre-configured within Docker virtualization containers we call Bio-Docklets. Each Bio-Docklet exposes a single data input and output endpoint and, from a user perspective, running the pipelines is as simple as running a single bioinformatics tool. This is achieved through a “meta-script” that automatically starts the Bio-Docklets and controls the pipeline execution through the BioBlend software library and the Galaxy Application Programming Interface (API). The pipeline output is post-processed using the Visual Omics Explorer (VOE) framework, providing interactive data visualizations that users can access through a web browser. Conclusions: The goal of our approach is to enable easy access to NGS data analysis pipelines for non-bioinformatics experts, on any computing environment, whether a laboratory workstation, university computer cluster, or a cloud service provider. Besides end-users, Bio-Docklets also enable developers to programmatically deploy and run a large number of pipeline instances for concurrent analysis of multiple datasets.

Moving translational mass spectrometry imaging towards transparent and reproducible data analyses: A case study of an urothelial cancer cohort analyzed in the Galaxy framework

10.1101/2021.08.09.455649 ◽

2021 ◽

Author(s):

Melanie Christine Föll ◽

Veronika Volkmann ◽

Kathrin Enderle-Ammour ◽

Konrad Wilhelm ◽

Dan Guo ◽

...

Keyword(s):

Mass Spectrometry ◽

Quality Control ◽

Data Analysis ◽

Mass Spectrometry Imaging ◽

Urothelial Cancer ◽

Type I ◽

Raw Data ◽

The Galaxy ◽

Muscle Invasive ◽

Reproducible Manner

Background: Mass spectrometry imaging (MSI) derives spatial molecular distribution maps directly from clinical tissue specimens. This allows for spatial characterization of molecular compositions of different tissue types and tumor subtypes, which bears great potential for assisting pathologists with diagnostic decisions or personalized treatments. Unfortunately, progress in translational MSI is often hindered by insufficient quality control and lack of reproducible data analysis. Raw data and analysis scripts are rarely publicly shared. Here, we demonstrate the application of the Galaxy MSI tool set for the reproducible analysis of a urothelial carcinoma dataset. Methods: Tryptic peptides were imaged in a cohort of 39 formalin-fixed, paraffin-embedded human urothelial cancer tissue cores with a MALDI-TOF/TOF device. The complete data analysis was performed in a fully transparent and reproducible manner on the European Galaxy Server. Annotations of tumor and stroma were performed by a pathologist and transferred to the MSI data to allow for supervised classifications of tumor vs. stroma tissue areas as well as for muscle-infiltrating and non-muscle invasive urothelial carcinomas. For putative peptide identifications, m/z features were matched to the MSiMass list. Results: Rigorous quality control in combination with careful pre-processing enabled reduction of m/z shifts and intensity batch effects. High classification accuracy was found for both tumor vs. stroma and muscle-infiltrating vs. non-muscle invasive tumors. Some of the most discriminative m/z features for each condition could be assigned a putative identity: stromal tissue was characterized by collagen type I peptides and tumor tissue by histone and heat shock protein beta-1 peptides. Intermediate filaments such as cytokeratins and vimentin were discriminative between the tumors with different muscle-infiltration status. To make the study fully reproducible and to advocate the criteria of FAIR (findability, accessibility, interoperability, and reusability) research data, we share the raw data, spectra annotations as well as all Galaxy histories and workflows. Data are available via ProteomeXchange with identifier PXD026459 and Galaxy results via https://github.com/foellmelanie/Bladder_MSI_Manuscript_Galaxy_links. Conclusion: Here, we show that translational MSI data analysis in a fully transparent and reproducible manner is possible and we would like to encourage the community to join our efforts.
