Development of KBCommons: universal informatics framework for multi-omics translational research

Mapping Intimacies ◽

10.32469/10355/66748 ◽

2017 ◽

Author(s):

Siva Ratna Kumari Narisetti

Keyword(s):

Gene Expression ◽

Knowledge Base ◽

Differential Expression ◽

Biological Data ◽

Data Type ◽

Omics Data ◽

Web Based ◽

Coding Sequence ◽

Plain Text ◽

Web Resource

Multi-level 'OMICS' data integration for multiple organisms has been one of the major challenges in the era of advanced next-generation sequencing and high-performance technologies. Biological data are being produced at a tremendous rate now that high-throughput sequencing technologies are available at low cost and high speed. However, these data are often stored separately across different web resources according to data type and organism, making them difficult to find and integrate. Many websites store data of different types and display them as pie charts or plain text, but limit their data to a single fixed organism. Such web-based multi-omics analysis is an efficient and easy way to analyze data, but it is of little use to researchers working with other organisms or with complex data. Complex multi-omics data require extensive data management, exhaustive computational analysis, and effective integration in order to provide a one-stop, interactive, web-based portal for browsing, accessing, analyzing, integrating and sharing knowledge about genomics and molecular mechanisms, with ultimate links to phenotypes and traits for many different organisms. To achieve this, we have developed Knowledge Base Commons (KBCommons), a platform that automates the process of establishing the database and making the tools available for organisms via a dedicated web resource. KBCommons currently supports four categories: Plants and Crops; Animals and Pets; Humans and Diseases; and Microbes and Viruses. It has four main functionalities: Browse KBCommons, Contribute to KB, Add version to KB, and Create a new KB. Using KBCommons, research groups working on different organisms can share their data and access one another's. KBCommons is an automated framework built on the widely used Laravel PHP framework, which handles complex and diverse biological datasets efficiently.
In the Browse KBCommons section, all existing organisms are displayed under each category, and organisms that can be used as model organisms are also shown. KBCommons displays the logo of each organism along with its existing versions, giving a detailed overview of all existing organisms. Users can browse the existing data for each organism with various tools, including Blast, Multiple Sequence Alignment, Motif Sampler, etc., by going to that organism's page. Users can also visualize gene expression and differential expression data as pie charts and plain text. Add version to KB and Create a new KB follow similar steps, and in both the user must supply the corresponding data in each section. When an organism of interest does not yet exist, the user can create a new Knowledge Base for it with six essential files: genome sequence, protein coding sequence (amino acid), gene coding sequence (nucleotide), spliced mRNA transcripts, mRNA sequences in GFF3, and a functional annotation file. In Add version to KB, if an organism already exists, the user can add a new version to the existing KB by supplying these six essential files for the new version. In Contribute to KB, users can upload multi-omics data including Transcriptomics (RNA-Seq and Microarray); Proteomics (Mass Spectrometry and 2DGel); and Epigenomics (Bisulphite Sequencing, Methylation Array, and MBD-Seq Array). We support gene expression, protein expression, or methylation data as well as differential expression comparison for each data type. We also support additional entities including miRNA/sRNA, Metabolite, SNP/GWAS, Plant introduction lines/Animal strains, and Phenotype/Trait/Diseases.
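
The six-file requirement for Create a new KB and Add version to KB can be pictured as a simple completeness check on a submission. The sketch below is only an illustration, not KBCommons code: the role names, the `missing_files` helper and the example paths are all hypothetical, and the split of the abstract's list into six roles is one reading of its wording.

```python
# Hypothetical role names for the six essential files named in the abstract.
REQUIRED_FILES = (
    "genome_sequence",
    "protein_coding_sequence",   # amino acid
    "gene_coding_sequence",      # nucleotide
    "spliced_mrna_transcripts",
    "gff3",
    "functional_annotation",
)


def missing_files(submission: dict[str, str]) -> list[str]:
    """Return the required roles that are absent or mapped to an empty path."""
    return [role for role in REQUIRED_FILES if not submission.get(role)]


if __name__ == "__main__":
    # Made-up example: a submission that lacks two of the six files.
    submission = {
        "genome_sequence": "genome.fa",
        "protein_coding_sequence": "proteins.fa",
        "gene_coding_sequence": "cds.fa",
        "gff3": "genes.gff3",
    }
    print(missing_files(submission))
```

A new KB or a new version would only proceed when the returned list is empty; how KBCommons itself validates uploads is not described in the abstract.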

Knowledge Base Commons (KBCommons) v1.1: a universal framework for multi-omics data integration and biological discoveries

BMC Genomics ◽

10.1186/s12864-019-6287-8 ◽

2019 ◽

Vol 20 (S11) ◽

Cited By ~ 3

Author(s):

Shuai Zeng ◽

Zhen Lyu ◽

Siva Ratna Kumari Narisetti ◽

Dong Xu ◽

Trupti Joshi

Keyword(s):

Knowledge Base ◽

Data Storage ◽

High Performance ◽

Data Retrieval ◽

Omics Data ◽

Web Interface ◽

Web Resource ◽

Privilege Management ◽

Multi Level ◽

Comprehensive Framework

Abstract. Background: Knowledge Base Commons (KBCommons) v1.1 is a universal and all-inclusive web-based framework providing generic functionalities for storing, sharing, analyzing, exploring, integrating and visualizing multiple organisms’ genomics and integrative omics data. KBCommons is designed and developed to integrate diverse multi-level omics data and to support biological discoveries for all species via a common platform. Methods: KBCommons has four modules including data storage, data processing, data accessing, and web interface for data management and retrieval. It provides a comprehensive framework for new plant-specific, animal-specific, virus-specific, bacteria-specific or human disease-specific knowledge base (KB) creation, for adding new genome versions and additional multi-omics data to existing KBs, and for exploring existing datasets within current KBs. Results: KBCommons has an array of tools for data visualization and data analytics such as multiple gene/metabolite search, gene family/Pfam/Panther function annotation search, miRNA/metabolite/trait/SNP search, differential gene expression analysis, and bulk data download capacity. It contains a highly reliable data privilege management system to make users’ data publicly available easily and to share private or pre-publication data with members in their collaborative groups safely and securely. It allows users to conduct data analysis using our in-house developed workflow functionalities that are linked to XSEDE high performance computing resources. Using KBCommons’ intuitive web interface, users can easily retrieve genomic data, multi-omics data and analysis results from workflow according to their requirements and interests. Conclusions: KBCommons addresses the needs of many diverse research communities to have a comprehensive multi-level OMICS web resource for data retrieval, sharing, analysis and visualization. KBCommons can be publicly accessed through a dedicated link for all organisms at http://kbcommons.org/.

PaintOmics 3: a web resource for the pathway analysis and visualization of multi-omics data

10.1101/281295 ◽

2018 ◽

Author(s):

Rafael Hernández-de-Diego ◽

Sonia Tarazona ◽

Carlos Martínez-Mira ◽

Leandro Balzano-Nogueira ◽

Pedro Furió-Tarí ◽

...

Keyword(s):

Data Analysis ◽

Pathway Analysis ◽

Feature Matching ◽

Omics Data ◽

Data Types ◽

Web Based ◽

Web Resource ◽

Analysis Workflow ◽

Pathway Diagrams ◽

Molecular Layers

Abstract: The increasing availability of multi-omic platforms poses new challenges to data analysis. Joint visualization of multi-omics data is instrumental to understand interconnections across molecular layers and to fully leverage the biology discovery power offered by the multi-omics approach. We present here PaintOmics 3, a web-based resource for the integrated visualization of multiple omic data types onto KEGG pathway diagrams. PaintOmics 3 combines server-end capabilities for data analysis with the potential of modern web resources for data visualization, providing researchers with a powerful framework for interactive exploration of their multi-omics information. Unlike other visualization tools, PaintOmics 3 covers a complete pathway analysis workflow, including automatic feature name/identifier conversion, multi-layered feature matching, pathway enrichment, network analysis, interactive heatmaps, trend charts, etc. It accepts a wide variety of omic types, including transcriptomics, proteomics and metabolomics, as well as region-based approaches such as ATAC-seq or ChIP-seq data. The tool is freely available at http://bioinfo.cipf.es/paintomics/.
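
The abstract lists pathway enrichment as one step of the workflow without saying which test is used. A common choice for this kind of analysis is a one-sided hypergeometric test, and the sketch below shows that test under this assumption; it is not taken from PaintOmics 3, and the function name, parameter names and example counts are illustrative only.

```python
from math import comb


def enrichment_p_value(population: int, in_pathway: int,
                       selected: int, overlap: int) -> float:
    """One-sided hypergeometric p-value P(X >= overlap).

    population: number of measured features (e.g. genes)
    in_pathway: how many of them are annotated to the pathway
    selected:   how many features the user flagged as relevant
    overlap:    how many flagged features fall in the pathway
    """
    upper = min(in_pathway, selected)
    total = comb(population, selected)
    # math.comb returns 0 when k > n, so impossible terms vanish on their own.
    favourable = sum(
        comb(in_pathway, i) * comb(population - in_pathway, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Made-up counts: 20 measured genes, 5 in the pathway, 4 flagged, 3 shared.
    print(enrichment_p_value(20, 5, 4, 3))
```

A small p-value suggests that the flagged features are over-represented in the pathway; in practice the values for many pathways would also need multiple-testing correction.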

ADAS-viewer: web-based application for integrative analysis of multi-omics data in Alzheimer’s disease

npj Systems Biology and Applications ◽

10.1038/s41540-021-00177-7 ◽

2021 ◽

Vol 7 (1) ◽

Author(s):

Seonggyun Han ◽

Jaehang Shin ◽

Hyeim Jung ◽

Jane Ryu ◽

Habtamu Minassie ◽

...

Keyword(s):

Alzheimer's Disease ◽

Neurodegenerative Disorder ◽

Inferior Frontal Gyrus ◽

Temporal Cortex ◽

Brain Regions ◽

Biological Data ◽

Omics Data ◽

Molecular Architecture ◽

Web Based

Abstract: Alzheimer’s disease (AD) is a neurodegenerative disorder characterized by complicated biological mechanisms and the complexity of brain tissue. Our understanding of the complicated molecular architecture that contributes to AD progression benefits from comprehensive and systemic investigations with multi-layered molecular and biological data from different brain regions. Because independent studies have recently generated various omics data from different brain regions of AD patients, multi-omics data integration can be a useful resource for a more comprehensive understanding of AD. Here we present a web platform, ADAS-viewer, that provides researchers with the ability to comprehensively investigate and visualize multi-omics data from multiple brain regions of AD patients. ADAS-viewer offers means to identify functional changes in transcript and exon expression (i.e., alternative splicing) along with associated genetic or epigenetic regulatory effects. Specifically, it integrates genomic, transcriptomic, methylation, and miRNA data collected from seven different brain regions (cerebellum, temporal cortex, dorsolateral prefrontal cortex, frontal pole, inferior frontal gyrus, parahippocampal gyrus, and superior temporal gyrus) across three independent cohort datasets. ADAS-viewer is particularly useful as a web-based application for analyzing and visualizing multi-omics data across multiple brain regions at both transcript and exon level, allowing the identification of candidate biomarkers of Alzheimer’s disease.

GEOMetaCuration: A web-based application for accurate manual curation of Gene Expression Omnibus metadata

10.1101/257444 ◽

2018 ◽

Author(s):

Zhao Li ◽

Jin Li ◽

Peng Yu

Keyword(s):

Gene Expression ◽

Large Scale ◽

Gene Expression Omnibus ◽

Biological Data ◽

Use Case ◽

Web Based ◽

Link Type ◽

Manual Curation ◽

Development Framework ◽

Biological Discovery

Abstract: Metadata curation has become increasingly important for biological discovery and biomedical research because a large amount of heterogeneous biological data is currently freely available. To facilitate efficient metadata curation, we developed an easy-to-use web-based curation application, GEOMetaCuration, for curating the metadata of Gene Expression Omnibus datasets. It can eliminate mechanical operations that consume precious curation time and can help coordinate curation efforts among multiple curators. It improves the curation process by introducing various features that are critical to metadata curation, such as a back-end curation management system and a curator-friendly front-end. The application is based on a commonly used web development framework of Python/Django and is open-sourced under the GNU General Public License V3. GEOMetaCuration is expected to benefit the biocuration community and to contribute to computational generation of biological insights using large-scale biological data. An example use case can be found at the demo website: http://geometacuration.yubiolab.org. Source code URL: https://bitbucket.com/yubiolab/GEOMetaCuration

Development of Soybean Knowledge Base (SoyKB), a multi-omics data integration web resource for bridging molecular breeding and translational genomics in Glycine Max

10.32469/10355/43323 ◽

2013 ◽

Author(s):

Trupti Joshi

Keyword(s):

Glycine Max ◽

Data Integration ◽

Knowledge Base ◽

Molecular Breeding ◽

Omics Data ◽

Translational Genomics ◽

Web Resource ◽

Omics Data Integration

Differential Expression Gene Explorer (DrEdGE): a tool for generating interactive online visualizations of gene expression datasets

Bioinformatics ◽

10.1093/bioinformatics/btz972 ◽

2020 ◽

Vol 36 (8) ◽

pp. 2581-2583 ◽

Cited By ~ 2

Author(s):

Sophia C Tintori ◽

Patrick Golden ◽

Bob Goldstein

Keyword(s):

Gene Expression ◽

Caenorhabditis Elegans ◽

Differential Expression ◽

Supplementary Information ◽

Supplementary Data ◽

Online Data ◽

Web Based ◽

Neuronal Tissue ◽

Differential Expression Gene ◽

Data Visualizations

Abstract. Summary: Differential Expression Gene Explorer (DrEdGE) is a web-based tool that guides genomicists through easily creating interactive online data visualizations, which colleagues can query according to their own conditions to discover genes, samples or patterns of interest. We demonstrate DrEdGE’s features with three example websites generated from publicly available datasets—human neuronal tissue, mouse embryonic tissue and Caenorhabditis elegans whole embryos. DrEdGE increases the utility of large genomics datasets by removing technical obstacles to independent exploration. Availability and implementation: Freely available at http://dredge.bio.unc.edu. Supplementary information: Supplementary data are available at Bioinformatics online.

Bloodspot: A Web Resource Facilitating the Analysis of Transcriptional Programs in Normal and Malignant Hematopoiesis

Blood ◽

10.1182/blood.v126.23.2358.2358 ◽

2015 ◽

Vol 126 (23) ◽

pp. 2358-2358

Author(s):

Frederik Otzen Bagger ◽

Damir Sasivarevic ◽

Sina Hadi Sohi ◽

Linea Gøricke Laursen ◽

Sachin Pundhir ◽

...

Keyword(s):

Gene Expression ◽

Conflicts Of Interest ◽

Expression Profiles ◽

Single Gene ◽

Expression Patterns ◽

Web Based ◽

Web Resource ◽

Kaplan Meier ◽

Blood Formation ◽

Key Aspects

Abstract: Decades of studies on developing cells in human and murine haematopoiesis have resulted in a large number of gene-expression datasets that may answer questions regarding normal and aberrant blood formation. To researchers and clinicians with limited bioinformatics experience, these data remain available, yet largely inaccessible. Current resources provide information about gene-expression patterns but disregard key aspects such as genetic co-regulation of genes, and the effects on patient survival. Here, we present a new web-based resource, termed BloodSpot, which provides a) a comprehensive representation of gene-expression throughout haematopoiesis, b) a single gene-based Kaplan-Meier analysis and c) a novel, simpler, yet more informative, type of expression plot. Significantly, users can compare their own expression profiles to normal haematopoietic populations within the statistical framework of BloodSpot. We illustrate the potential of key features in BloodSpot to identify new putative C/EBPa targets. Accessible at http://servers.binf.ku.dk/bloodspot/. Disclosures: No relevant conflicts of interest to declare.
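
The single gene-based Kaplan-Meier analysis mentioned in this abstract rests on the standard product-limit estimator. The sketch below is a minimal pure-Python version of that estimator, offered only as an illustration and not as BloodSpot's implementation; the function name and the toy follow-up values are invented.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up duration for each patient
    events: 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # censored patients leave the risk set too
    return curve


if __name__ == "__main__":
    # Toy values: four patients, the second one censored at time 2.
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

For a single-gene analysis, one plausible use is to split patients into high and low expression groups for that gene and estimate one curve per group; the abstract does not state how BloodSpot forms its groups.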

Prediction of cancer mutation states using multiple data modalities reveals the utility and consistency of gene expression and DNA methylation

10.1101/2021.10.27.466140 ◽

2021 ◽

Author(s):

Jake Crawford ◽

Brock C Christensen ◽

Maria Chikina ◽

Casey S Greene

Keyword(s):

Gene Expression ◽

DNA Methylation ◽

RNA Sequencing ◽

Genetic Alterations ◽

Cellular Function ◽

Data Type ◽

Omics Data ◽

Data Types ◽

Cancer Mutation ◽

Combining Data

In studies of cellular function in cancer, researchers are increasingly able to choose from many -omics assays as functional readouts. Choosing the correct readout for a given study can be difficult, and which layer of cellular function is most suitable to capture the relevant signal may be unclear. In this study, we consider prediction of cancer mutation status (presence or absence) from functional -omics data as a representative problem. Since functional signatures of cancer mutation have been identified across many data types, this problem presents an opportunity to quantify and compare the ability of different -omics readouts to capture signals of dysregulation in cancer. The TCGA Pan-Cancer Atlas contains genetic alteration data including somatic mutations and copy number variants (CNVs), as well as several -omics data types. From TCGA, we focus on RNA sequencing, DNA methylation arrays, reverse phase protein arrays (RPPA), microRNA, and somatic mutational signatures as -omics readouts. Across a collection of cancer-associated genetic alterations, RNA sequencing and DNA methylation were the most effective predictors of alteration state. Surprisingly, we found that for most alterations, they were approximately equally effective predictors. The target gene was the primary driver of performance, rather than the data type, and there was little difference between the top data types for the majority of genes. We also found that combining data types into a single multi-omics model often provided little or no improvement in predictive ability over the best individual data type. Based on our results, for the design of studies focused on the functional outcomes of cancer mutations, we recommend focusing on gene expression or DNA methylation as first-line readouts.
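
Comparing data types as predictors of mutation status comes down to scoring each model's predictions against the known labels. The abstract does not name the metric the authors used; the sketch below assumes the area under the ROC curve (AUROC), a common choice for binary problems like this one. The function name, the data-type names and all numbers are placeholders and do not come from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for mutated samples, 0 for non-mutated samples
    scores: predicted score per sample (higher means more likely mutated)
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    # Placeholder prediction scores from two hypothetical single-omics models.
    predictions = {
        "expression": [0.9, 0.4, 0.5, 0.1],
        "methylation": [0.8, 0.7, 0.3, 0.2],
    }
    for data_type, scores in predictions.items():
        print(data_type, auroc(labels, scores))
```

Computing the same score for each data type on the same held-out samples gives a like-for-like comparison of the kind the abstract describes.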
