Protein Bioinformatics Infrastructure for the Integration and Analysis of Multiple High-Throughput “omics” Data

Advances in Bioinformatics, 2010, Vol 2010, pp. 1-19
DOI: 10.1155/2010/423589
Cited By ~ 11
Author(s): Chuming Chen, Peter B. McGarvey, Hongzhan Huang, Cathy H. Wu
Keyword(s): High Throughput, Data Driven, Dependent Data, Biological Knowledge, Protein Interaction Data, Omics Data, Interaction Data, Data Standard, Omics Data Integration

High-throughput “omics” technologies bring new opportunities for biological and biomedical researchers to ask complex questions and gain new scientific insights. However, the data are voluminous, complex, and context-dependent, and they are maintained in heterogeneous and distributed environments. Together with the lack of well-defined data standards and standardized nomenclature, this poses a major challenge that requires advanced computational methods and bioinformatics infrastructures for integration, mining, visualization, and comparative analysis to facilitate data-driven hypothesis generation and biological knowledge discovery. In this paper, we present the challenges in high-throughput “omics” data integration and analysis; introduce a protein-centric approach for systems integration of large and heterogeneous high-throughput “omics” data, including microarray, mass spectrometry, protein sequence, protein structure, and protein interaction data; and use a scientific case study to illustrate how varied “omics” data from different laboratories can be used to make connections that could lead to new biological knowledge.

High throughput protein-protein interaction data: clues for the architecture of protein complexes

Proteome Science, 2008, Vol 6 (1), pp. 32
DOI: 10.1186/1477-5956-6-32
Cited By ~ 1
Author(s): James R Krycer, Chi Pang, Marc R Wilkins
Keyword(s): High Throughput, Protein Interaction, Protein Complexes, Protein Interaction Data, Interaction Data, Protein Protein Interaction

A Review of Recent Advancement in Integrating Omics Data with Literature Mining towards Biomedical Discoveries

International Journal of Genomics, 2017, Vol 2017, pp. 1-10
DOI: 10.1155/2017/6213474
Cited By ~ 15
Author(s): Kalpana Raja, Matthew Patrick, Yilin Gao, Desmond Madu, Yuyang Yang, ...
Keyword(s): Text Mining, High Throughput, Literature Mining, Biological Knowledge, Omics Data, Huge Number, Automated Generation, The Past, Independent Information, Recent Advancement

In the past decade, the volume of “omics” data generated by high-throughput technologies has expanded exponentially. Managing, storing, and analyzing these data has been a great challenge for researchers, especially when moving towards the goal of generating testable data-driven hypotheses, which has been the promise of high-throughput experimental techniques. Different bioinformatics approaches have been developed to streamline downstream analyses by providing independent information to support biological interpretation and inference. Text mining (also known as literature mining) is one of the commonly used approaches for automated generation of biological knowledge from the huge number of published articles. In this review, we discuss recent advances in approaches that integrate results from omics data with information generated by text mining to uncover novel biomedical information.

Evaluation and comparison of multi-omics data integration methods for cancer subtyping

PLoS Computational Biology, 2021, Vol 17 (8), pp. e1009224
DOI: 10.1371/journal.pcbi.1009224
Author(s): Ran Duan, Lin Gao, Yong Gao, Yuxuan Hu, Han Xu, ...
Keyword(s): Comprehensive Evaluation, Practical Importance, Data Driven, Omics Data, Data Types, Integration Methods, Integrative Studies, The Impact, Gold Standards, Omics Data Integration

Computational integrative analysis has become a significant approach in the data-driven exploration of biological problems. Many integration methods for cancer subtyping have been proposed, but evaluating these methods has become a complicated problem due to the lack of gold standards. Moreover, questions of practical importance remain to be addressed regarding the impact of selecting appropriate data types and combinations on the performance of integrative studies. Here, we constructed three classes of benchmarking datasets of nine cancers in TCGA by considering all eleven combinations of four multi-omics data types. Using these datasets, we conducted a comprehensive evaluation of ten representative integration methods for cancer subtyping in terms of accuracy (measured by combining clustering accuracy and clinical significance), robustness, and computational efficiency. We subsequently investigated the influence of different omics data on cancer subtyping and the effectiveness of their combinations. Refuting the widely held intuition that incorporating more types of omics data always produces better results, our analyses showed that there are situations where integrating more omics data negatively impacts the performance of integration methods. Our analyses also suggested several effective combinations for most of the cancers studied, which may be of particular interest to researchers in omics data analysis.
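
The count of eleven combinations is consistent with taking every subset of at least two of the four data types (6 pairs + 4 triples + 1 quadruple). The short Python sketch below checks that arithmetic. The reading "subsets of size two or more" is an assumption, and the data-type labels are hypothetical placeholders, since the abstract does not name the four types.

```python
from itertools import combinations

# Hypothetical placeholder labels; the abstract does not name the four data types.
data_types = ["type_A", "type_B", "type_C", "type_D"]

# Assumption: a "combination" is any subset containing at least two data types,
# because integrating a single data type is not integration.
combos = [
    subset
    for size in range(2, len(data_types) + 1)
    for subset in combinations(data_types, size)
]

for subset in combos:
    print(" + ".join(subset))

print(len(combos))  # 6 pairs + 4 triples + 1 quadruple = 11
```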

Patient-specific multi-omics models and the application in personalized combination therapy

Future Oncology, 2020
DOI: 10.2217/fon-2020-0119
Cited By ~ 1
Author(s): August John, Bo Qin, Krishna R Kalari, Liewei Wang, Jia Yu
Keyword(s): Combination Therapy, Data Integration, High Throughput, Sharp Decrease, Patient Specific, Clinical Settings, Omics Data, Disease Treatment, Lower Drug, Omics Data Integration

The rapid advancement of high-throughput technologies and a sharp decrease in cost have opened up the possibility of generating large amounts of multi-omics data on an individual basis. The development of high-throughput -omics, including genomics, epigenomics, transcriptomics, proteomics, metabolomics and microbiomics, enables the application of multi-omics technologies in clinical settings. Combination therapy, defined as disease treatment with two or more drugs to achieve efficacy with lower doses or lower drug toxicity, is the basis for the care of diseases like cancer. Patient-specific multi-omics data integration can help in the identification and development of combination therapies. In this review, we provide an overview of different -omics platforms and discuss methods for high-throughput multi-omics data integration and personalized combination therapy.

Making the most of high-throughput protein-interaction data

Genome Biology, 2007, Vol 8 (10), pp. 112
DOI: 10.1186/gb-2007-8-10-112
Cited By ~ 31
Author(s): Robert Gentleman, Wolfgang Huber
Keyword(s): High Throughput, Protein Interaction, Protein Interaction Data, Interaction Data

An ontology-empowered model for annotating protein-protein interaction data: a case study for budding yeast

2008 IEEE International Conference on Information Reuse and Integration, 2008
DOI: 10.1109/iri.2008.4583057
Author(s): Arash Shaban-Nejad, Volker Haarslev
Keyword(s): Protein Interaction, Budding Yeast, Protein Interaction Data, Interaction Data, Protein Protein Interaction

Computational identification of signaling pathways in protein interaction networks

F1000Research, 2015, Vol 4, pp. 1522
DOI: 10.12688/f1000research.7591.1
Author(s): Angela U. Makolo, Temitayo A. Olagunju
Keyword(s): Signaling Pathways, High Throughput, Protein Interaction, Protein Interaction Network, Interaction Network, Protein Interaction Networks, Interaction Networks, Protein Interaction Data, Interaction Data, Protein Protein Interaction

Knowledge of signaling pathways is central to understanding the biological mechanisms of organisms, since it has been identified that, in eukaryotic organisms, the number of signaling pathways determines the number of ways the organism will react to external stimuli. Signaling pathways are studied using protein interaction networks constructed from protein-protein interaction data obtained from high-throughput experiments. However, these high-throughput methods are known to produce very high rates of false positive and false negative interactions. To construct a useful protein interaction network from these noisy data, computational methods are applied to validate the protein-protein interactions. In this study, a computational technique was designed to identify signaling pathways from a protein interaction network constructed using validated protein-protein interaction data. A weighted interaction graph of Saccharomyces cerevisiae was constructed. The weights were obtained using a Bayesian probabilistic network to estimate the posterior probability of interaction between two proteins, given gene expression measurements as biological evidence. Only interactions above a threshold were accepted for the network model. We were able to identify some pathway segments, one of which is a segment of the pathway that signals the start of meiosis in S. cerevisiae.
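
The weighting and thresholding step can be illustrated with a minimal Python sketch. It is not the authors' Bayesian network: it applies plain Bayes' rule to a single piece of evidence per protein pair, and the protein names, prior, likelihoods, and threshold are all hypothetical values chosen only for illustration.

```python
def posterior_interaction(prior, lik_if_interacting, lik_if_not):
    """Bayes' rule: P(interaction | evidence) for one candidate protein pair."""
    numerator = lik_if_interacting * prior
    evidence = numerator + lik_if_not * (1.0 - prior)
    return numerator / evidence


# Hypothetical inputs: for each candidate pair, the likelihood of the observed
# expression evidence if the pair interacts and if it does not.
candidates = {
    ("P1", "P2"): (0.9, 0.2),
    ("P2", "P3"): (0.3, 0.6),
    ("P1", "P3"): (0.7, 0.1),
}
PRIOR = 0.1      # assumed prior probability that a random pair interacts
THRESHOLD = 0.3  # assumed acceptance cut-off on the posterior

# Weighted interaction graph: keep only edges whose posterior exceeds the cut-off.
network = {}
for pair, (lik_int, lik_not) in candidates.items():
    weight = posterior_interaction(PRIOR, lik_int, lik_not)
    if weight > THRESHOLD:
        network[pair] = weight

for (a, b), weight in sorted(network.items()):
    print(f"{a} - {b}: {weight:.4f}")
```

With these made-up numbers, the pair whose evidence is more likely under "no interaction" falls below the cut-off and is dropped, while the other two edges are kept with their posteriors as weights.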

Enter the matrix: factorization uncovers knowledge from omics

2017
DOI: 10.1101/196915
Cited By ~ 2
Author(s): Genevieve L. Stein-O’Brien, Raman Arora, Aedin C. Culhane, Alexander V. Favorov, Lana X. Garmire, ...
Keyword(s): High Throughput, Matrix Factorization, Time Course, High Dimensional Data, Dimensional Structure, High Dimensional, Biological Knowledge, Omics Data, Cellular Interactions, Low Dimensional

Omics data contain signal from the molecular, physical, and kinetic inter- and intra-cellular interactions that control biological systems. Matrix factorization techniques can reveal low-dimensional structure from high-dimensional data that reflects these interactions. These techniques can uncover new biological knowledge from diverse high-throughput omics data in topics ranging from pathway discovery to time course analysis. We review exemplary applications of matrix factorization for systems-level analyses. We discuss appropriate application of these methods and their limitations, and focus on analysis of results to facilitate optimal biological interpretation. The inference of biologically relevant features with matrix factorization enables discovery from high-throughput data beyond the limits of current biological knowledge, answering questions from high-dimensional data that we have not yet thought to ask.
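
As one concrete instance of the idea, the Python sketch below factorizes a small nonnegative matrix V into two low-rank factors, V ≈ W H, using non-negative matrix factorization with multiplicative updates. The choice of NMF, the synthetic data, the rank k=2, and the function name `nmf` are assumptions made for illustration; the abstract covers matrix factorization in general and does not prescribe this method.

```python
import numpy as np


def nmf(V, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (n x m) as W @ H with W (n x k), H (k x m).

    Uses multiplicative updates that minimise the Frobenius reconstruction error
    while keeping every entry of W and H nonnegative.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic "features x samples" matrix built from two hidden nonnegative patterns.
rng = np.random.default_rng(1)
V = rng.random((20, 2)) @ rng.random((2, 6))

W, H = nmf(V, k=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape, f"relative reconstruction error: {rel_err:.4f}")
```

In an omics setting, the columns of W would be read as feature-level patterns (for example, groups of co-varying genes) and the rows of H as the weight of each pattern in each sample, which is the low-dimensional structure the abstract refers to.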

What do we learn from high-throughput protein interaction data?

Expert Review of Proteomics, 2004, Vol 1 (1), pp. 111-121
DOI: 10.1586/14789450.1.1.111
Cited By ~ 62
Author(s): Bjorn Titz, Matthias Schlesner, Peter Uetz
Keyword(s): High Throughput, Protein Interaction, Protein Interaction Data, Interaction Data

MAPPI-DAT: data management and analysis for protein-protein interaction data from the high-throughput MAPPIT cell microarray platform

Bioinformatics, 2017, pp. btx014
DOI: 10.1093/bioinformatics/btx014
Cited By ~ 1
Author(s): Surya Gupta, Veronic De Puysseleyr, José Van der Heyden, Davy Maddelein, Irma Lemmens, ...
Keyword(s): Data Management, High Throughput, Protein Interaction, Microarray Platform, Protein Interaction Data, Interaction Data, Protein Protein Interaction, Cell Microarray
