Exploring Integrative Analysis using the BioMedical Evidence Graph

2019 ◽  
Author(s):  
Adam Struck ◽  
Brian Walsh ◽  
Alexander Buchanan ◽  
Jordan A. Lee ◽  
Ryan Spangler ◽  
...  

Abstract The analysis of cancer biology data involves extremely heterogeneous datasets including information from RNA sequencing, genome-wide copy number, DNA methylation data reporting on epigenomic regulation, somatic mutations from whole-exome or whole-genome analyses, pathology estimates from imaging sections or subtyping, drug response or other treatment outcomes, and various other clinical and phenotypic measurements. Bringing these different resources into a common framework, with a data model that allows for complex relationships as well as dense vectors of features, will unlock integrative analysis. We introduce a graph database and query engine for discovery and analysis of cancer biology, called the BioMedical Evidence Graph (BMEG). The BMEG is unique from other biological data graphs in that sample level molecular information is connected to reference knowledge bases. It combines gene expression and mutation data, with drug response experiments, pathway information databases and literature derived associations. The construction of the BMEG has resulted in a graph containing over 36M vertices and 29M edges. The BMEG system provides a graph query based API to enable analysis, with client code available for Python, Javascript and R, and a server online at bmeg.io. Using this system we have developed several forms of integrated analysis to demonstrate the utility of the system. The BMEG is an evolving resource dedicated to enabling integrative analysis. We have demonstrated queries on the system that illustrate mutation significance analysis, drug response machine learning, patient level knowledge base queries and pathway level analysis. 
We have compared the resulting graph to other available integrated graph systems, and demonstrated that it is unique in the scale of the graph and the type of data it makes available.

Highlights
Data resource connecting an extremely diverse set of cancer data sets
Graph query engine that can be easily deployed and used on new datasets
Easily installed Python client
Server online at bmeg.io

Summary The analysis of cancer biology data involves extremely heterogeneous datasets. Bringing these different resources into a common framework, with a data model that allows for complex relationships as well as dense vectors of features, will unlock integrative analysis. We introduce a graph database and query engine for discovery and analysis of cancer biology, called the BioMedical Evidence Graph (BMEG). The construction of the BMEG has resulted in a graph containing over 36M vertices and 29M edges. The BMEG system provides a graph query based API to enable analysis, with client code available for Python, Javascript and R, and a server online at bmeg.io. Using this system we have developed several forms of integrated analysis to demonstrate the utility of the system.

2020 ◽  
pp. 147-159 ◽  
Author(s):  
Adam Struck ◽  
Brian Walsh ◽  
Alexander Buchanan ◽  
Jordan A. Lee ◽  
Ryan Spangler ◽  
...  

PURPOSE The analysis of cancer biology data involves extremely heterogeneous data sets, including information from RNA sequencing, genome-wide copy number, DNA methylation data reporting on epigenetic regulation, somatic mutations from whole-exome or whole-genome analyses, pathology estimates from imaging sections or subtyping, drug response or other treatment outcomes, and various other clinical and phenotypic measurements. Bringing these different resources into a common framework, with a data model that allows for complex relationships as well as dense vectors of features, will unlock integrated data set analysis. METHODS We introduce the BioMedical Evidence Graph (BMEG), a graph database and query engine for discovery and analysis of cancer biology. The BMEG is unique from other biologic data graphs in that sample-level molecular and clinical information is connected to reference knowledge bases. It combines gene expression and mutation data with drug-response experiments, pathway information databases, and literature-derived associations. RESULTS The construction of the BMEG has resulted in a graph containing > 41 million vertices and 57 million edges. The BMEG system provides a graph query–based application programming interface to enable analysis, with client code available for Python, Javascript, and R, and a server online at bmeg.io. Using this system, we have demonstrated several forms of cross–data set analysis to show the utility of the system. CONCLUSION The BMEG is an evolving resource dedicated to enabling integrative analysis. We have demonstrated queries on the system that illustrate mutation significance analysis, drug-response machine learning, patient-level knowledge-base queries, and pathway level analysis. We have compared the resulting graph to other available integrated graph systems and demonstrated the former is unique in the scale of the graph and the type of data it makes available.
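The idea of connecting sample-level data to reference knowledge through graph traversal can be illustrated with a toy in-memory property graph. This is only a sketch of the general concept: the `TinyGraph` class, the vertex identifiers, and the edge labels below are all invented for illustration and are not the BMEG schema or its client API.

```python
from collections import defaultdict


class TinyGraph:
    """Hypothetical minimal property graph; not the BMEG data model."""

    def __init__(self):
        self.vertices = {}                  # vertex id -> properties
        self.out_edges = defaultdict(list)  # vertex id -> [(label, dst)]

    def add_vertex(self, vid, label, **props):
        self.vertices[vid] = {"label": label, **props}

    def add_edge(self, src, edge_label, dst):
        self.out_edges[src].append((edge_label, dst))

    def out(self, vids, edge_label):
        """Follow outgoing edges with the given label from each vertex id."""
        found = []
        for vid in vids:
            for lbl, dst in self.out_edges[vid]:
                if lbl == edge_label and dst not in found:
                    found.append(dst)
        return found


# Invented placeholder data: one sample, one gene, one compound, one pathway.
g = TinyGraph()
g.add_vertex("sample:S1", "Sample")
g.add_vertex("gene:G1", "Gene")
g.add_vertex("compound:D1", "Compound")
g.add_vertex("pathway:P1", "Pathway")
g.add_edge("sample:S1", "mutated_gene", "gene:G1")         # sample-level data
g.add_edge("gene:G1", "associated_compound", "compound:D1")  # knowledge base
g.add_edge("gene:G1", "member_of", "pathway:P1")             # pathway database

# Two-hop traversal: sample -> mutated genes -> associated compounds.
genes = g.out(["sample:S1"], "mutated_gene")
compounds = g.out(genes, "associated_compound")
```

Chaining `out` calls mirrors how a graph query walks from a patient sample to reference annotations without joining separate tables by hand.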


Genes ◽  
2021 ◽  
Vol 12 (4) ◽  
pp. 549
Author(s):  
Amal Qattan ◽  
Taher Al-Tweigeri ◽  
Wafa Alkhayal ◽  
Kausar Suleman ◽  
Asma Tulbah ◽  
...  

Resistance to therapy is a persistent problem that leads to mortality in breast cancer, particularly triple-negative breast cancer (TNBC). MiRNAs have become a focus of investigation as tissue-specific regulators of gene networks related to drug resistance. Circulating miRNAs are readily accessible non-invasive potential biomarkers for TNBC diagnosis, prognosis, and drug-response. Our aim was to use systems biology, meta-analysis, and network approaches to delineate the drug resistance pathways and clinical outcomes associated with circulating miRNAs in TNBC patients. MiRNA expression analysis was used to investigate differentially regulated circulating miRNAs in TNBC patients, and integrated pathway regulation, gene ontology, and pharmacogenomic network analyses were used to identify target genes, miRNAs, and drug interaction networks. Herein, we identified significant differentially expressed circulating miRNAs in TNBC patients (miR-19a/b-3p, miR-25-3p, miR-22-3p, miR-210-3p, miR-93-5p, and miR-199a-3p) that regulate several molecular pathways (PAM (PI3K/Akt/mTOR), HIF-1, TNF, FoxO, Wnt, and JAK/STAT, PD-1/PD-L1 pathways and EGFR tyrosine kinase inhibitor resistance (TKIs)) involved in drug resistance. Through meta-analysis, we demonstrated an association of upregulated miR-93, miR-210, miR-19a, and miR-19b with poor overall survival outcomes in TNBC patients. These results identify miRNA-regulated mechanisms of drug resistance and potential targets for combination with chemotherapy to overcome drug resistance in TNBC. We demonstrate that integrated analysis of multi-dimensional data can unravel mechanisms of drug-resistance related to circulating miRNAs, particularly in TNBC. These circulating miRNAs may be useful as markers of drug response and resistance in the guidance of personalized medicine for TNBC.


Author(s):  
Sai Moturu

As John Muir noted, “When we try to pick out anything by itself, we find it hitched to everything else in the Universe” (Muir, 1911). In tune with Muir’s elegantly stated notion, research in molecular biology is progressing toward a systems level approach, with a goal of modeling biological systems at the molecular level. To achieve such a lofty goal, the analysis of multiple datasets is required to form a clearer picture of entire biological systems (Figure 1). Traditional molecular biology studies focus on a specific process in a complex biological system. The availability of high-throughput technologies allows us to sample tens of thousands of features of biological samples at the molecular level. Even so, these are limited to one particular view of a biological system governed by complex relationships and feedback mechanisms on a variety of levels. Integrated analysis of varied biological datasets from the genetic, translational, and protein levels promises more accurate and comprehensive results, which help discover concepts that cannot be found through separate, independent analyses. With this article, we attempt to provide a comprehensive review of the existing body of research in this domain.


2019 ◽  
Vol 19 (1) ◽  
Author(s):  
Jianzhi Shi ◽  
Wenlei Wang ◽  
Yinghui Lin ◽  
Kai Xu ◽  
Yan Xu ◽  
...  

Abstract Background Pyropia haitanensis, which is distributed in the intertidal zone, can tolerate water losses exceeding 90%. However, the mechanisms enabling P. haitanensis to survive harsh conditions remain uncharacterized. To elucidate the mechanism underlying P. haitanensis desiccation tolerance, we completed an integrated analysis of its transcriptome and proteome as well as transgenic Chlamydomonas reinhardtii carrying a P. haitanensis gene. Results P. haitanensis rapidly adjusted its physiological activities to compensate for water losses up to 60%, after which photosynthesis, antioxidant systems, chaperones, and the cytoskeleton were activated in response to severe desiccation stress. The integrative analysis suggested that transketolase (TKL) was affected by all desiccation treatments. Transgenic C. reinhardtii cells overexpressing PhTKL grew better than the wild-type cells in response to osmotic stress. Conclusion P. haitanensis quickly establishes acclimatory homeostasis regarding its transcriptome and proteome to ensure its thalli can recover after being rehydrated. Additionally, PhTKL is vital for P. haitanensis desiccation tolerance. The present data may provide new insights for the breeding of algae and plants exhibiting enhanced desiccation tolerance.


2008 ◽  
Vol 26 (5) ◽  
pp. 531-539 ◽  
Author(s):  
Zoltán Kutalik ◽  
Jacques S Beckmann ◽  
Sven Bergmann

2021 ◽  
Vol 12 ◽  
Author(s):  
Inuk Jung ◽  
Minsu Kim ◽  
Sungmin Rhee ◽  
Sangsoo Lim ◽  
Sun Kim

Multi-omics data are frequently measured to enrich the comprehension of biological mechanisms underlying certain phenotypes. However, due to the complex relations and high dimension of multi-omics data, it is difficult to associate omics features with certain biological traits of interest. For example, the clinically valuable breast cancer subtypes are well-defined at the molecular level, but are poorly classified using gene expression data. Here, we propose a multi-omics analysis method called MONTI (Multi-Omics Non-negative Tensor decomposition for Integrative analysis), whose goal is to select multi-omics features that are able to represent trait specific characteristics. Here, we demonstrate the strength of multi-omics integrated analysis in terms of cancer subtyping. The multi-omics data are first integrated in a biologically meaningful manner to form a three dimensional tensor, which is then decomposed using a non-negative tensor decomposition method. From the result, MONTI selects highly informative subtype specific multi-omics features. MONTI was applied to three case studies of 597 breast cancer, 314 colon cancer, and 305 stomach cancer cohorts. For all the case studies, we found that the subtype classification accuracy significantly improved when utilizing all available multi-omics data. MONTI was able to detect subtype specific gene sets that were shown to be strongly regulated by certain omics, from which correlation between omics types could be inferred. Furthermore, various clinical attributes of nine cancer types were analyzed using MONTI, which showed that some clinical attributes could be well explained using multi-omics data. We demonstrated that integrating multi-omics data in a gene centric manner improves the detection of cancer subtype specific features and other clinical features, which may be used to further understand the molecular characteristics of interest. The software and data used in this study are available at: https://github.com/inukj/MONTI.
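The decomposition step described above can be sketched with a generic non-negative CP (PARAFAC) factorization of a three-way tensor, fitted by multiplicative updates. This is an illustrative stand-in, not the MONTI implementation: the function names, the genes × omics × samples axis order, the rank, and the random synthetic data are all assumptions, and the abstract does not say which decomposition algorithm MONTI uses.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R) -> (I*J x R)."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)


def unfold(tensor, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def reconstruct(factors):
    """Rebuild the three-way tensor from its CP factor matrices."""
    a, b, c = factors
    return np.einsum("ir,jr,kr->ijk", a, b, c)


def nonneg_cp(tensor, rank, n_iter=200, eps=1e-9, seed=0):
    """Non-negative CP decomposition via multiplicative updates.

    Each factor is updated in turn with the Lee-Seung rule applied to the
    matching unfolding, which keeps every entry non-negative.
    """
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in tensor.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            numer = unfold(tensor, mode) @ kr
            denom = factors[mode] @ (kr.T @ kr) + eps
            factors[mode] *= numer / denom
    return factors


# Synthetic non-negative tensor (assumed layout: genes x omics x samples).
rng = np.random.default_rng(1)
true_factors = [rng.random((dim, 2)) for dim in (6, 3, 5)]
data = reconstruct(true_factors)

gene_f, omics_f, sample_f = nonneg_cp(data, rank=2)
rel_error = (np.linalg.norm(data - reconstruct([gene_f, omics_f, sample_f]))
             / np.linalg.norm(data))
```

In a setting like the one described, the sample factor matrix could feed a subtype classifier and the gene factor matrix could be inspected for features loading on each component; how MONTI actually performs that selection is not specified in the abstract.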


Author(s):  
Yuhan Hao ◽  
Stephanie Hao ◽  
Erica Andersen-Nissen ◽  
William M. Mauck ◽  
Shiwei Zheng ◽  
...  

Abstract The simultaneous measurement of multiple modalities, known as multimodal analysis, represents an exciting frontier for single-cell genomics and necessitates new computational methods that can define cellular states based on multiple data types. Here, we introduce ‘weighted-nearest neighbor’ analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities. We apply our procedure to a CITE-seq dataset of hundreds of thousands of human white blood cells alongside a panel of 228 antibodies to construct a multimodal reference atlas of the circulating immune system. We demonstrate that integrative analysis substantially improves our ability to resolve cell states and validate the presence of previously unreported lymphoid subpopulations. Moreover, we demonstrate how to leverage this reference to rapidly map new datasets, and to interpret immune responses to vaccination and COVID-19. Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets, including paired measurements of RNA and chromatin state, and to look beyond the transcriptome towards a unified and multimodal definition of cellular identity. Availability Installation instructions, documentation, tutorials, and CITE-seq datasets are available at http://www.satijalab.org/seurat


2020 ◽  
Author(s):  
Qiao Liu ◽  
Zhiqiang Hu ◽  
Rui Jiang ◽  
Mu Zhou

Abstract Motivation Accurate prediction of cancer drug response (CDR) is challenging due to the uncertainty of drug efficacy and heterogeneity of cancer patients. Strong evidence has implicated the high dependence of CDR on tumor genomic and transcriptomic profiles of individual patients. Precise identification of CDR is crucial in both guiding anti-cancer drug design and understanding cancer biology. Results In this study, we present DeepCDR, which integrates multi-omics profiles of cancer cells and explores intrinsic chemical structures of drugs for predicting cancer drug response. Specifically, DeepCDR is a hybrid graph convolutional network consisting of a uniform graph convolutional network (UGCN) and multiple subnetworks. Unlike prior studies modeling hand-crafted features of drugs, DeepCDR automatically learns the latent representation of topological structures among atoms and bonds of drugs. Extensive experiments showed that DeepCDR outperformed state-of-the-art methods in both classification and regression settings under various data settings. We also evaluated the contribution of different types of omics profiles for assessing drug response. Furthermore, we provided an exploratory strategy for identifying potential cancer-associated genes concerning specific cancer types. Our results highlighted the predictive power of DeepCDR and its potential translational value in guiding disease-specific drug design. Availability DeepCDR is freely available at https://github.com/kimmo1019/[email protected]; [email protected] Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 4 (Supplement_1) ◽  
Author(s):  
Barbara Yang ◽  
Enrique Ivan Ramos ◽  
Ramesh Choudhari ◽  
Mina Zilaie ◽  
Laura A Sanchez-Michael ◽  
...  

Abstract Long noncoding RNAs (lncRNAs) have been demonstrated to be involved in diverse cellular processes as important regulators, such as in cancer. However, their roles in breast cancer biology remain largely unknown. In our study, integrated analysis of subcellular fractionation RNA-seq with gene expression profiles from human reproductive tissues yielded a comprehensive catalog of estrogen-regulated reproductive tissue-specific lncRNAs. We selected long intergenic noncoding RNA 16 (LINC16) for further study as it was the lncRNA most strongly upregulated by estrogen and is associated with clinical outcome. By analyzing RNA-seq data from different human tissues, we found that LINC16 is most highly expressed in testis, followed by other reproductive organs, cervix and uterus. Interestingly, interrogation of expression data from human cancer tissues showed that LINC16 is highly expressed in breast cancer compared to other cancers. We have determined the 5’ and 3’ ends of LINC16 and its exon/intron structure, and cloned LINC16 to study its function in molecular and cell-based assays. Our preliminary results suggest that LINC16 plays a critical role in ERα-dependent pathways.


2021 ◽  
Vol 12 ◽  
Author(s):  
Rency S. Varghese ◽  
Megan E. Barefoot ◽  
Sidharth Jain ◽  
Yifan Chen ◽  
Yunxi Zhang ◽  
...  

Pathologic alterations in epigenetic regulation have long been considered a hallmark of many cancers, including hepatocellular carcinoma (HCC). In a healthy individual, the relationship between DNA methylation and microRNA (miRNA) expression maintains a fine balance; however, disruptions in this harmony can aid in the genesis of cancer or the propagation of existing cancers. The balance between DNA methylation and microRNA expression and its potential disturbance in HCC can vary by race. There is emerging evidence linking epigenetic events including DNA methylation and miRNA expression to cancer disparities. In this paper, we evaluate the epigenetic mechanisms of racial heterogeneity in HCC through an integrated analysis of DNA methylation, miRNA, and combined regulation of gene expression. Specifically, we generated DNA methylation, mRNA-seq, and miRNA-seq data through the analysis of tumor and adjacent non-tumor liver tissues from African Americans (AA) and European Americans (EA) with HCC. Using mixed ANOVA, we identified cytosine-phosphate-guanine (CpG) sites, mRNAs, and miRNAs that are significantly altered in HCC vs. adjacent non-tumor tissue in a race-specific manner. We observed that the methylome was drastically changed in EA with a significantly larger number of differentially methylated and differentially expressed genes than in AA. On the other hand, the miRNA expression was altered to a larger extent in AA than in EA. Pathway analysis functionally linked epigenetic regulation in EA to processes involved in immune cell maturation, inflammation, and vascular remodeling. In contrast, cellular proliferation, metabolism, and growth pathways are found to predominate in AA as a result of this epigenetic analysis. Furthermore, through integrative analysis, we identified significantly differentially expressed genes in HCC with disparate epigenetic regulation, associated with changes in miRNA expression for AA and DNA methylation for EA.

