DeepSide: A Deep Learning Framework for Drug Side Effect Prediction

2019 ◽  
Author(s):  
Onur Can Uner ◽  
Ramazan Gokberk Cinbis ◽  
Oznur Tastan ◽  
A. Ercument Cicek

Abstract Drug failures due to unforeseen adverse effects at clinical trials pose health risks for the participants and lead to substantial financial losses. Side effect prediction algorithms have the potential to guide the drug design process. The LINCS L1000 dataset provides a vast resource of cell line gene expression data perturbed by different drugs and creates a knowledge base for context-specific features. The state-of-the-art approach that aims at using context-specific information relies on only the high-quality experiments in LINCS L1000 and discards a large portion of the experiments. In this study, our goal is to boost the prediction performance by utilizing this data to its full extent. We experiment with 5 deep learning architectures. We find that a multi-modal architecture produces the best predictive performance among multi-layer perceptron-based architectures when drug chemical structure (CS) and the full set of drug-perturbed gene expression profiles (GEX) are used as modalities. Overall, we observe that the CS is more informative than the GEX. A convolutional neural network-based model that uses only the SMILES string representation of the drugs achieves the best results and provides 13.0% macro-AUC and 3.1% micro-AUC improvements over the state-of-the-art. We also show that the model is able to predict side effect-drug pairs that are reported in the literature but were missing in the ground truth side effect dataset. DeepSide is available at http://github.com/OnurUner/DeepSide.
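The abstract reports both macro-AUC and micro-AUC without defining them. The sketch below assumes the common multi-label convention: macro-AUC averages the ROC AUC computed separately for each side effect, while micro-AUC pools every drug-side effect pair into a single ROC AUC. The function names and the toy label and score matrices are hypothetical and are not taken from the DeepSide code.

```python
def auc(labels, scores):
    """ROC AUC as the probability that a random positive outscores a
    random negative, counting ties as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_auc(y_true, y_score):
    """Mean of the per-column (per side effect) AUCs."""
    n_cols = len(y_true[0])
    per_label = [
        auc([row[j] for row in y_true], [row[j] for row in y_score])
        for j in range(n_cols)
    ]
    return sum(per_label) / n_cols


def micro_auc(y_true, y_score):
    """Single AUC over all (drug, side effect) pairs pooled together."""
    flat_true = [v for row in y_true for v in row]
    flat_score = [v for row in y_score for v in row]
    return auc(flat_true, flat_score)


# Toy data (hypothetical): 4 drugs (rows) x 2 side effects (columns).
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.1, 0.8], [0.8, 0.3], [0.3, 0.1]]

print("macro-AUC:", macro_auc(y_true, y_score))
print("micro-AUC:", micro_auc(y_true, y_score))
```

On this toy data each side effect is ranked perfectly on its own, yet pooling the pairs exposes a tie between a positive of one side effect and a negative of the other, so the two averages differ. This is one reason a model can improve macro-AUC and micro-AUC by different amounts.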

2012 ◽  
Vol 07 (01n02) ◽  
pp. 41-70 ◽  
Author(s):  
JASON SHULMAN ◽  
LARS SEEMANN ◽  
GREGG W. ROMAN ◽  
GEMUNU H. GUNARATNE

Networks are used to abstract large, highly coupled sets of objects. Their analyses have included network classification into a few broad classes and selection of small substructures that perform simple yet common tasks. One issue that has received little attention is how the state of a network can be moved according to a pre-specified set of guidelines. In this paper, we address this question in the context of gene networks. In general, neither the full membership of the gene network associated with a biological process nor the precise form of interactions between nodes is known. What is available, through microarrays or sequencing, are gene expression profiles of an organism or its viable mutants. Our approach relies only on these expression profiles, and not on the availability of an accurate model for the network. The first step is to select a small set of core or master nodes, such as transcription factors or microRNAs, that can be used to alter the levels of many of the remaining genes in the network. We ask how the state (or solution) of the gene network changes as the levels of these master nodes are altered externally. The object of our study is not the network but the surface of these solutions. We argue that it can be approximated using gene expression profiles of the organism and single manipulation of master node activity. This is done through an "effective model." The effective model as well as error estimates for its predictions can be derived from experimental data. The method is validated using synthetic gene networks that have stationary solutions and those that are periodically driven, e.g., circadian networks. An effective model for the oxygen-deprivation network of E. coli is constructed using previously published gene expression profiles, and used to predict the expression levels in a double knockout mutant. Less than 30% of the predictions lie outside the 5% confidence level.
We propose the use of the effective model methodology to compute how Drosophila melanogaster in the normal state can be genetically altered into a pre-defined sleep-deprived-like state.
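The abstract does not give the form of the effective model. As a rough illustration only, the sketch below assumes the simplest possible case: a first-order, additive approximation of the solution surface, in which the shift caused by each single manipulation (relative to wild type) is summed to predict the double knockout. The function name and the numbers are hypothetical, and the published method may include higher-order terms and error estimates that are not modeled here.

```python
def predict_double_knockout(wild_type, single_a, single_b):
    """First-order (additive) prediction of a double-knockout profile.

    Each argument is a list of expression levels (for example, log
    expression) for the same genes in the same order. This additive
    form is an assumption made for illustration.
    """
    if not (len(wild_type) == len(single_a) == len(single_b)):
        raise ValueError("profiles must cover the same genes")
    return [
        wt + (a - wt) + (b - wt)
        for wt, a, b in zip(wild_type, single_a, single_b)
    ]


# Hypothetical profiles for three genes.
wild_type = [1.0, 2.0, 3.0]
knockout_a = [1.5, 2.0, 2.0]  # manipulation of master node A only
knockout_b = [1.0, 1.0, 3.5]  # manipulation of master node B only

print(predict_double_knockout(wild_type, knockout_a, knockout_b))
```

Comparing such predictions with a measured double-knockout profile is the kind of check the abstract describes for the E. coli oxygen-deprivation network.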


2006 ◽  
Vol 22 (14) ◽  
pp. 1737-1744 ◽  
Author(s):  
X. Liu ◽  
S. Sivaganesan ◽  
K. Y. Yeung ◽  
J. Guo ◽  
R. E. Bumgarner ◽  
...  

2019 ◽  
Vol 40 (5) ◽  
pp. 624-632
Author(s):  
Ji-Wei Chang ◽  
Yuduan Ding ◽  
Muhammad Tahir ul Qamar ◽  
Yin Shen ◽  
Junxiang Gao ◽  
...  

Abstract Prioritization of cancer-related genes from gene expression profiles and proteomic data is vital to improving targeted-therapy research. Although computational approaches have been complementing high-throughput biological experiments in the understanding of human diseases, it remains a big challenge to accurately discover cancer-related proteins/genes via automatic learning from large-scale protein/gene expression data and protein–protein interaction data. Most of the existing methods are based on network construction combined with gene expression profiles, which ignore the diversity between normal samples and disease cell lines. In this study, we introduced a deep learning model based on a sparse auto-encoder to learn the specific characteristics of protein interactions in cancer cell lines integrated with protein expression data. The model was able to identify cancer-related proteins/genes from different input protein expression profiles by extracting the characteristics of protein interaction information, and it could also predict cancer-related protein combinations. Compared with other reported methods, including differential expression and network-based methods, our model achieved the highest area under the curve value (>0.8) in predicting cancer-related genes. Our study prioritized ~500 high-confidence cancer-related genes; among these genes, 211 already known cancer drug targets were found, which supported the accuracy of our method. These results indicate that the proposed auto-encoder model could computationally prioritize candidate proteins/genes involved in cancer and improve targeted-therapy research.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Shengqiao Gao ◽  
Lu Han ◽  
Dan Luo ◽  
Gang Liu ◽  
Zhiyong Xiao ◽  
...  

Abstract Background Querying drug-induced gene expression profiles with machine learning methods is an effective way of revealing drug mechanisms of action (MOAs), and it is strongly supported by the growth of large-scale, high-throughput gene expression databases. However, due to the lack of code-free and user-friendly applications, it is not easy for biologists and pharmacologists to model MOAs with state-of-the-art deep learning approaches. Results In this work, a newly developed online collaborative tool, Genetic profile-activity relationship (GPAR), was built to help model and predict MOAs easily via deep learning. Users can use GPAR to customize their training sets, train self-defined MOA prediction models, evaluate model performance, and make further predictions automatically. Cross-validation tests show that GPAR outperforms Gene set enrichment analysis in predicting MOAs. Conclusion GPAR can serve as a better approach to MOA prediction, which may help researchers generate more reliable MOA hypotheses.


2020 ◽  
Author(s):  
Tim Becker ◽  
Kevin Yang ◽  
Juan C Caicedo ◽  
Bridget K Wagner ◽  
Vlado C Dancik ◽  
...  

Recent advances in deep learning enable using chemical structures and phenotypic profiles to accurately predict assay results for compounds virtually, reducing the time and cost of screens in the drug discovery process. The relative strength of high-throughput data sources - chemical structures, images (Cell Painting), and gene expression profiles (L1000) - has been unknown. Here we compare their ability to predict the activity of compounds structurally different from those used in training, using a sparse dataset of 16,979 chemicals tested in 376 assays for a total of 542,648 readouts. Deep learning-based feature extraction from chemical structures provided a remarkable ability to predict assay activity for structures dissimilar to those used for training. Image-based profiling performed even better, but requires wet lab experimentation. It outperformed gene expression profiling, and at lower cost. Furthermore, the three profiling modalities are complementary, and together can predict a wide range of diverse bioactivity, including cell-based and biochemical assays. Our study shows that, for many assays, predicting compound activity from phenotypic profiles and chemical structures is an accurate and efficient way to identify potential treatments in the early stages of the drug discovery process.


2021 ◽  
Vol 12 ◽  
Author(s):  
Xuefei Yuan ◽  
Ian C. Scott ◽  
Michael D. Wilson

Bound by lineage-determining transcription factors and signaling effectors, enhancers play essential roles in controlling spatiotemporal gene expression profiles during development, homeostasis and disease. Recent synergistic advances in functional genomic technologies, combined with the developmental biology toolbox, have resulted in unprecedented genome-wide annotation of heart enhancers and their target genes. Starting with early studies of vertebrate heart enhancers and ending with state-of-the-art genome-wide enhancer discovery and testing, we will review how studying heart enhancers in metazoan species has helped inform our understanding of cardiac development and disease.


2021 ◽  
Author(s):  
Christos Fotis ◽  
George Alevizos ◽  
Nikolaos Meimetis ◽  
Christina Koleri ◽  
Thomas Gkekas ◽  
...  

The analysis and comparison of compounds' transcriptomic signatures can help elucidate a compound's Mechanism of Action (MoA) in a biological system. In order to take into account the complexity of the biological system, several computational methods have been developed that utilize prior knowledge of molecular interactions to create a signaling network representation that best explains the compound's effect. However, due to their complex structure, large-scale datasets of compound-induced signaling networks and methods specifically tailored to their analysis and comparison are very limited. Our goal is to develop graph deep learning models that are optimized to transform compound-induced signaling networks into high-dimensional representations and investigate their relationship with their respective MoAs. We created a new dataset of compound-induced signaling networks by applying the CARNIVAL network creation pipeline to the gene expression profiles of the CMap dataset. Furthermore, we developed a novel unsupervised graph deep learning pipeline, called deepSNEM, to encode the information in the compound-induced signaling networks in fixed-length high-dimensional representations. The core of deepSNEM is a graph transformer network, trained to maximize the mutual information between whole-graph and sub-graph representations that belong to similar perturbations. By clustering the deepSNEM embeddings using the k-means algorithm, we were able to identify distinct clusters that are significantly enriched for mTOR, topoisomerase, HDAC and protein synthesis inhibitors, respectively. Additionally, we developed a subgraph importance pipeline and identified important nodes and subgraphs that were found to be directly related to the most prevalent MoA of the assigned cluster.
As a use case, deepSNEM was applied to compounds' gene expression profiles from various experimental platforms (MicroArrays and RNA sequencing), and the results indicate that correct hypotheses can be generated regarding their MoA.
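The clustering step mentioned above can be sketched with a plain k-means (Lloyd's algorithm) over fixed-length embedding vectors. This is a minimal illustration, not the deepSNEM pipeline: the two-dimensional toy embeddings, the function names, and the deterministic choice of the first k points as initial centroids are all assumptions made so that the example is reproducible.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, n_iter=100):
    """Lloyd's k-means. Returns (assignments, centroids).

    Initial centroids are the first k points (an assumption made for
    determinism; practical pipelines usually use randomized seeding).
    """
    centroids = [list(p) for p in points[:k]]
    assignments = [None] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return assignments, centroids


# Hypothetical 2-D embeddings forming two well-separated groups.
embeddings = [
    (0.0, 0.1), (5.0, 5.1), (0.2, 0.0),
    (5.2, 4.9), (0.1, 0.2), (4.9, 5.0),
]
labels, centers = kmeans(embeddings, k=2)
print(labels)
```

In a setting like the one described, each resulting cluster would then be tested for enrichment of known MoA annotations; that step is not shown here.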


2022 ◽  
pp. 1-16
Author(s):  
Eddie Guo ◽  
Pouria Torabi ◽  
Daiva E. Nielsen ◽  
Matthew Pietrosanu

The emergence of precision oncology approaches has begun to inform clinical decision-making in diagnostic, prognostic, and treatment contexts. High-throughput technology has enabled machine learning algorithms to use the molecular characteristics of tumors to generate personalized therapies. However, precision oncology studies have yet to develop a predictive biomarker incorporating pan-cancer gene expression profiles to stratify tumors into similar drug sensitivity profiles. Here we show that a neural network with ten hidden layers accurately classifies pan-cancer cell lines into two distinct chemotherapeutic response groups based on a pan-drug dataset with 89.0% accuracy (AUC = 0.904). Using unsupervised clustering algorithms, we found that a cohort of cell line gene expression data from the Genomics of Drug Sensitivity in Cancer could be clustered into two response groups with significant differences in pan-drug chemotherapeutic sensitivity. After applying the Boruta feature selection algorithm to this dataset, a deep learning model was developed to predict chemotherapeutic response groups. The model’s high classification efficacy validates our hypothesis that cell lines with similar gene expression profiles present similar pan-drug chemotherapeutic sensitivity. This finding provides evidence for the potential use of similar combinatorial biomarkers to select potent candidate drugs that maximize therapeutic response and minimize the cytotoxic burden. Future investigations should aim to recursively subcluster cell lines within the response clusters defined in this study to provide a higher resolution of potential patient response to chemotherapeutics.

