Clinical Drug Response Prediction by Using a Lq Penalized Network-Constrained Logistic Regression Method

Hai-Hui Huang; Jing-Guo Dai; Yong Liang

doi:10.1159/000495826

Cellular Physiology and Biochemistry ◽

10.1159/000495826 ◽

2018 ◽

Vol 51 (5) ◽

pp. 2073-2084 ◽

Cited By ~ 5

Author(s):

Hai-Hui Huang ◽

Jing-Guo Dai ◽

Yong Liang

Keyword(s):

Gene Expression ◽

Logistic Regression ◽

Personalized Medicine ◽

Gene Expression Data ◽

Drug Response ◽

Prediction Models ◽

Response Prediction ◽

Expression Data

Background/Aims: One of the most important impacts of personalized medicine is the connection between patients’ genotypes and their drug responses. Despite a series of studies exploring this relationship, the predictive ability of such analyses still needs to be strengthened. Methods: Here we present the Lq penalized network-constrained logistic regression (Lq-NLR) method to meet this need, in which the predictors are integrated into the gene expression data and biological network knowledge and are combined with a more aggressive penalty function. Response prediction models for two cancer targeting drugs (erlotinib and sorafenib) were developed from gene expression data and IC50 values from a large panel of cancer cell lines by utilizing the proposed approach. Then the drug responders were tested with the baseline tumor gene expression data, yielding an in vivo drug sensitivity prediction. Results: These results demonstrated the high effectiveness of this approach. One of the best results achieved by our method was a correlation of 0.841 between the cell line in vitro drug response and patient’s in vivo drug response. We then applied these two drug prediction models to develop a personalized medicine approach in which the subsequent treatment depends on each patient’s gene-expression profile. Conclusion: The proposed method is much better than the existing approach and can capture a more accurate reflection of the relationship between genotypes and phenotypes.
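The abstract does not state the Lq-NLR objective, so the following is only a minimal sketch of how such an objective is commonly assembled: a logistic loss, an Lq sparsity term, and a network smoothness term over gene-gene edges. The function name, the quadratic edge penalty, the mixing weight `alpha`, and the default q = 0.5 are assumptions made for illustration and are not taken from the paper.

```python
import math

def lq_network_logistic_loss(beta, X, y, edges, lam=0.1, alpha=0.5, q=0.5):
    """Illustrative objective: logistic loss + Lq sparsity + network smoothness.

    beta  : list of coefficients, one per gene
    X, y  : samples (list of feature lists) and binary labels (0 or 1)
    edges : list of (i, j) gene index pairs from a biological network
    All penalty choices here are assumptions, not the published formulation.
    """
    nll = 0.0
    for xi, yi in zip(X, y):
        z = sum(b * v for b, v in zip(beta, xi))
        # log(1 + exp(z)) - y*z, written in a numerically stable form
        nll += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z
    nll /= len(X)
    # Lq term with 0 < q < 1 penalizes small coefficients more sharply than L1
    sparsity = sum(abs(b) ** q for b in beta)
    # Smoothness term: coefficients of linked genes are encouraged to agree
    smooth = sum((beta[i] - beta[j]) ** 2 for i, j in edges)
    return nll + lam * (alpha * sparsity + (1 - alpha) * smooth)
```

In this sketch, two linked genes with equal coefficients incur no network penalty, while opposite-signed coefficients on linked genes are penalized, which is one way network knowledge can shape which predictors are kept.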

Graph Convolutional Network for Drug Response Prediction Using Gene Expression Data

Mathematics ◽

10.3390/math9070772 ◽

2021 ◽

Vol 9 (7) ◽

pp. 772

Author(s):

Seonghun Kim ◽

Seockhun Bae ◽

Yinhua Piao ◽

Kyuri Jo

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Large Scale ◽

Drug Response ◽

Response Prediction ◽

Biological Data ◽

Expression Data ◽

Convolutional Network ◽

Essential Information ◽

Protein Protein Interaction

Genomic profiles of cancer patients such as gene expression have become a major source to predict responses to drugs in the era of personalized medicine. As large-scale drug screening data with cancer cell lines are available, a number of computational methods have been developed for drug response prediction. However, few methods incorporate both gene expression data and the biological network, which can harbor essential information about the underlying process of the drug response. We proposed an analysis framework called DrugGCN for prediction of Drug response using a Graph Convolutional Network (GCN). DrugGCN first generates a gene graph by combining a Protein-Protein Interaction (PPI) network and gene expression data with feature selection of drug-related genes, and the GCN model detects the local features such as subnetworks of genes that contribute to the drug response by localized filtering. We demonstrated the effectiveness of DrugGCN using biological data showing its high prediction accuracy among the competing methods.
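To make the idea of filtering expression values over a gene graph concrete, here is a minimal sketch of a single first-order graph convolution step on a PPI-style graph. It is a simplified stand-in, not the DrugGCN architecture: the normalization with self-loops, the single scalar feature per gene, and the function names are all assumptions for illustration.

```python
import math

def normalized_adjacency(n_genes, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} as a dense list of lists."""
    # Start from the identity so every gene keeps its own signal (self-loops)
    a = [[1.0 if i == j else 0.0 for j in range(n_genes)] for i in range(n_genes)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n_genes)]
            for i in range(n_genes)]

def graph_conv(expression, a_hat, weight, bias=0.0):
    """One propagation step on a scalar feature per gene, followed by ReLU."""
    out = []
    for row in a_hat:
        # Each gene aggregates the expression of itself and its graph neighbors
        agg = sum(w * x for w, x in zip(row, expression))
        out.append(max(0.0, weight * agg + bias))
    return out
```

For example, with three genes and one edge between genes 0 and 1, `graph_conv([2.0, 4.0, 5.0], normalized_adjacency(3, [(0, 1)]), 1.0)` averages the two linked genes and leaves the isolated gene unchanged.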

MOLI: multi-omics late integration with deep neural networks for drug response prediction

Bioinformatics ◽

10.1093/bioinformatics/btz318 ◽

2019 ◽

Vol 35 (14) ◽

pp. i501-i509 ◽

Cited By ~ 28

Author(s):

Hossein Sharifi-Noghabi ◽

Olga Zolotareva ◽

Colin C Collins ◽

Martin Ester

Keyword(s):

Gene Expression ◽

Neural Networks ◽

Prediction Accuracy ◽

Drug Response ◽

Deep Neural Networks ◽

Response Prediction ◽

Supplementary Information ◽

Precision Oncology

Motivation: Historically, gene expression has been shown to be the most informative data for drug response prediction. Recent evidence suggests that integrating additional omics can improve the prediction accuracy which raises the question of how to integrate the additional omics. Regardless of the integration strategy, clinical utility and translatability are crucial. Thus, we reasoned a multi-omics approach combined with clinical datasets would improve drug response prediction and clinical relevance. Results: We propose MOLI, a multi-omics late integration method based on deep neural networks. MOLI takes somatic mutation, copy number aberration and gene expression data as input, and integrates them for drug response prediction. MOLI uses type-specific encoding sub-networks to learn features for each omics type, concatenates them into one representation and optimizes this representation via a combined cost function consisting of a triplet loss and a binary cross-entropy loss. The former makes the representations of responder samples more similar to each other and different from the non-responders, and the latter makes this representation predictive of the response values. We validate MOLI on in vitro and in vivo datasets for five chemotherapy agents and two targeted therapeutics. Compared to state-of-the-art single-omics and early integration multi-omics methods, MOLI achieves higher prediction accuracy in external validations. Moreover, a significant improvement in MOLI’s performance is observed for targeted drugs when training on a pan-drug input, i.e. using all the drugs with the same target compared to training only on drug-specific inputs. MOLI’s high predictive power suggests it may have utility in precision oncology. Availability and implementation: https://github.com/hosseinshn/MOLI. Supplementary information: Supplementary data are available at Bioinformatics online.
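The combined cost described above can be sketched in a few lines. This is an illustrative toy and not the code from the MOLI repository: the squared Euclidean distance, the margin of 1.0, the weighting factor `gamma`, and all function names are assumptions.

```python
import math

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Zero once the anchor is closer to the positive than to the negative by `margin`
    return max(0.0, squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin)

def binary_cross_entropy(label, prob, eps=1e-12):
    prob = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def late_integration(mutation_code, cna_code, expression_code):
    # Each omics type is encoded separately, then the codes are concatenated
    return list(mutation_code) + list(cna_code) + list(expression_code)

def combined_cost(triplets, labels, probs, gamma=0.5):
    # Classification term plus a weighted ranking term (gamma is an assumed weight)
    classification = sum(binary_cross_entropy(y, p)
                         for y, p in zip(labels, probs)) / len(labels)
    ranking = sum(triplet_loss(a, p, n) for a, p, n in triplets) / len(triplets)
    return classification + gamma * ranking
```

Here each triplet would hold the concatenated representations of a responder (anchor), another responder (positive), and a non-responder (negative), so the ranking term pulls responders together while the cross-entropy term keeps the representation predictive.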

Discovery and Validation of Novel Methylation Markers in Helicobacter pylori-Associated Gastric Cancer

Disease Markers ◽

10.1155/2021/4391133 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Huan Wang ◽

Nian-Shuang Li ◽

Cong He ◽

Chuan Xie ◽

Yin Zhu ◽

...

Keyword(s):

Gene Expression ◽

Gastric Cancer ◽

Dna Methylation ◽

Helicobacter Pylori ◽

Gene Expression Data ◽

Functional Enrichment ◽

Expression Data ◽

H Pylori

Previous studies have shown that abnormal methylation is an early key event in the pathogenesis of most human cancers, contributing to the development of tumors. However, little attention has been given to the potential of DNA methylation patterns as markers for Helicobacter pylori- (H. pylori-) associated gastric cancer (GC). In this study, an integrated analysis of DNA methylation and gene expression was conducted to identify some potential key epigenetic markers in H. pylori-associated GC. DNA methylation data of 28 H. pylori-positive and 168 H. pylori-negative GC samples were compared and analyzed. We also analyzed the gene expression data of 18 H. pylori-positive and 145 H. pylori-negative GC cases. Finally, the results were verified by in vitro and in vivo experiments. A total of 5609 differentially methylated regions associated with 2454 differentially methylated genes were identified. A total of 228 differentially expressed genes were identified from the gene expression data of H. pylori-positive and H. pylori-negative GC cases. The screened genes were analyzed for functional enrichment. Subsequently, we obtained 28 genes regulated by methylation through a Venn diagram, and we identified five genes (GSTO2, HUS1, INTS1, TMEM184A, and TMEM190) downregulated by hypermethylation. HUS1, GSTO2, and TMEM190 were expressed at lower levels in GC than in adjacent samples (P < 0.05). Moreover, H. pylori infection decreased HUS1, GSTO2, and TMEM190 expression in vitro and in vivo. Our study identified HUS1, GSTO2, and TMEM190 as novel methylation markers for H. pylori-associated GC.

Out-of-distribution generalization from labelled and unlabelled gene expression data for drug response prediction

Nature Machine Intelligence ◽

10.1038/s42256-021-00408-w ◽

2021 ◽

Author(s):

Hossein Sharifi-Noghabi ◽

Parsa Alamzadeh Harjandi ◽

Olga Zolotareva ◽

Colin C. Collins ◽

Martin Ester

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Drug Response ◽

Response Prediction ◽

Expression Data

MOLI: Multi-Omics Late Integration with deep neural networks for drug response prediction

10.1101/531327 ◽

2019 ◽

Author(s):

Hossein Sharifi-Noghabi ◽

Olga Zolotareva ◽

Colin C. Collins ◽

Martin Ester

Keyword(s):

Gene Expression ◽

Neural Networks ◽

Prediction Accuracy ◽

Drug Response ◽

Deep Neural Networks ◽

Response Prediction ◽

Precision Oncology ◽

Early Integration

Motivation: Historically, gene expression has been shown to be the most informative data for drug response prediction. Recent evidence suggests that integrating additional omics can improve the prediction accuracy which raises the question of how to integrate the additional omics. Regardless of the integration strategy, clinical utility and translatability are crucial. Thus, we reasoned a multi-omics approach combined with clinical datasets would improve drug response prediction and clinical relevance. Results: We propose MOLI, a Multi-Omics Late Integration method based on deep neural networks. MOLI takes somatic mutation, copy number aberration, and gene expression data as input, and integrates them for drug response prediction. MOLI uses type-specific encoding subnetworks to learn features for each omics type, concatenates them into one representation and optimizes this representation via a combined cost function consisting of a triplet loss and a binary cross-entropy loss. The former makes the representations of responder samples more similar to each other and different from the non-responders, and the latter makes this representation predictive of the response values. We validate MOLI on in vitro and in vivo datasets for five chemotherapy agents and two targeted therapeutics. Compared to state-of-the-art single-omics and early integration multi-omics methods, MOLI achieves higher prediction accuracy in external validations. Moreover, a significant improvement in MOLI’s performance is observed for targeted drugs when training on a pan-drug input, i.e. using all the drugs with the same target compared to training only on drug-specific inputs. MOLI’s high predictive power suggests it may have utility in precision oncology. Availability of the implemented codes: https://github.com/hosseinshn/MOLI

In vitro versus in vivo models of kidney fibrosis: Time-course experimental design is crucial to avoid misinterpretations of gene expression data

Journal of Research in Medical Sciences ◽

10.4103/jrms.jrms_906_19 ◽

2020 ◽

Vol 25 (1) ◽

pp. 84

Author(s):

Yousof Gheisari ◽

Shiva Moein ◽

Kobra Moradzadeh ◽

Shaghayegh Haghjooy Javanmard ◽

Seyed Mahdi Nasiri

Keyword(s):

Gene Expression ◽

Experimental Design ◽

Gene Expression Data ◽

Time Course ◽

Expression Data ◽

In Vivo Models ◽

Kidney Fibrosis

Velodrome: Out-of-Distribution Generalization from Labeled and Unlabeled Gene Expression Data for Drug Response Prediction

10.1101/2021.05.25.445658 ◽

2021 ◽

Author(s):

Hossein Sharifi-Noghabi ◽

Parsa Alamzadeh Harjandi ◽

Olga Zolotareva ◽

Colin C Collins ◽

Martin Ester

Keyword(s):

Gene Expression ◽

Transfer Learning ◽

Cell Lines ◽

Gene Expression Data ◽

Drug Response ◽

Response Prediction ◽

Fine Tuning ◽

Expression Data ◽

Target Domain ◽

Data Discrepancy

Data discrepancy between preclinical and clinical datasets poses a major challenge for accurate drug response prediction based on gene expression data. Different methods of transfer learning have been proposed to address this data discrepancy. These methods generally use cell lines as source domains and patients, patient-derived xenografts, or other cell lines as target domains. However, they assume that they have access to the target domain during training or fine-tuning and they can only take labeled source domains as input. The former is a strong assumption that is not satisfied during deployment of these models in the clinic. The latter means these methods rely on labeled source domains which are of limited size. To avoid this assumption, we formulate drug response prediction as an out-of-distribution generalization problem which does not assume that the target domain is accessible during training. Moreover, to exploit unlabeled source domain data, which tends to be much more plentiful than labeled data, we adopt a semi-supervised approach. We propose Velodrome, a semi-supervised method of out-of-distribution generalization that takes labeled and unlabeled data from different resources as input and makes generalizable predictions. Velodrome achieves this goal by introducing an objective function that combines a supervised loss for accurate prediction, an alignment loss for generalization, and a consistency loss to incorporate unlabeled samples. Our experimental results demonstrate that Velodrome outperforms state-of-the-art pharmacogenomics and transfer learning baselines on cell lines, patient-derived xenografts, and patients and therefore, may guide precision oncology more accurately.
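The three-part objective described above can be illustrated with a small sketch. It is not the Velodrome implementation: the mean-matching alignment term is a simple stand-in for whatever alignment loss the method uses, and the weights `lam_align` and `lam_cons` and all function names are assumptions.

```python
def mean_squared_error(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def mean_alignment(feats_a, feats_b):
    """Squared distance between the per-dimension feature means of two source domains."""
    dims = len(feats_a[0])
    mean_a = [sum(row[d] for row in feats_a) / len(feats_a) for d in range(dims)]
    mean_b = [sum(row[d] for row in feats_b) / len(feats_b) for d in range(dims)]
    return sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))

def consistency(pred_a, pred_b):
    # Disagreement between two predictors on the same unlabeled samples
    return mean_squared_error(pred_a, pred_b)

def combined_objective(supervised, alignment, consistency_term,
                       lam_align=0.5, lam_cons=0.25):
    # Supervised loss for accuracy, alignment for generalization,
    # consistency to make use of unlabeled samples
    return supervised + lam_align * alignment + lam_cons * consistency_term
```

Note that no target-domain data appears anywhere in the objective; only labeled and unlabeled source domains contribute, which is the point of framing the task as out-of-distribution generalization.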

Assessing Chemical-Induced Liver Injury In Vivo From In Vitro Gene Expression Data in the Rat: The Case of Thioacetamide Toxicity

Frontiers in Genetics ◽

10.3389/fgene.2019.01233 ◽

2019 ◽

Vol 10 ◽

Cited By ~ 3

Author(s):

Patric Schyman ◽

Richard L. Printz ◽

Shanea K. Estes ◽

Tracy P. O’Brien ◽

Masakazu Shiota ◽

...

Keyword(s):

Gene Expression ◽

Liver Injury ◽

Gene Expression Data ◽

Expression Data

In vitro gene expression data supporting a DNA non-reactive genotoxic mechanism for ochratoxin A

Toxicology and Applied Pharmacology ◽

10.1016/j.taap.2007.01.008 ◽

2007 ◽

Vol 220 (2) ◽

pp. 216-224 ◽

Cited By ~ 43

Author(s):

Leire Arbillaga ◽

Amaia Azqueta ◽

Joost H.M. van Delft ◽

Adela López de Cerain

Keyword(s):

Gene Expression ◽

Ochratoxin A ◽

Gene Expression Data ◽

Expression Data

Enhancing the Lasso Approach for Developing a Survival Prediction Model Based on Gene Expression Data

Computational and Mathematical Methods in Medicine ◽

10.1155/2015/259474 ◽

2015 ◽

Vol 2015 ◽

pp. 1-7 ◽

Cited By ~ 2

Author(s):

Shuhei Kaneko ◽

Akihiro Hirakawa ◽

Chikuma Hamada

Keyword(s):

Gene Expression ◽

Prediction Model ◽

Gene Expression Data ◽

Prediction Models ◽

Mixture Distribution ◽

Tuning Parameter ◽

Survival Prediction ◽

Expression Data ◽

True Positive ◽

False Negatives

In the past decade, researchers in oncology have sought to develop survival prediction models using gene expression data. The least absolute shrinkage and selection operator (lasso) has been widely used to select genes that are truly correlated with a patient’s survival. The lasso selects genes for prediction by shrinking a large number of coefficients of the candidate genes towards zero based on a tuning parameter that is often determined by a cross-validation (CV). However, this method can pass over (or fail to identify) true positive genes (i.e., it identifies false negatives) in certain instances, because the lasso tends to favor the development of a simple prediction model. Here, we attempt to monitor the identification of false negatives by developing a method for estimating the number of true positive (TP) genes for a series of values of a tuning parameter that assumes a mixture distribution for the lasso estimates. Using our developed method, we performed a simulation study to examine its precision in estimating the number of TP genes. Additionally, we applied our method to a real gene expression dataset and found that it was able to identify genes correlated with survival that a CV method was unable to detect.
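The way a larger tuning parameter causes the lasso to drop weakly associated genes can be shown with the soft-thresholding operator. The sketch below assumes a linear model with an orthonormal design, where the lasso solution is the soft-thresholded least-squares estimate; the paper itself concerns survival models, so this is a simplified illustration, and the coefficient values are made up.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_orthonormal(ols_coefs, lam):
    # Under an orthonormal design, each lasso coefficient is the
    # soft-thresholded ordinary least-squares coefficient
    return [soft_threshold(b, lam) for b in ols_coefs]

def n_selected(ols_coefs, lam):
    # Number of genes with a nonzero coefficient at this tuning parameter
    return sum(1 for b in lasso_orthonormal(ols_coefs, lam) if b != 0.0)

# Hypothetical per-gene effect sizes: one strong, one moderate, two weak
effects = [2.0, -0.6, 0.3, 0.05]
for lam in (0.1, 0.5, 1.0):
    print(lam, n_selected(effects, lam))
```

A gene with a small but real effect (0.3 in this toy example) is already set to zero at a moderate tuning parameter, which is the kind of false negative the abstract describes.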
