Improving prediction of compound function from chemical structure using chemical-genetic networks

Mapping Intimacies ◽

10.1101/112698 ◽

2017 ◽

Cited By ~ 2

Author(s):

Hamid Safizadeh ◽

Scott W. Simpkins ◽

Justin Nelson ◽

Chad L. Myers

Keyword(s):

Machine Learning ◽

Biological Activity ◽

Chemical Structure ◽

Genetic Interaction ◽

Structural Similarity ◽

Superior Performance ◽

Interaction Data ◽

Similarity Coefficients ◽

Chemical Genetic ◽

Genetic Interaction Data

The drug discovery process can be significantly improved through understanding how the structure of chemical compounds relates to their function. A common paradigm that has been used to filter and prioritize compounds is ligand-based virtual screening, where large libraries of compounds are queried for high structural similarity to a target molecule, with the assumption that structural similarity is predictive of similar biological activity. Although the chemical informatics community has already proposed a wide range of structure descriptors and similarity coefficients, a major challenge has been the lack of systematic and unbiased benchmarks for biological activity that cover a broad range of targets and can definitively assess the performance of the alternative approaches.

We leveraged a large set of chemical-genetic interaction data from the yeast Saccharomyces cerevisiae that our labs have recently generated, covering more than 13,000 compounds from the RIKEN NPDepo and several NCI, NIH, and GlaxoSmithKline (GSK) compound collections. Supporting the idea that chemical-genetic interaction data provide an unbiased proxy for biological function, we found that many commonly used structural similarity measures were able to predict which compounds exhibited similar chemical-genetic interaction profiles, although these measures did differ significantly in performance. Using the chemical-genetic interaction profiles as a basis for our evaluation, we performed a systematic benchmarking of 10 different structure descriptors, each combined with 12 different similarity coefficients. We found that the All-Shortest Path (ASP) structure descriptor paired with the Braun-Blanquet similarity coefficient provided superior performance that was robust across several different compound collections.

We further describe a machine learning approach that improves the ability of the ASP metric to capture biological activity. We used the ASP fingerprints as input for several supervised machine learning models and the chemical-genetic interaction profiles as the standard for learning. We found that the predictive power of the ASP fingerprints (as well as several other descriptors) could be substantially improved by using support vector machines. For example, on held-out data, we measured a 5-fold improvement in the recall of biologically similar compounds at a precision of 50% based upon the ASP fingerprints. Our results generally suggest that using high-dimensional chemical-genetic data as a basis for refining chemical structure descriptors can be a powerful approach to improving prediction of biological function from structure.
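To make the notion of a similarity coefficient concrete, the following is a minimal sketch, not the authors' implementation. It assumes each fingerprint is represented as a set of "on" bit indices; the toy fingerprints and the names `fp_query`, `fp_library`, and `cmpd_A` to `cmpd_C` are hypothetical and are not real ASP fingerprints. The Braun-Blanquet coefficient divides the number of shared bits by the size of the larger fingerprint, whereas the more familiar Tanimoto coefficient divides by the size of the union.

```python
from typing import Set


def braun_blanquet(a: Set[int], b: Set[int]) -> float:
    """Shared on-bits divided by the size of the larger fingerprint."""
    larger = max(len(a), len(b))
    return len(a & b) / larger if larger else 0.0


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Shared on-bits divided by the size of the union (shown for comparison)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


# Hypothetical toy fingerprints: sets of on-bit indices (assumption, not real data).
fp_query = {1, 4, 7, 9}
fp_library = {
    "cmpd_A": {1, 4, 7, 9, 12},
    "cmpd_B": {1, 4},
    "cmpd_C": {2, 3, 5},
}

# Rank library compounds by similarity to the query, most similar first.
ranked = sorted(
    fp_library,
    key=lambda name: braun_blanquet(fp_query, fp_library[name]),
    reverse=True,
)

for name in ranked:
    bb = braun_blanquet(fp_query, fp_library[name])
    tc = tanimoto(fp_query, fp_library[name])
    print(f"{name}: Braun-Blanquet={bb:.2f}, Tanimoto={tc:.2f}")
```

In a ligand-based virtual screen of the kind described in the abstract, a ranking like this would be computed across the whole library, and the top-ranked compounds would be the candidates assumed to share biological activity with the query.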


Faculty Opinions recommendation of Integration of chemical-genetic and genetic interaction data links bioactive compounds to cellular target pathways.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1001725.201877 ◽

2004 ◽

Author(s):

Benoit Coulombe

Keyword(s):

Bioactive Compounds ◽

Genetic Interaction ◽

Interaction Data ◽

Chemical Genetic ◽

Data Links ◽

Cellular Target ◽

Genetic Interaction Data


MOSAIC: a chemical-genetic interaction data repository and web resource for exploring chemical modes of action

Bioinformatics ◽

10.1093/bioinformatics/btx732 ◽

2017 ◽

Vol 34 (7) ◽

pp. 1251-1252 ◽

Cited By ~ 8

Author(s):

Justin Nelson ◽

Scott W Simpkins ◽

Hamid Safizadeh ◽

Sheena C Li ◽

Jeff S Piotrowski ◽

...

Keyword(s):

Genetic Interaction ◽

Data Repository ◽

Interaction Data ◽

Chemical Genetic ◽

Web Resource ◽

Modes Of Action ◽

Genetic Interaction Data


Faculty Opinions recommendation of Integration of chemical-genetic and genetic interaction data links bioactive compounds to cellular target pathways.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1001725.201541 ◽

2004 ◽

Author(s):

Andrew Emili

Keyword(s):

Bioactive Compounds ◽

Genetic Interaction ◽

Interaction Data ◽

Chemical Genetic ◽

Data Links ◽

Cellular Target ◽

Genetic Interaction Data


Integration of chemical-genetic and genetic interaction data links bioactive compounds to cellular target pathways

Nature Biotechnology ◽

10.1038/nbt919 ◽

2003 ◽

Vol 22 (1) ◽

pp. 62-69 ◽

Cited By ~ 447

Author(s):

Ainslie B Parsons ◽

Renée L Brost ◽

Huiming Ding ◽

Zhijian Li ◽

Chaoying Zhang ◽

...

Keyword(s):

Bioactive Compounds ◽

Genetic Interaction ◽

Interaction Data ◽

Chemical Genetic ◽

Data Links ◽

Cellular Target ◽

Genetic Interaction Data


Machine learning in drug design: Use of artificial intelligence to explore the chemical structure–biological activity relationship

Wiley Interdisciplinary Reviews Computational Molecular Science ◽

10.1002/wcms.1568 ◽

2021 ◽

Author(s):

Maciej Staszak ◽

Katarzyna Staszak ◽

Karolina Wieszczycka ◽

Anna Bajek ◽

Krzysztof Roszkowski ◽

...

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Biological Activity ◽

Drug Design ◽

Chemical Structure ◽

Activity Relationship


Network Clustering Analysis Using Mixture Exponential-Family Random Graph Models and Its Application in Genetic Interaction Data

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2017.2743711 ◽

2019 ◽

Vol 16 (5) ◽

pp. 1743-1752

Author(s):

Yishu Wang ◽

Huaying Fang ◽

Dejie Yang ◽

Hongyu Zhao ◽

Minghua Deng

Keyword(s):

Random Graph ◽

Clustering Analysis ◽

Exponential Family ◽

Genetic Interaction ◽

Network Clustering ◽

Interaction Data ◽

Graph Models ◽

Random Graph Models ◽

Genetic Interaction Data


Maximal Extraction of Biological Information from Genetic Interaction Data

PLoS Computational Biology ◽

10.1371/journal.pcbi.1000347 ◽

2009 ◽

Vol 5 (4) ◽

pp. e1000347 ◽

Cited By ~ 26

Author(s):

Gregory W. Carter ◽

David J. Galas ◽

Timothy Galitski

Keyword(s):

Genetic Interaction ◽

Biological Information ◽

Interaction Data ◽

Genetic Interaction Data


Automated identification of pathways from quantitative genetic interaction data

Molecular Systems Biology ◽

10.1038/msb.2010.27 ◽

2010 ◽

Vol 6 (1) ◽

pp. 379 ◽

Cited By ~ 55

Author(s):

Alexis Battle ◽

Martin C Jonikas ◽

Peter Walter ◽

Jonathan S Weissman ◽

Daphne Koller

Keyword(s):

Genetic Interaction ◽

Interaction Data ◽

Automated Identification ◽

Quantitative Genetic ◽

Genetic Interaction Data


Efficient strategies for screening large-scale genetic interaction networks

10.1101/159632 ◽

2017 ◽

Cited By ~ 6

Author(s):

Raamesh Deshpande ◽

Justin Nelson ◽

Scott W. Simpkins ◽

Michael Costanzo ◽

Jeff S. Piotrowski ◽

...

Keyword(s):

Large Scale ◽

Optimal Algorithm ◽

Genetic Interaction ◽

Genetic Screens ◽

Chemical Genetic ◽

Genome Wide ◽

Powerful Approach ◽

Interaction Screening ◽

Genetic Interaction Data

Large-scale genetic interaction screening is a powerful approach for unbiased characterization of gene function and understanding systems-level cellular organization. While genome-wide screens are desirable because they provide the most comprehensive interaction profiles, they are resource- and time-intensive and sometimes infeasible, depending on the species and experimental platform. For these scenarios, optimal methods that screen more efficiently while still extracting the maximal amount of information from the resulting profiles are of interest.

To address this problem, we developed an optimal algorithm, called COMPRESS-GI, which selects a small but informative set of genes that captures most of the functional information contained within genome-wide genetic interaction profiles. The utility of this algorithm is demonstrated by applying it to define a diagnostic mutant set for large-scale chemical genetic screens, where more than 13,000 compound screens were achieved through the increased throughput enabled by the approach. COMPRESS-GI can be broadly applied for directing genetic interaction screens in other contexts, including in species with little or no prior genetic-interaction data.


Improving Measures of Chemical Structural Similarity Using Machine Learning on Chemical–Genetic Interactions

Journal of Chemical Information and Modeling ◽

10.1021/acs.jcim.0c00993 ◽

2021 ◽

Author(s):

Hamid Safizadeh ◽

Scott W. Simpkins ◽

Justin Nelson ◽

Sheena C. Li ◽

Jeff S. Piotrowski ◽

...

Keyword(s):

Machine Learning ◽

Structural Similarity ◽

Genetic Interactions ◽

Chemical Genetic
