FRED Pose Prediction and Virtual Screening Accuracy

Mark McGann

doi:10.1021/ci100436p

Comparison of Several Molecular Docking Programs: Pose Prediction and Virtual Screening Accuracy

Journal of Chemical Information and Modeling ◽

10.1021/ci900056c ◽

2009 ◽

Vol 49 (6) ◽

pp. 1455-1474 ◽

Cited By ~ 272

Author(s):

Jason B. Cross ◽

David C. Thompson ◽

Brajesh K. Rai ◽

J. Christian Baber ◽

Kristi Yi Fan ◽

...

Keyword(s):

Molecular Docking ◽

Virtual Screening ◽

Pose Prediction ◽

Screening Accuracy

A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability

Molecules ◽

10.3390/molecules24132414 ◽

2019 ◽

Vol 24 (13) ◽

pp. 2414

Author(s):

Weixing Dai ◽

Dianjing Guo

Keyword(s):

Machine Learning ◽

Virtual Screening ◽

Learning Algorithm ◽

Screening Method ◽

Chemical Characteristic ◽

High Dimensional ◽

Learning Approaches ◽

Generalization Ability ◽

Model Interpretation ◽

Screening Accuracy

Machine learning plays an important role in ligand-based virtual screening. However, conventional machine learning approaches tend to be inefficient for such problems, where the data are imbalanced and the features describing the chemical characteristics of ligands are high-dimensional. We here describe a machine learning algorithm, LBS (local beta screening), for ligand-based virtual screening. The unique characteristic of LBS is that it quantifies the generalization ability of screening directly through a refined loss function, and thus can assess the risk of over-fitting accurately and efficiently for imbalanced and high-dimensional data in ligand-based virtual screening without the help of resampling methods such as cross validation. The robustness of LBS was demonstrated by a simulation study and tests on real datasets, in which LBS outperformed conventional algorithms in terms of screening accuracy and model interpretation. LBS was then used to screen for potential activators of HIV-1 integrase multimerization in an independent compound library, and the virtual screening result was experimentally validated. Of the 25 compounds tested, six proved to be active. The most potent compound in experimental validation showed an EC50 value of 0.71 µM.

A cross docking pipeline for improving pose prediction and virtual screening performance

Journal of Computer-Aided Molecular Design ◽

10.1007/s10822-017-0048-z ◽

2017 ◽

Vol 32 (1) ◽

pp. 163-173 ◽

Cited By ~ 13

Author(s):

Ashutosh Kumar ◽

Kam Y. J. Zhang

Keyword(s):

Virtual Screening ◽

Pose Prediction ◽

Cross Docking ◽

Screening Performance ◽

Virtual Screening Performance

Profile-QSAR 2.0: Kinase Virtual Screening Accuracy Comparable to Four-Concentration IC50s for Realistically Novel Compounds

Journal of Chemical Information and Modeling ◽

10.1021/acs.jcim.7b00166 ◽

2017 ◽

Vol 57 (8) ◽

pp. 2077-2088 ◽

Cited By ~ 22

Author(s):

Eric J. Martin ◽

Valery R. Polyakov ◽

Li Tian ◽

Rolando C. Perez

Keyword(s):

Virtual Screening ◽

Screening Accuracy

ChemInform Abstract: Structure-Based Virtual Screening with Supervised Consensus Scoring: Evaluation of Pose Prediction and Enrichment Factors.

ChemInform ◽

10.1002/chin.200829205 ◽

2008 ◽

Vol 39 (29) ◽

Author(s):

Reiji Teramoto ◽

Hiroaki Fukunishi

Keyword(s):

Virtual Screening ◽

Enrichment Factors ◽

Pose Prediction ◽

Consensus Scoring ◽

Abstract Structure

Docking-based virtual screening of TβR1 inhibitors: evaluation of pose prediction and scoring functions

BMC Chemistry ◽

10.1186/s13065-020-00704-3 ◽

2020 ◽

Vol 14 (1) ◽

Author(s):

Shuai Wang ◽

Jun-Hao Jiang ◽

Ruo-Yu Li ◽

Ping Deng

Keyword(s):

Virtual Screening ◽

Pose Prediction ◽

Scoring Functions

Leveraging nonstructural data to predict structures and affinities of protein–ligand complexes

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2112621118 ◽

2021 ◽

Vol 118 (51) ◽

pp. e2112621118

Author(s):

Joseph M. Paggi ◽

Julia A. Belk ◽

Scott A. Hollingsworth ◽

Nicolas Villanueva ◽

Alexander S. Powers ◽

...

Keyword(s):

Virtual Screening ◽

Drug Targets ◽

3D Structure ◽

Three Dimensional ◽

Rational Drug Design ◽

Screening Methods ◽

Pose Prediction ◽

Learning Approaches ◽

Sources Of Information ◽

Drug Candidates

Over the past five decades, tremendous effort has been devoted to computational methods for predicting properties of ligands—i.e., molecules that bind macromolecular targets. Such methods, which are critical to rational drug design, fall into two categories: physics-based methods, which directly model ligand interactions with the target given the target’s three-dimensional (3D) structure, and ligand-based methods, which predict ligand properties given experimental measurements for similar ligands. Here, we present a rigorous statistical framework to combine these two sources of information. We develop a method to predict a ligand’s pose—the 3D structure of the ligand bound to its target—that leverages a widely available source of information: a list of other ligands that are known to bind the same target but for which no 3D structure is available. This combination of physics-based and ligand-based modeling improves pose prediction accuracy across all major families of drug targets. Using the same framework, we develop a method for virtual screening of drug candidates, which outperforms standard physics-based and ligand-based virtual screening methods. Our results suggest broad opportunities to improve prediction of various ligand properties by combining diverse sources of information through customized machine-learning approaches.

Handling Imbalance Classification Virtual Screening Big Data Using Machine Learning Algorithms

Complexity ◽

10.1155/2021/6675279 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Sahar K. Hussin ◽

Salah M. Abdelmageid ◽

Adel Alkhalil ◽

Yasser M. Omar ◽

Mahmoud I. Marie ◽

...

Keyword(s):

Machine Learning ◽

Virtual Screening ◽

Predictive Accuracy ◽

Specific Protein ◽

Machine Learning Algorithms ◽

Large Set ◽

High Dimensions ◽

Screening Process ◽

Screening Accuracy ◽

Critical Process

Virtual screening is the most critical process in drug discovery, and it relies on machine learning to facilitate the screening process. It enables the discovery of molecules that bind to a specific protein to form a drug. Despite its benefits, virtual screening generates enormous amounts of data and suffers from drawbacks such as high dimensionality and class imbalance. This paper tackles data imbalance and aims to improve virtual screening accuracy, especially for the minority class. When a dataset is modeled without considering its imbalanced nature, most classification methods tend to have high predictive accuracy for the majority category but markedly poorer accuracy for the minority category. The paper proposes a K-means algorithm coupled with the Synthetic Minority Oversampling Technique (SMOTE) to overcome the problem of imbalanced datasets. The proposed algorithm is named KSMOTE. Using KSMOTE, minority data can be identified with high accuracy and detected with high precision. A large set of experiments was run on Apache Spark using numeric PaDEL and fingerprint descriptors. The proposed solution was compared with both a no-sampling method and SMOTE on the same datasets. Experimental results showed that the proposed solution outperformed the other methods.
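
As a rough illustration of the oversampling idea mentioned in this abstract, the sketch below shows only the plain SMOTE interpolation step: a synthetic minority point is placed on the segment between a minority point and one of its nearest minority neighbours. It is not the KSMOTE algorithm from the paper (the K-means coupling is omitted), and the function name `smote_samples`, the parameters `k` and `seed`, and the toy data are all assumptions made for the example.

```python
import math
import random


def smote_samples(minority, n_new, k=2, seed=0):
    """Return n_new synthetic points built by SMOTE-style interpolation.

    Hypothetical helper: each new point lies between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the remaining minority points by distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # fraction of the way from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class: four points on the corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_samples(minority, n_new=6)
```

Because every synthetic point is a convex combination of two existing minority points, the new samples stay inside the region already spanned by the minority class, which is the main design constraint of this kind of oversampling.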

Consensus Docking in Drug Discovery

Current Bioactive Compounds ◽

10.2174/1573407214666181023114820 ◽

2020 ◽

Vol 16 (3) ◽

pp. 182-190 ◽

Cited By ~ 1

Author(s):

Giulio Poli ◽

Tiziano Tuccinardi

Keyword(s):

Virtual Screening ◽

Drug Design ◽

Binding Mode ◽

Point Of View ◽

Pose Prediction ◽

Scoring Functions ◽

Computer Aided Drug Design ◽

Successful Case ◽

Computer Aided ◽

High Level

Background: Molecular docking is probably the most popular and profitable approach in computer-aided drug design, being the staple technique for predicting the binding mode of bioactive compounds and for performing receptor-based virtual screening studies. The growing attention received by docking, as well as the need to improve its reliability in pose prediction and virtual screening performance, has led to the development of a plethora of new docking algorithms and scoring functions. Nevertheless, it is unlikely that a single procedure will outperform the others in terms of reliability and accuracy or prove generally suitable for all kinds of protein targets. Methods: In this context, consensus docking approaches are taking hold in computer-aided drug design. These computational protocols consist of docking ligands using multiple docking methods and then comparing the binding poses predicted for the same ligand by the different methods. This analysis is usually carried out by calculating the root-mean-square deviation among the different docking results obtained for each ligand, in order to identify the number of docking methods producing the same binding pose. Results: The consensus docking approaches were shown to improve the quality of docking and virtual screening results compared to the single docking methods. From a qualitative point of view, the improvement in pose prediction accuracy was obtained by prioritizing ligand binding poses produced by a high number of docking methods, whereas with regard to virtual screening studies, high hit rates were obtained by prioritizing the compounds showing a high level of pose consensus. Conclusion: In this review, we provide an overview of the results obtained from the performance assessment of various consensus docking protocols and we illustrate successful case studies where consensus docking has been applied in virtual screening studies.
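
The RMSD-based comparison described in the Methods part of this abstract can be sketched as follows. This is a minimal illustration, not a protocol from the review: the 2.0 Å cutoff, the function names `rmsd` and `consensus_level`, the method labels, and the toy coordinates are assumptions, and real poses would need consistent atom ordering (and symmetry handling) before comparison.

```python
import math


def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses with matched atom order."""
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(pose_a, pose_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(pose_a))


def consensus_level(poses, cutoff=2.0):
    """Largest number of methods whose poses lie within cutoff of one pose.

    Each pose is tried as the reference; the count includes the reference.
    """
    best = 0
    for ref in poses.values():
        agreeing = sum(
            1 for other in poses.values() if rmsd(ref, other) <= cutoff
        )
        best = max(best, agreeing)
    return best


# Hypothetical two-atom ligand docked by three hypothetical methods.
poses = {
    "method_a": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "method_b": [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0)],
    "method_c": [(8.0, 0.0, 0.0), (9.5, 0.0, 0.0)],
}
level = consensus_level(poses)
```

In this toy case two of the three methods agree on the pose, so the ligand has a consensus level of 2; a screening workflow along the lines described above would rank ligands with higher consensus levels first.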

Electrostatic-field and surface-shape similarity for virtual screening and pose prediction

Journal of Computer-Aided Molecular Design ◽

10.1007/s10822-019-00236-6 ◽

2019 ◽

Vol 33 (10) ◽

pp. 865-886 ◽

Cited By ~ 4

Author(s):

Ann E. Cleves ◽

Stephen R. Johnson ◽

Ajay N. Jain

Keyword(s):

Hydrogen Bonding ◽

Virtual Screening ◽

Success Rate ◽

Electrostatic Field ◽

Rational Design ◽

Surface Shape ◽

Molecular Surface ◽

Pose Prediction ◽

Screening Performance ◽

ROC Area

We introduce a new method for rapid computation of 3D molecular similarity that combines electrostatic field comparison with comparison of molecular surface-shape and directional hydrogen-bonding preferences (called “eSim”). Rather than employing heuristic “colors” or user-defined molecular feature types to represent conformation-dependent molecular electrostatics, eSim calculates the similarity of the electrostatic fields of two molecules (in addition to shape and hydrogen-bonding). We present detailed virtual screening performance data on the standard 102 target DUD-E set. In its moderately fast screening mode, eSim running on a single computing core is capable of processing over 60 molecules per second. In this mode, eSim performed significantly better than all alternate methods for which full DUD-E data were available (mean ROC area of 0.74, p < 10⁻⁹, by paired t-test, compared with the best performing alternate method). In addition, for 92 targets of the DUD-E set where multiple ligand-bound crystal structures were available, screening performance was assessed using alternate ligands or sets thereof (in their bound poses) as similarity targets. Using the joint alignment of five ligands for each protein target, mean ROC area exceeded 0.82 for the 92 targets. Design-focused application of ligand similarity methods depends on accurate predictions of geometric molecular relationships. We comprehensively assessed pose prediction accuracy by curating nearly 400,000 bound ligand pose pairs across the DUD-E targets. Overall, beginning from agnostic initial poses, we observed an 80% success rate for RMSD ≤ 2.0 Å among the top 20 predicted eSim poses. These examples were split roughly 50/50 into cases with high direct atomic overlap (where a shared scaffold exists between a pair) and low direct atomic overlap (where a ligand pair has dissimilar scaffolds but largely occupies the same space). Within the high direct atomic overlap subset, the pose prediction success rate was 93%. For the more challenging subset (where dissimilar scaffolds are to be aligned), the success rate was 70%. The eSim approach enables both large-scale screening and rational design of ligands and is rooted in physically meaningful, non-heuristic, molecular comparisons.
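
For readers unfamiliar with the ROC area metric quoted in this abstract, the sketch below computes it for one target as the fraction of active/decoy pairs in which the active receives the higher similarity score, with ties counted as half. The function name `roc_area` and the scores are invented for illustration; this is the standard rank-based definition, not code from eSim.

```python
def roc_area(active_scores, decoy_scores):
    """ROC area as the probability that an active outscores a decoy."""
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(active_scores) * len(decoy_scores))


# Hypothetical similarity scores for one target (higher = more similar).
actives = [0.9, 0.8, 0.4]
decoys = [0.7, 0.3, 0.2, 0.1]
area = roc_area(actives, decoys)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from decoys; a mean ROC area over a benchmark is the average of such per-target values.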
