FRED Pose Prediction and Virtual Screening Accuracy

2011 ◽ Vol 51 (3) ◽ pp. 578-596 ◽ Author(s): Mark McGann

2009 ◽ Vol 49 (6) ◽ pp. 1455-1474 ◽ Author(s): Jason B. Cross ◽ David C. Thompson ◽ Brajesh K. Rai ◽ J. Christian Baber ◽ Kristi Yi Fan ◽ ...

Molecules ◽ 2019 ◽ Vol 24 (13) ◽ pp. 2414 ◽ Author(s): Weixing Dai ◽ Dianjing Guo

Machine learning plays an important role in ligand-based virtual screening. However, conventional machine learning approaches tend to be inefficient on such problems, where the data are imbalanced and the features describing the chemical characteristics of ligands are high-dimensional. Here we describe LBS (local beta screening), a machine learning algorithm for ligand-based virtual screening. The distinguishing characteristic of LBS is that it quantifies the generalization ability of screening directly through a refined loss function, and can therefore assess the risk of over-fitting accurately and efficiently for imbalanced, high-dimensional data without resorting to resampling methods such as cross-validation. The robustness of LBS was demonstrated by a simulation study and by tests on real datasets, in which LBS outperformed conventional algorithms in screening accuracy and model interpretation. LBS was then used to screen an independent compound library for potential activators of HIV-1 integrase multimerization, and the virtual screening result was validated experimentally. Of the 25 compounds tested, six proved to be active. The most potent compound showed an EC50 value of 0.71 µM in experimental validation.


2017 ◽ Vol 57 (8) ◽ pp. 2077-2088 ◽ Author(s): Eric J. Martin ◽ Valery R. Polyakov ◽ Li Tian ◽ Rolando C. Perez

BMC Chemistry ◽ 2020 ◽ Vol 14 (1) ◽ Author(s): Shuai Wang ◽ Jun-Hao Jiang ◽ Ruo-Yu Li ◽ Ping Deng

2021 ◽ Vol 118 (51) ◽ pp. e2112621118 ◽ Author(s): Joseph M. Paggi ◽ Julia A. Belk ◽ Scott A. Hollingsworth ◽ Nicolas Villanueva ◽ Alexander S. Powers ◽ ...

Over the past five decades, tremendous effort has been devoted to computational methods for predicting properties of ligands—i.e., molecules that bind macromolecular targets. Such methods, which are critical to rational drug design, fall into two categories: physics-based methods, which directly model ligand interactions with the target given the target’s three-dimensional (3D) structure, and ligand-based methods, which predict ligand properties given experimental measurements for similar ligands. Here, we present a rigorous statistical framework to combine these two sources of information. We develop a method to predict a ligand’s pose—the 3D structure of the ligand bound to its target—that leverages a widely available source of information: a list of other ligands that are known to bind the same target but for which no 3D structure is available. This combination of physics-based and ligand-based modeling improves pose prediction accuracy across all major families of drug targets. Using the same framework, we develop a method for virtual screening of drug candidates, which outperforms standard physics-based and ligand-based virtual screening methods. Our results suggest broad opportunities to improve prediction of various ligand properties by combining diverse sources of information through customized machine-learning approaches.


Complexity ◽ 2021 ◽ Vol 2021 ◽ pp. 1-15 ◽ Author(s): Sahar K. Hussin ◽ Salah M. Abdelmageid ◽ Adel Alkhalil ◽ Yasser M. Omar ◽ Mahmoud I. Marie ◽ ...

Virtual screening is the most critical process in drug discovery, and it relies on machine learning to facilitate the screening process. It enables the discovery of molecules that bind to a specific protein to form a drug. Despite its benefits, virtual screening generates enormous amounts of data and suffers from drawbacks such as high dimensionality and class imbalance. This paper tackles data imbalance and aims to improve virtual screening accuracy, especially for the minority class. When a dataset is classified without considering its imbalanced nature, most classification methods achieve high predictive accuracy for the majority category but significantly poorer accuracy for the minority category. The paper proposes a K-means algorithm coupled with the Synthetic Minority Oversampling Technique (SMOTE) to overcome the problem of imbalanced datasets. The proposed algorithm is named KSMOTE. Using KSMOTE, minority data can be identified with high accuracy and detected with high precision. A large set of experiments was run on Apache Spark using numeric PaDEL and fingerprint descriptors. The proposed solution was compared with both a no-sampling method and SMOTE on the same datasets. Experimental results showed that the proposed solution outperformed the other methods.
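The abstract does not specify how K-means and SMOTE are coupled in KSMOTE, so the sketch below illustrates only the standard SMOTE interpolation step that the method builds on: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. The function name `smote_oversample` and its parameters are hypothetical, chosen for illustration, and are not taken from the paper.

```python
import math
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by SMOTE-style interpolation.

    Illustrative sketch only: this is plain SMOTE, not the authors' KSMOTE.
    `minority` is a list of equal-length numeric tuples (descriptor vectors).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Find its k nearest minority neighbours by Euclidean distance.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Interpolate at a random fraction along the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


minority_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_samples = smote_oversample(minority_samples, n_new=5)
```

Because each synthetic point is a convex combination of two real minority samples, it always lies within the range spanned by the existing minority data.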


2020 ◽ Vol 16 (3) ◽ pp. 182-190 ◽ Author(s): Giulio Poli ◽ Tiziano Tuccinardi

Background: Molecular docking is probably the most popular and profitable approach in computer-aided drug design, being the staple technique for predicting the binding mode of bioactive compounds and for performing receptor-based virtual screening studies. The growing attention received by docking, together with the need to improve its reliability in pose prediction and its virtual screening performance, has led to the development of a wide range of new docking algorithms and scoring functions. Nevertheless, it is unlikely that any single procedure will outperform the others in reliability and accuracy, or prove generally suitable for all kinds of protein targets. Methods: In this context, consensus docking approaches are taking hold in computer-aided drug design. These computational protocols consist of docking ligands with multiple docking methods and then comparing the binding poses predicted for the same ligand by the different methods. This analysis is usually carried out by calculating the root-mean-square deviation among the docking results obtained for each ligand, in order to identify the number of docking methods producing the same binding pose. Results: Consensus docking approaches were shown to improve the quality of docking and virtual screening results compared with single docking methods. From a qualitative point of view, the improvement in pose prediction accuracy was obtained by prioritizing ligand binding poses produced by a high number of docking methods, whereas in virtual screening studies, high hit rates were obtained by prioritizing the compounds showing a high level of pose consensus. Conclusion: In this review, we provide an overview of the results obtained from the performance assessment of various consensus docking protocols, and we illustrate successful case studies in which consensus docking has been applied in virtual screening.
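The pose-comparison step described under Methods can be sketched as follows. This is a minimal illustration, not the protocol of any specific study: it assumes each docking method returns one pose per ligand as a list of (x, y, z) coordinates in the same atom order, ignores symmetry-equivalent atoms, and uses a 2.0 Å agreement threshold chosen here as an assumption. The names `rmsd` and `consensus_level` and the method labels are hypothetical.

```python
import math


def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses with identical atom ordering."""
    if not pose_a or len(pose_a) != len(pose_b):
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(pose_a, pose_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(pose_a))


def consensus_level(poses, threshold=2.0):
    """Largest number of methods whose poses lie within `threshold` of one reference pose.

    `poses` maps a docking-method name to that method's predicted pose.
    The reference method itself is counted, so the result is at least 1
    for any non-empty input.
    """
    best = 0
    for ref_pose in poses.values():
        agreeing = sum(1 for other in poses.values() if rmsd(ref_pose, other) <= threshold)
        best = max(best, agreeing)
    return best


# Toy two-atom ligand docked by three hypothetical methods.
example_poses = {
    "method_a": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "method_b": [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "method_c": [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)],
}
level = consensus_level(example_poses)  # methods a and b agree, c does not
```

Ranking ligands by this consensus level, highest first, corresponds to the prioritization of compounds with a high level of pose consensus mentioned under Results.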


2019 ◽ Vol 33 (10) ◽ pp. 865-886 ◽ Author(s): Ann E. Cleves ◽ Stephen R. Johnson ◽ Ajay N. Jain

We introduce a new method for rapid computation of 3D molecular similarity that combines electrostatic field comparison with comparison of molecular surface-shape and directional hydrogen-bonding preferences (called “eSim”). Rather than employing heuristic “colors” or user-defined molecular feature types to represent conformation-dependent molecular electrostatics, eSim calculates the similarity of the electrostatic fields of two molecules (in addition to shape and hydrogen-bonding). We present detailed virtual screening performance data on the standard 102 target DUD-E set. In its moderately fast screening mode, eSim running on a single computing core is capable of processing over 60 molecules per second. In this mode, eSim performed significantly better than all alternate methods for which full DUD-E data were available (mean ROC area of 0.74, p < 10^-9, by paired t-test, compared with the best performing alternate method). In addition, for 92 targets of the DUD-E set where multiple ligand-bound crystal structures were available, screening performance was assessed using alternate ligands or sets thereof (in their bound poses) as similarity targets. Using the joint alignment of five ligands for each protein target, mean ROC area exceeded 0.82 for the 92 targets. Design-focused application of ligand similarity methods depends on accurate predictions of geometric molecular relationships. We comprehensively assessed pose prediction accuracy by curating nearly 400,000 bound ligand pose pairs across the DUD-E targets. Overall, beginning from agnostic initial poses, we observed an 80% success rate for RMSD ≤ 2.0 Å among the top 20 predicted eSim poses. These examples were split roughly 50/50 into cases with high direct atomic overlap (where a shared scaffold exists between a pair) and low direct atomic overlap (where a ligand pair has dissimilar scaffolds but largely occupies the same space). Within the high direct atomic overlap subset, the pose prediction success rate was 93%. For the more challenging subset (where dissimilar scaffolds are to be aligned), the success rate was 70%. The eSim approach enables both large-scale screening and rational design of ligands and is rooted in physically meaningful, non-heuristic, molecular comparisons.
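The two metrics reported in this abstract, ROC area and top-N pose success rate, can be computed as in the sketch below. It is a generic illustration, not the authors' evaluation code: the function names and the toy scores are hypothetical, higher scores are assumed to mean "more likely active", and ROC area is computed as the fraction of active/decoy pairs ranked correctly, with ties counted as one half.

```python
def roc_auc(active_scores, decoy_scores):
    """ROC area as the probability that a random active outscores a random decoy."""
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy score")
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(active_scores) * len(decoy_scores))


def top_n_success_rate(rmsd_lists, n=20, cutoff=2.0):
    """Fraction of cases with at least one of the top-n poses at or below the RMSD cutoff.

    Each element of `rmsd_lists` holds the RMSDs (in Å) of the ranked predicted
    poses for one test case, best-ranked first.
    """
    if not rmsd_lists:
        raise ValueError("need at least one test case")
    hits = sum(1 for ranked in rmsd_lists if any(v <= cutoff for v in ranked[:n]))
    return hits / len(rmsd_lists)


# Toy data: three actives, two decoys, and three pose-prediction cases.
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3])
success = top_n_success_rate([[3.1, 1.5], [4.0, 5.2], [0.8]])
```

With the toy data above, five of the six active/decoy pairs are ranked correctly, and two of the three cases contain a pose within 2.0 Å.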

