Chemi-Net: A Molecular Graph Convolutional Network for Accurate Drug Property Prediction

2019 ◽  
Vol 20 (14) ◽  
pp. 3389 ◽  
Author(s):  
Ke Liu ◽  
Xiangyan Sun ◽  
Lei Jia ◽  
Jun Ma ◽  
Haoming Xing ◽  
...  

Absorption, distribution, metabolism, and excretion (ADME) studies are critical for drug discovery. Conventionally, these tasks, together with other chemical property predictions, rely on domain-specific feature descriptors, or fingerprints. Following the recent success of neural networks, we developed Chemi-Net, a completely data-driven, domain knowledge-free, deep learning method for ADME property prediction. To compare the relative performance of Chemi-Net with Cubist, one of the popular machine learning programs used by Amgen, a large-scale ADME property prediction study was performed on-site at Amgen. For all 13 data sets, Chemi-Net resulted in higher R² values compared with the Cubist benchmark. The median R² increase rate over Cubist was 26.7%. We expect that the significantly increased accuracy of ADME prediction seen with Chemi-Net over Cubist will greatly accelerate drug discovery.
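The abstract does not define its "increase rate". One plausible reading is the relative gain in the coefficient of determination, taken per data set and then summarized by the median. The sketch below assumes that reading; the function names are hypothetical and no values from the study are reproduced.

```python
from statistics import median


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_increase(r2_new, r2_base):
    """Relative gain of one score over a baseline (assumed definition)."""
    return (r2_new - r2_base) / r2_base


def median_relative_increase(pairs):
    """Median of per-data-set relative gains; pairs are (new, baseline)."""
    return median(relative_increase(new, base) for new, base in pairs)
```

Under this reading, a data set where the baseline scores 0.5 and the new model scores 0.6 contributes a 20% increase.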

2021 ◽  
Author(s):  
Shengchen Jiang ◽  
Hongbin Wang ◽  
Xiang Hou

Existing methods ignore the adverse effect of knowledge graph incompleteness on knowledge graph embedding. In addition, the complexity and large scale of knowledge information hinder the knowledge graph embedding performance of the classic graph convolutional network. In this paper, we analyze the structural characteristics of knowledge graphs and the imbalance of knowledge information. Complex knowledge information requires a model with better learnability rather than linearly weighted qualitative constraints, so we propose an end-to-end relation-enhanced learnable graph self-attention network for knowledge graph embedding. First, we construct a relation-enhanced adjacency matrix to account for the incompleteness of the knowledge graph. Second, a graph self-attention network is employed to obtain the global encoding and relevance ranking of entity node information. Third, we propose the concept of a convolutional knowledge subgraph, which is constructed according to the entity relevance ranking. Finally, we improve the training of the ConvKB model by changing the construction of negative samples, obtaining a better reliability score in the decoder. Experimental results on the FB15k-237 and WN18RR data sets show that the proposed method provides a more comprehensive representation of knowledge information than existing methods in terms of Hits@10 and MRR.
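Hits@10 and MRR are standard link-prediction metrics computed from the rank that a model assigns to the correct entity for each test triple. A minimal sketch of the usual definitions follows; it assumes 1-based ranks have already been obtained, and the function names are illustrative rather than taken from the paper.

```python
def hits_at_k(ranks, k=10):
    """Fraction of test triples whose correct entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)
```

Both metrics lie in (0, 1] for MRR and [0, 1] for Hits@k, and higher is better. MRR rewards placing the correct entity near the very top, while Hits@10 only checks whether it appears among the first ten candidates.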


2008 ◽  
Vol 03 (01n02) ◽  
pp. 19-42
Author(s):  
PETER A. DIMAGGIO ◽  
SCOTT R. MCALLISTER ◽  
CHRISTODOULOS A. FLOUDAS ◽  
XIAO-JIANG FENG ◽  
JOSHUA D. RABINOWITZ ◽  
...  

The analysis of large-scale data sets via clustering techniques is utilized in a number of applications. Many of the methods developed employ local search or heuristic strategies for identifying the "best" arrangement of features according to some metric. In this article, we present rigorous clustering methods based on the optimal re-ordering of data matrices. Distinct mixed-integer linear programming (MILP) models are utilized for the clustering of (a) dense data matrices, such as gene expression data, and (b) sparse data matrices, which are commonly encountered in the field of drug discovery. Both methods can be used in an iterative framework to bicluster data and assist in the synthesis of drug compounds, respectively. We demonstrate the capability of the proposed optimal re-ordering methods on several data sets from both systems biology and molecular discovery studies and compare our results to other clustering techniques when applicable.


2017 ◽  
Author(s):  
Carlos Arteta ◽  
Victor Lempitsky ◽  
Jaroslav Zak ◽  
Xin Lu ◽  
J. Alison Noble ◽  
...  

High-throughput screening (HTS) techniques have enabled large-scale image-based studies, but extracting biological insights from the imaging data in an exploratory setting remains a challenge. Existing packages for this task either require expert annotations, which can bias the outcome of the study, or are completely unsupervised, failing to leverage the information present in the assay design. We present HTX, an interactive tool to aid in the exploration of large microscopy data sets by allowing the visualization of entire image-based assays according to visual similarities between the samples in an intuitive and navigable manner. Underlying HTX is a collection of novel algorithmic techniques for deep texture descriptor learning, 2D data visualization, adversarial suppression of batch effects, and backprop-based image saliency estimation. We demonstrate that HTX can exploit the screen meta-data in order to learn screen-specific image descriptors, which are then used to quantify the visual similarity between samples in the assay. Given these similarities and the different visualization resources of HTX, it is shown that screens of small-molecule libraries on cell data can be easily explored, reproducing the results of previous studies where highly specific domain knowledge was required.


Author(s):  
Christina Schindler ◽  
Hannah Baumann ◽  
Andreas Blum ◽  
Dietrich Böse ◽  
Hans-Peter Buchstaller ◽  
...  

Here we present an evaluation of the binding affinity prediction accuracy of the free energy calculation method FEP+ on internal active drug discovery projects and on a large new public benchmark set.


2019 ◽  
Author(s):  
Wengong Jin ◽  
Regina Barzilay ◽  
Tommi S Jaakkola

The problem of accelerating drug discovery relies heavily on automatic tools to optimize precursor molecules to afford them with better biochemical properties. Our work in this paper substantially extends prior state-of-the-art on graph-to-graph translation methods for molecular optimization. In particular, we realize coherent multi-resolution representations by interweaving trees over substructures with the atom-level encoding of the original molecular graph. Moreover, our graph decoder is fully autoregressive, and interleaves each step of adding a new substructure with the process of resolving its connectivity to the emerging molecule. We evaluate our model on multiple molecular optimization tasks and show that our model outperforms previous state-of-the-art baselines by a large margin.


2019 ◽  
Author(s):  
Kyle Konze ◽  
Pieter Bos ◽  
Markus Dahlgren ◽  
Karl Leswing ◽  
Ivan Tubert-Brohman ◽  
...  

We report a new computational technique, PathFinder, that uses retrosynthetic analysis followed by combinatorial synthesis to generate novel compounds in synthetically accessible chemical space. Coupling PathFinder with active learning and cloud-based free energy calculations allows for large-scale potency predictions of compounds on a timescale that impacts drug discovery. The process is further accelerated by using a combination of population-based statistics and active learning techniques. Using this approach, we rapidly optimized R-groups and core hops for inhibitors of cyclin-dependent kinase 2. We explored more than 300 thousand ideas and identified 35 ligands with diverse commercially available R-groups and a predicted IC50 < 100 nM, and four unique cores with a predicted IC50 < 100 nM. The rapid turnaround time and scale of chemical exploration suggest that this is a useful approach to accelerate the discovery of novel chemical matter in drug discovery campaigns.


2019 ◽  
Vol 19 (1) ◽  
pp. 4-16 ◽  
Author(s):  
Qihui Wu ◽  
Hanzhong Ke ◽  
Dongli Li ◽  
Qi Wang ◽  
Jiansong Fang ◽  
...  

Over the past decades, peptides have received increasing attention as therapeutic candidates in drug discovery, especially antimicrobial peptides (AMPs), anticancer peptides (ACPs) and anti-inflammatory peptides (AIPs). Peptides are considered able to regulate various complex diseases that were previously untreatable. In recent years, the critical problem of antimicrobial resistance has driven the pharmaceutical industry to look for new therapeutic agents. Compared to small organic drugs, peptide-based therapy exhibits high specificity and minimal toxicity. Thus, peptides are widely used in the design and discovery of new potent drugs. Currently, large-scale screening of peptide activity with traditional approaches is costly, time-consuming and labor-intensive. Hence, in silico methods, mainly machine learning approaches, have been introduced to predict peptide activity because of their accuracy and effectiveness. In this review, we document the recent progress in machine learning-based prediction of peptides, which will be of great benefit to the discovery of potential active AMPs, ACPs and AIPs.


2021 ◽  
Vol 11 (10) ◽  
pp. 4426
Author(s):  
Chunyan Ma ◽  
Ji Fan ◽  
Jinghao Yao ◽  
Tao Zhang

Computer vision-based action recognition of basketball players in basketball training and competition has gradually become a research hotspot. However, owing to the complex technical action, diverse background, and limb occlusion, it remains a challenging task without effective solutions or public dataset benchmarks. In this study, we defined 32 kinds of atomic actions covering most of the complex actions for basketball players and built the dataset NPU RGB+D (a large scale dataset of basketball action recognition with RGB image data and Depth data captured in Northwestern Polytechnical University) for 12 kinds of actions of 10 professional basketball players with 2169 RGB+D videos and 75 thousand frames, including RGB frame sequences, depth maps, and skeleton coordinates. Through extracting the spatial features of the distances and angles between the joint points of basketball players, we created a new feature-enhanced skeleton-based method called LSTM-DGCN for basketball player action recognition based on the deep graph convolutional network (DGCN) and long short-term memory (LSTM) methods. Many advanced action recognition methods were evaluated on our dataset and compared with our proposed method. The experimental results show that the NPU RGB+D dataset is very competitive with the current action recognition algorithms and that our LSTM-DGCN outperforms the state-of-the-art action recognition methods in various evaluation criteria on our dataset. Our action classifications and this NPU RGB+D dataset are valuable for basketball player action recognition techniques. The feature-enhanced LSTM-DGCN has a more accurate action recognition effect, which improves the motion expression ability of the skeleton data.

