SimilarityLab: Molecular Similarity for SAR Exploration and Target Prediction on the Web

Scientists and researchers around the globe generate a tremendous amount of information every day; for instance, more than 74 million molecules have so far been registered with the Chemical Abstracts Service. According to a recent study, around 10^60 molecules can be classified as new drug-like molecules. The library of such molecules is now regarded as ‘dark chemical space’ or ‘dark chemistry’. To explore these hidden molecules scientifically, a good number of live, regularly updated databases (protein, cell, tissue, structure, drug, etc.) are available today. The synchronization of three different sciences (genomics, proteomics and in silico simulation) will revolutionize the process of drug discovery. Screening a sizable number of drug-like molecules is a challenge that must be handled efficiently. Virtual screening (VS) is an important computational tool in the drug discovery process; however, experimental verification of the drugs is equally important for the drug development process. Quantitative structure-activity relationship (QSAR) analysis is one of the machine learning techniques extensively used in VS. QSAR is well known for its high and fast throughput screening with a satisfactory hit rate. QSAR model building involves (i) collection of chemogenomics data from a database or the literature, (ii) calculation of the right descriptors from the molecular representation, (iii) establishment of a relationship (model) between biological activity and the selected descriptors, and (iv) application of the QSAR model to predict the biological property of the molecules. All hits obtained by the VS technique need to be experimentally verified. The present mini-review highlights web-based machine learning tools, the role of QSAR in VS techniques, successful applications of QSAR-based VS leading to drug discovery, and the advantages and challenges of QSAR.
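
The four QSAR model-building steps listed in this abstract can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the training pairs are invented, the "descriptor" is a crude character count rather than a real molecular descriptor, and the model is a one-variable least-squares fit rather than any method named in the review.

```python
from statistics import mean

# (i) Data collection: hypothetical (SMILES, activity) pairs invented for this sketch.
TRAINING_SET = [
    ("CCO", 1.0),
    ("CCCO", 1.5),
    ("CCCCO", 2.0),
    ("CCCCCO", 2.5),
]


def carbon_count(smiles: str) -> int:
    """(ii) Toy descriptor: count of 'C' characters in the SMILES string.

    This is not chemically rigorous (it would also count the C in 'Cl').
    """
    return smiles.count("C")


def fit_linear(xs, ys):
    """(iii) Ordinary least squares for activity = slope * descriptor + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, smiles: str) -> float:
    """(iv) Apply the fitted model to a molecule outside the training set."""
    slope, intercept = model
    return slope * carbon_count(smiles) + intercept


descriptors = [carbon_count(smiles) for smiles, _ in TRAINING_SET]
activities = [activity for _, activity in TRAINING_SET]
model = fit_linear(descriptors, activities)
predicted = predict(model, "CCCCCCO")
```

A real workflow would replace the toy descriptor with computed molecular descriptors and validate the model on held-out data; as the abstract stresses, predicted hits still require experimental verification.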

SAR by kinetics for drug discovery in protein misfolding diseases

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1807884115 ◽

2018 ◽

Vol 115 (41) ◽

pp. 10245-10250 ◽

Cited By ~ 25

Author(s):

Sean Chia ◽

Johnny Habchi ◽

Thomas C. T. Michaels ◽

Samuel I. A. Cohen ◽

Sara Linse ◽

...

Keyword(s):

Drug Discovery ◽

Small Molecules ◽

Protein Misfolding ◽

Amyloid Beta Peptide ◽

Oligomer Formation ◽

Chemical Derivatives ◽

Structure Activity ◽

Beta Peptide ◽

Misfolding Diseases ◽

Aβ Oligomer

To develop effective therapeutic strategies for protein misfolding diseases, a promising route is to identify compounds that inhibit the formation of protein oligomers. To achieve this goal, we report a structure−activity relationship (SAR) approach based on chemical kinetics to estimate quantitatively how small molecules modify the reactive flux toward oligomers. We use this estimate to derive chemical rules in the case of the amyloid beta peptide (Aβ), which we then exploit to optimize starting compounds to curtail Aβ oligomer formation. We demonstrate this approach by converting an inactive rhodanine compound into an effective inhibitor of Aβ oligomer formation by generating chemical derivatives in a systematic manner. These results provide an initial demonstration of the potential of drug discovery strategies based on targeting directly the production of protein oligomers.

Synthesis of Imidazole-Based Medicinal Molecules Utilizing the van Leusen Imidazole Synthesis

Pharmaceuticals ◽

10.3390/ph13030037 ◽

2020 ◽

Vol 13 (3) ◽

pp. 37 ◽

Cited By ~ 5

Author(s):

Xunan Zheng ◽

Zhengning Ma ◽

Dawei Zhang

Keyword(s):

Drug Discovery ◽

Chemical Synthesis ◽

Small Molecules ◽

Medicinal Chemistry ◽

Biological Activities ◽

Structural Features ◽

Pharmaceutical Companies ◽

Recent Developments ◽

Van Leusen Reaction

Imidazole and its derivatives are among the most vital and universal heterocycles in medicinal chemistry. Owing to their special structural features, these compounds exhibit a broad spectrum of significant pharmacological or biological activities, and are widely researched and applied by pharmaceutical companies for drug discovery. The van Leusen reaction based on tosylmethylisocyanides (TosMICs) is one of the most appropriate strategies to synthesize imidazole-based medicinal molecules, and it has been increasingly developed on account of its advantages. In this review, we summarize the recent developments in the chemical synthesis and bioactivity of imidazole-containing medicinal small molecules, utilizing the van Leusen imidazole synthesis from 1977.

Chemical Patterns of Proteasome Inhibitors: Lessons Learned from Two Decades of Drug Design

International Journal of Molecular Sciences ◽

10.3390/ijms20215326 ◽

2019 ◽

Vol 20 (21) ◽

pp. 5326 ◽

Cited By ~ 3

Author(s):

Guedes ◽

Aniceto ◽

Andrade ◽

Salvador ◽

Guedes

Keyword(s):

Drug Discovery ◽

Chemical Space ◽

Decision Rules ◽

Proteasome Inhibitors ◽

Proteasome Inhibition ◽

Lessons Learned ◽

Drug Candidates ◽

Making Sense ◽

Structure Activity ◽

Future Drug

Drug discovery now faces a new challenge, where the availability of experimental data is no longer the limiting step, and instead, making sense of the data has gained a new level of importance, propelled by the extensive incorporation of cheminformatics and bioinformatics methodologies into the drug discovery and development pipeline. These enable, for example, the inference of structure-activity relationships that can be useful in the discovery of new drug candidates. One of the therapeutic applications that could benefit from this type of data mining is proteasome inhibition, given that multiple compounds have been designed and tested over the last 20 years, and this collection of data is yet to be subjected to this type of assessment. This study presents a retrospective overview of two decades of proteasome inhibitor development (680 compounds), in order to gather what can be learned from them and apply this knowledge to any future drug discovery on this subject. Our analysis focused on how different chemical descriptors coupled with statistical tools can be used to extract interesting patterns of activity. Multiple instances of the structure-activity relationship were observed in this dataset, both for isolated molecular descriptors (e.g., molecular refractivity and topological polar surface area) and for scaffold similarity or chemical space overlap. Building a decision tree allowed the identification of two meaningful decision rules that describe the chemical parameters associated with high activity. Additionally, a characterization of the prevalence of key functional groups gives insight into global patterns followed in drug discovery projects, and highlights some systematically underexplored parts of the chemical space. The various chemical patterns identified provide useful insight that can be applied in future drug discovery projects, and give an overview of what has been done so far.

grünifai: interactive multiparameter optimization of molecules in a continuous vector space

Bioinformatics ◽

10.1093/bioinformatics/btaa271 ◽

2020 ◽

Vol 36 (13) ◽

pp. 4093-4094

Author(s):

Robin Winter ◽

Joren Retel ◽

Frank Noé ◽

Djork-Arné Clevert ◽

Andreas Steffen

Keyword(s):

Small Molecules ◽

In Silico ◽

Particle Swarm Optimization Algorithm ◽

Chemical Space ◽

Source Code ◽

Optimization Method ◽

Swarm Optimization ◽

Multiparameter Optimization ◽

In Silico Models ◽

Discovery Project

Summary: Optimizing small molecules in a drug discovery project is a notoriously difficult task as multiple molecular properties have to be considered and balanced at the same time. In this work, we present our novel interactive in silico compound optimization platform termed grünifai to support the ideation of the next generation of compounds under the constraints of a multiparameter objective. grünifai integrates adjustable in silico models, a continuous representation of the chemical space, a scalable particle swarm optimization algorithm and the possibility to actively steer the compound optimization through providing feedback on generated intermediate structures. Availability and implementation: Source code and documentation are freely available under an MIT license on GitHub (https://github.com/jrwnter/gruenifai). The backend, including the optimization method and distribution on multiple GPU nodes, is written in Python 3. The frontend is written in ReactJS.
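
To make the idea of particle swarm optimization in a continuous vector space concrete, here is a minimal, generic sketch. It is not the grünifai implementation: the two-dimensional "latent space", the objective with its hypothetical target values, and all hyperparameters are assumptions chosen only so that the example runs on its own.

```python
import random


def particle_swarm_minimize(objective, dim, n_particles=30, n_iter=200,
                            inertia=0.729, c_personal=1.49445, c_global=1.49445,
                            bounds=(-5.0, 5.0), seed=0):
    """Generic particle swarm optimization; returns (best_position, best_score)."""
    rng = random.Random(seed)
    lo, hi = bounds
    positions = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_score = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_score.__getitem__)
    global_best = personal_best[best_idx][:]
    global_score = personal_score[best_idx]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to own best, pull to swarm best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (personal_best[i][d] - positions[i][d])
                    + c_global * r2 * (global_best[d] - positions[i][d])
                )
                positions[i][d] += velocities[i][d]
            score = objective(positions[i])
            if score < personal_score[i]:
                personal_best[i], personal_score[i] = positions[i][:], score
                if score < global_score:
                    global_best, global_score = positions[i][:], score
    return global_best, global_score


def multiparameter_objective(z):
    """Hypothetical objective: squared distance of two 'properties' from targets."""
    return (z[0] - 1.0) ** 2 + (z[1] + 2.0) ** 2


best_point, best_score = particle_swarm_minimize(multiparameter_objective, dim=2)
```

In a molecular setting the position vector would be a point in a learned continuous representation and the objective would combine several in silico property models; both are replaced here by toy stand-ins.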

Epigenetic Target Profiler: A Web Server to Predict Epigenetic Targets of Small Molecules

10.26434/chemrxiv.13524575 ◽

2021 ◽

Author(s):

Norberto Sánchez-Cruz ◽

Jose L. Medina-Franco

Keyword(s):

Drug Discovery ◽

Small Molecules ◽

Web Application ◽

Binary Classification ◽

Target Prediction ◽

Support Vector ◽

Molecular Fingerprints ◽

Protein Targets ◽

The Public ◽

Epigenetic Drug

Motivation: The identification of protein targets of small molecules is essential for drug discovery. With the increasing amount of chemogenomic data in the public domain, multiple ligand-based models for target prediction have emerged. However, these models are generally biased by the number of known ligands for different targets, which leads to an underrepresentation of epigenetic targets. Epigenetic drug discovery is of increasing importance but there are no open tools for epigenetic target prediction. Results: We introduce Epigenetic Target Profiler (ETP), a freely accessible and easy-to-use web application for the prediction of epigenetic targets of small molecules. For a query compound, ETP predicts its bioactivity profile over a panel of 55 different epigenetic targets. To that end, ETP uses a consensus model based on two binary classification models for each target, relying on support vector machines and built on molecular fingerprints of different design. A distance-to-model parameter related to the reliability of the predictions is included to facilitate their interpretability and assist the identification of small molecules with potential epigenetic activity.
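
The abstract does not say how the two per-target classifiers are combined or how the distance-to-model parameter is defined. The sketch below therefore rests on two stated assumptions: the consensus calls a compound active only when both classifiers agree, and the distance to the model is one minus the highest Tanimoto similarity to any training compound. Fingerprints are represented as sets of on-bit indices, and all values are invented.

```python
def tanimoto(fp_a: frozenset, fp_b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def distance_to_model(query_fp: frozenset, training_fps) -> float:
    """Assumed definition: 1 - similarity to the nearest training compound."""
    return 1.0 - max(tanimoto(query_fp, fp) for fp in training_fps)


def consensus_prediction(pred_a: bool, pred_b: bool) -> bool:
    """Assumed consensus rule: active only if both binary classifiers say active."""
    return pred_a and pred_b


# Invented fingerprints standing in for a query and a per-target training set.
query = frozenset({1, 2, 3, 4})
training = [frozenset({1, 2, 3, 5}), frozenset({7, 8})]
distance = distance_to_model(query, training)
is_active = consensus_prediction(True, False)
```

A small distance suggests the query resembles compounds the model was trained on, so a prediction for it would be considered more reliable under this assumed definition.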

Validation strategies for target prediction methods

Briefings in Bioinformatics ◽

10.1093/bib/bbz026 ◽

2019 ◽

Vol 21 (3) ◽

pp. 791-802 ◽

Cited By ~ 8

Author(s):

Neann Mathai ◽

Ya Chen ◽

Johannes Kirchmair

Keyword(s):

Small Molecules ◽

Large Scale ◽

Predictive Power ◽

Molecular Similarity ◽

Prediction Method ◽

Target Prediction ◽

Structural Data ◽

Prediction Methods ◽

Action Identification ◽

Statistical Validation

Computational methods for target prediction, based on molecular similarity and network-based approaches, machine learning, docking and others, have evolved as valuable and powerful tools to aid the challenging task of mode of action identification for bioactive small molecules such as drugs and drug-like compounds. Critical to discerning the scope and limitations of a target prediction method is understanding how its performance was evaluated and reported. Ideally, large-scale prospective experiments are conducted to validate the performance of a model; however, this expensive and time-consuming endeavor is often not feasible. Therefore, to estimate the predictive power of a method, statistical validation based on retrospective knowledge is commonly used. There are multiple statistical validation techniques that vary in rigor. In this review we discuss the validation strategies employed, highlighting the usefulness and constraints of the validation schemes and metrics that are employed to measure and describe performance. We address the limitations of measuring only generalized performance, given that the underlying bioactivity and structural data are biased towards certain small-molecule scaffolds and target families, and suggest additional aspects of performance to consider in order to produce more detailed and realistic estimates of predictive power. Finally, we describe the validation strategies that were employed by some of the most thoroughly validated and accessible target prediction methods.

One Molecular Fingerprint to Rule them All: Drugs, Biomolecules, and the Metabolome

10.26434/chemrxiv.11994630.v1 ◽

2020 ◽

Author(s):

Alice Capecchi ◽

Daniel Probst ◽

Jean-Louis Reymond

Keyword(s):

Small Molecules ◽

Similarity Search ◽

Nearest Neighbor ◽

Chemical Space ◽

Source Code ◽

Atom Pair ◽

Molecular Fingerprint ◽

Molecular Fingerprints ◽

Large Molecules ◽

Different Types

Background: Molecular fingerprints are essential cheminformatics tools for virtual screening and mapping chemical space. Among the different types of fingerprints, substructure fingerprints perform best for small molecules such as drugs, while atom-pair fingerprints are preferable for large molecules such as peptides. However, no available fingerprint achieves good performance on both classes of molecules. Results: Here we set out to design a new fingerprint suitable for both small and large molecules by combining substructure and atom-pair concepts. Our quest resulted in a new fingerprint called MinHashed atom-pair fingerprint up to a diameter of four bonds (MAP4). In this fingerprint the circular substructures with radii of r = 1 and r = 2 bonds around each atom in an atom-pair are written as two pairs of SMILES, each pair being combined with the topological distance separating the two central atoms. These so-called atom-pair molecular shingles are hashed, and the resulting set of hashes is MinHashed to form the MAP4 fingerprint. MAP4 significantly outperforms all other fingerprints on an extended benchmark that combines the Riniker and Landrum small molecule benchmark with a peptide benchmark recovering BLAST analogs from either scrambled or point mutation analogs. MAP4 furthermore produces well-organized chemical space tree-maps (TMAPs) for databases as diverse as DrugBank, ChEMBL, SwissProt and the Human Metabolome Database (HMDB), and differentiates between all metabolites in HMDB, over 70 % of which are indistinguishable from their nearest neighbor using substructure fingerprints. Conclusion: MAP4 is a new molecular fingerprint suitable for drugs, biomolecules, and the metabolome and can be adopted as a universal fingerprint to describe and search chemical space. The source code is available at https://github.com/reymond-group/map4 and interactive MAP4 similarity search tools and TMAPs for various databases are accessible at http://map-search.gdb.tools/ and http://tm.gdb.tools/map4/.
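
The MinHashing step described in this abstract can be sketched generically as follows. This is not the MAP4 code from the repository above: the shingle strings are invented placeholders rather than shingles generated from real molecules, and using salted SHA-1 digests as the family of hash functions is an assumption made to keep the example dependency-free.

```python
import hashlib


def _hash(shingle: str, seed: int) -> int:
    """One member of a family of hash functions, selected by the seed."""
    digest = hashlib.sha1(f"{seed}:{shingle}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(shingles, num_perm: int = 128):
    """MinHash signature of a non-empty set: the minimum hash under each seed."""
    return [min(_hash(s, seed) for s in shingles) for seed in range(num_perm)]


def estimated_jaccard(sig_a, sig_b) -> float:
    """Fraction of matching signature positions, an estimate of set Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


# Invented 'substructure|distance|substructure' strings standing in for shingles.
mol_a = {"CC(N)=O|2|CCC", "CCC|4|c1ccccc1", "CO|1|CC(N)=O"}
mol_b = {"CC(N)=O|2|CCC", "CCC|4|c1ccccc1", "CS|1|CC(N)=O"}
similarity = estimated_jaccard(minhash(mol_a), minhash(mol_b))
```

The two toy sets share two of four distinct shingles, so their exact Jaccard similarity is 0.5 and the estimate should fall near that value; longer signatures reduce the estimation error at the cost of more hashing.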

Assays and technologies for developing proteolysis targeting chimera degraders

Future Medicinal Chemistry ◽

10.4155/fmc-2020-0073 ◽

2020 ◽

Author(s):

Xingui Liu ◽

Xuan Zhang ◽

Dongwen Lv ◽

Yaxia Yuan ◽

Guangrong Zheng ◽

...

Keyword(s):

Drug Discovery ◽

Protein Degradation ◽

Small Molecules ◽

Rational Design ◽

Degradation Pathway ◽

New Techniques ◽

Design And Optimization ◽

Structure Activity ◽

Ubiquitin Proteasome ◽

Complicated Mechanism

Targeted protein degradation by small-molecule degraders represents an emerging mode of action in drug discovery. Proteolysis targeting chimeras (PROTACs) are small molecules that can recruit an E3 ligase and a protein of interest (POI) into proximity, leading to induced ubiquitination and degradation of the POI by the proteasome system. To date, the design and optimization of PROTACs remain empirical due to the complicated mechanism of induced protein degradation. Nevertheless, it is increasingly appreciated that step-by-step profiling along the ubiquitin-proteasome degradation pathway using biochemical and biophysical assays is essential for understanding the structure–activity relationship and facilitating the rational design of PROTACs. This review aims to summarize these assays and to discuss the potential of expanding the toolbox with other new techniques.