Prediction of Molecular Properties Using Molecular Topographic Map

Atsushi Yoshimori

doi:10.3390/molecules26154475

Molecules ◽

10.3390/molecules26154475 ◽

2021 ◽

Vol 26 (15) ◽

pp. 4475

Author(s):

Atsushi Yoshimori

Keyword(s):

Neural Networks ◽

Amino Acids ◽

Input Data ◽

Data Augmentation ◽

Critical Role ◽

Molecular Properties ◽

Rational Drug Design ◽

Topographic Map ◽

Discovery Research ◽

Drug Discovery Research

Prediction of molecular properties plays a critical role in rational drug design. In this study, the Molecular Topographic Map (MTM) is proposed, a two-dimensional (2D) map that can be used to represent a molecule. An MTM is generated from the atomic feature set of a molecule using generative topographic mapping and is then used as input data for analyzing structure-property/activity relationships. In the visualization and classification of 20 amino acids, differences among the amino acids can be visually confirmed from their MTMs and revealed by hierarchical clustering with a similarity matrix of the MTMs. The prediction of molecular properties was performed with convolutional neural networks using MTMs as input data. The performance of the predictive models using MTMs was found to be equal to or better than that of models using Morgan fingerprints or MACCS keys. Furthermore, data augmentation of MTMs using mixup improved the prediction performance. Since molecules converted to MTMs can be treated like 2D images, they can be easily used with existing neural networks for image recognition and related technologies. MTMs can be effectively utilized to predict molecular properties of small molecules to aid drug discovery research.
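As an illustration of the mixup augmentation mentioned in the abstract, the sketch below blends two 2D maps and their labels with a single weight drawn from a Beta distribution, which is the standard mixup formulation. It is not the author's implementation: the function name `mixup`, the nested-list map representation, and the default `alpha=0.2` are assumptions made for this example.

```python
import random


def mixup(map_a, label_a, map_b, label_b, alpha=0.2, rng=random):
    """Blend two equally sized 2D maps and their numeric labels.

    The weight lam is drawn from Beta(alpha, alpha); alpha=0.2 is an
    assumed default for this sketch, not a value taken from the paper.
    """
    lam = rng.betavariate(alpha, alpha)
    # Cell-wise convex combination of the two maps.
    mixed_map = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(map_a, map_b)
    ]
    # The labels are mixed with the same weight as the inputs.
    mixed_label = lam * label_a + (1.0 - lam) * label_b
    return mixed_map, mixed_label, lam


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same blend
    toy_a = [[0.0, 1.0], [1.0, 0.0]]
    toy_b = [[1.0, 0.0], [0.0, 1.0]]
    mixed, label, lam = mixup(toy_a, 0.0, toy_b, 1.0, rng=rng)
    print(lam, label, mixed)
```

Because the same weight is applied to the inputs and the labels, each mixed sample stays consistent with its target; small values of `alpha` keep most mixed samples close to one of the two originals.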


Avoiding Overfitting: A Survey on Regularization Methods for Convolutional Neural Networks

ACM Computing Surveys ◽

10.1145/3510413 ◽

2022 ◽

Author(s):

Claudio Filipi Gonçalves dos Santos ◽

João Paulo Papa

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Networks ◽

Input Data ◽

Data Augmentation ◽

Critical Factor ◽

Regularization Methods ◽

Feature Maps ◽

The Neural Network ◽

Public Repositories

Several image processing tasks, such as image classification and object detection, have been significantly improved using Convolutional Neural Networks (CNNs). Many architectures, such as ResNet and EfficientNet, achieved outstanding results on at least one dataset at the time of their creation. A critical factor in training concerns the network’s regularization, which prevents the structure from overfitting. This work analyzes several regularization methods developed in the last few years that show significant improvements for different CNN models. The works are classified into three main areas. The first, called “data augmentation”, covers techniques that focus on performing changes in the input data. The second, named “internal changes”, describes procedures that modify the feature maps generated by the neural network or the kernels. The last, called “label”, concerns transforming the labels of a given input. This work differs from other available surveys on regularization in two main ways: (i) the papers gathered in the manuscript are not older than five years, and (ii) all works referred to here are reproducible, i.e., their code is available in public repositories or they have been directly implemented in some framework, such as TensorFlow or Torch.


Deep neural networks trained with heavier data augmentation learn features closer to representations in hIT

10.32470/ccn.2018.1046-0 ◽

2018 ◽

Cited By ~ 1

Author(s):

Alex Hernández-García ◽

Johannes Mehrer ◽

Nikolaus Kriegeskorte ◽

Peter König ◽

Tim C. Kietzmann

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Data Augmentation


Levenshtein Augmentation Improves Performance of SMILES Based Deep-Learning Synthesis Prediction

10.26434/chemrxiv.12562121 ◽

2020 ◽

Author(s):

Dean Sumner ◽

Jiazhen He ◽

Amol Thakkar ◽

Ola Engkvist ◽

Esben Jannik Bjerrum

Keyword(s):

Neural Networks ◽

Pattern Recognition ◽

Deep Learning ◽

Recurrent Neural Networks ◽

Data Augmentation ◽

State Of The Art ◽

Sequence Similarity ◽

Learning Models ◽

Underlying Network

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. Levenshtein augmentation demonstrated increased performance over non-augmented data and over data augmented by conventional SMILES randomization when used for training of baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as attentional gain, an enhancement in the pattern recognition capabilities of the underlying network to molecular motifs.
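The Levenshtein (edit) distance that gives the method its name can be sketched as follows. This is only the textbook distance plus a hypothetical helper, `closest_variant`, that picks the reactant SMILES variant closest to a product string. It is not the authors' augmentation procedure, and it compares strings character by character, which is an assumption: multi-character SMILES tokens such as `Cl` or `Br` would need a tokenizer.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions that turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def closest_variant(reactant_variants, product):
    """Hypothetical helper: choose the equivalent reactant SMILES string
    with the smallest edit distance to the product SMILES string."""
    return min(reactant_variants, key=lambda s: levenshtein(s, product))


if __name__ == "__main__":
    # Three equivalent SMILES strings for ethanol; the product is acetaldehyde.
    variants = ["OCC", "C(O)C", "CCO"]
    print(levenshtein("CCO", "CC=O"))         # 1 (insert '=')
    print(closest_variant(variants, "CC=O"))  # CCO
```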


Epigenetic Target Prediction with Accurate Machine Learning Models

10.26434/chemrxiv.13522313 ◽

2021 ◽

Author(s):

Norberto Sánchez-Cruz ◽

Jose L. Medina-Franco

Keyword(s):

Machine Learning ◽

Small Molecules ◽

Predictive Models ◽

Large Scale ◽

Target Prediction ◽

Quantitative Measure ◽

Learning Models ◽

Discovery Research ◽

Drug Discovery Research ◽

Machine Learning Models

Epigenetic targets are a significant focus for drug discovery research, as demonstrated by the eight approved epigenetic drugs for treatment of cancer and the increasing availability of chemogenomic data related to epigenetics. These data represent a large number of structure-activity relationships that have not been exploited thus far for the development of predictive models to support medicinal chemistry efforts. Herein, we report the first large-scale study of 26318 compounds with a quantitative measure of biological activity for 55 protein targets with epigenetic activity. Through a systematic comparison of machine learning models trained on molecular fingerprints of different design, we built predictive models with high accuracy for the epigenetic target profiling of small molecules. The models were thoroughly validated, showing mean precisions up to 0.952 for the epigenetic target prediction task. Our results indicate that the models reported herein have considerable potential to identify small molecules with epigenetic activity. Therefore, our results were implemented as a freely accessible and easy-to-use web application.


Data augmentation for computed tomography angiography via synthetic image generation and neural domain adaptation

Current Directions in Biomedical Engineering ◽

10.1515/cdbme-2020-0015 ◽

2020 ◽

Vol 6 (1) ◽

Author(s):

Malte Seemann ◽

Lennart Bargsten ◽

Alexander Schlaefer

Keyword(s):

Computed Tomography ◽

Neural Networks ◽

Deep Learning ◽

Medical Imaging ◽

Computed Tomography Angiography ◽

Data Augmentation ◽

Domain Adaptation ◽

Synthetic Image ◽

Wide Range ◽

The Impact

Deep learning methods produce promising results when applied to a wide range of medical imaging tasks, including segmentation of artery lumen in computed tomography angiography (CTA) data. However, to perform sufficiently well, neural networks have to be trained on large amounts of high-quality annotated data. In the realm of medical imaging, annotations are not only quite scarce but also often not entirely reliable. To tackle both challenges, we developed a two-step approach for generating realistic synthetic CTA data for the purpose of data augmentation. In the first step, moderately realistic images are generated in a purely numerical fashion. In the second step, these images are improved by applying neural domain adaptation. We evaluated the impact of synthetic data on lumen segmentation via convolutional neural networks (CNNs) by comparing the resulting performances. Improvements of up to 5% in terms of Dice coefficient and 20% for Hausdorff distance represent a proof of concept that the proposed augmentation procedure can be used to enhance deep learning-based segmentation of artery lumen in CTA images.
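The two segmentation metrics named in the abstract can be sketched for small binary masks as below. The definitions are the standard ones (Dice as twice the overlap divided by the summed foreground sizes, Hausdorff as the larger of the two directed maximum nearest-neighbour distances), but the function names, the nested-list mask representation, and the pixel-unit distances are assumptions for this example; the authors' evaluation code is not shown in the source.

```python
import math


def dice_coefficient(pred, truth):
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks."""
    pred_flat = [bool(v) for row in pred for v in row]
    truth_flat = [bool(v) for row in truth for v in row]
    overlap = sum(p and t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)
    # Two empty masks agree perfectly by convention in this sketch.
    return 1.0 if total == 0 else 2.0 * overlap / total


def foreground_points(mask):
    """Return (row, column) coordinates of all foreground pixels."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def hausdorff_distance(pred, truth):
    """Symmetric Hausdorff distance between foreground pixels, in pixels."""
    a, b = foreground_points(pred), foreground_points(truth)
    if not a or not b:
        raise ValueError("both masks need at least one foreground pixel")

    def directed(src, dst):
        # Largest distance from a source point to its nearest target point.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 0, 0]]
    truth = [[0, 1, 1], [0, 0, 0]]
    print(dice_coefficient(pred, truth))    # 0.5
    print(hausdorff_distance(pred, truth))  # 1.0
```

A higher Dice coefficient and a lower Hausdorff distance both indicate better agreement, so the two reported improvements move in opposite numerical directions.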


The Effectiveness of Data Augmentation for Melanoma Skin Cancer Prediction Using Convolutional Neural Networks

2020 IEEE 2nd International Conference on Artificial Intelligence in Engineering and Technology (IICAIET) ◽

10.1109/iicaiet49801.2020.9257859 ◽

2020 ◽

Author(s):

Kin Wai Lee ◽

Renee Ka Yin Chin

Keyword(s):

Neural Networks ◽

Skin Cancer ◽

Convolutional Neural Networks ◽

Data Augmentation ◽

Cancer Prediction ◽

Melanoma Skin


Data augmentation and semi-supervised learning for deep neural networks-based text classifier

Proceedings of the 35th Annual ACM Symposium on Applied Computing ◽

10.1145/3341105.3373992 ◽

2020 ◽

Author(s):

Heereen Shim ◽

Stijn Luca ◽

Dietwig Lowet ◽

Bart Vanrumste

Keyword(s):

Neural Networks ◽

Supervised Learning ◽

Deep Neural Networks ◽

Data Augmentation


Data Augmentation Methods Applying Grayscale Images for Convolutional Neural Networks in Machine Vision

Applied Sciences ◽

10.3390/app11156721 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6721

Author(s):

Jinyeong Wang ◽

Sanghwan Lee

Keyword(s):

Neural Networks ◽

Machine Vision ◽

Object Detection ◽

Image Classification ◽

Convolutional Neural Networks ◽

Data Augmentation ◽

Image Data ◽

Manufacturing Productivity ◽

Smart Factories ◽

Grayscale Images

The demand for machine vision is rising as smart factories use automated surface inspection to increase manufacturing productivity. Recently, convolutional neural networks (CNNs) have demonstrated outstanding performance and solved many problems in the field of computer vision. As a result, many machine vision systems adopt CNNs for surface defect inspection. In this study, we developed an effective data augmentation method for grayscale images in CNN-based machine vision with mono cameras. Our method can be applied to grayscale industrial images, and we demonstrated outstanding performance in image classification and object detection tasks. The main contributions of this study are as follows: (1) We propose a data augmentation method that can be performed when training CNNs with industrial images taken with mono cameras. (2) We demonstrate that image classification or object detection performance is better when training with the industrial image data augmented by the proposed method. Through the proposed method, many machine-vision-related problems using mono cameras can be effectively solved by using CNNs.


Fuzz testing based data augmentation to improve robustness of deep neural networks

Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering ◽

10.1145/3377811.3380415 ◽

2020 ◽

Cited By ~ 2

Author(s):

Xiang Gao ◽

Ripon K. Saha ◽

Mukul R. Prasad ◽

Abhik Roychoudhury

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Data Augmentation ◽

Fuzz Testing


Synthesis of an azadioxa-planar triphenylborane and investigation of its structural and photophysical properties

Chemical Communications ◽

10.1039/d0cc08331c ◽

2021 ◽

Author(s):

Y. Kitamoto ◽

K. Oda ◽

K. Ogino ◽

K. Hiyama ◽

H. Kita ◽

...

Keyword(s):

Photophysical Properties ◽

Critical Role ◽

Molecular Properties ◽

Bridging Groups ◽

First Time

An azadioxa-planar triphenylborane was synthesized for the first time and it was found that bridging groups have a critical role in changing its molecular properties.
