Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning

Sudheendra Vijayanarasimhan; Prateek Jain; Kristen Grauman

doi:10.1109/tpami.2013.121

Reaction-based Enumeration, Active Learning, and Free Energy Calculations to Rapidly Explore Synthetically Tractable Chemical Space and Optimize Potency of Cyclin Dependent Kinase 2 Inhibitors

doi:10.26434/chemrxiv.7841270.v2; 2019

Author(s): Kyle Konze; Pieter Bos; Markus Dahlgren; Karl Leswing; Ivan Tubert-Brohman; ...

Keyword(s): Free Energy; Drug Discovery; Active Learning; Large Scale; Chemical Space; Population Based; Free Energy Calculations; Computational Technique; Cyclin Dependent Kinase; Energy Calculations

We report a new computational technique, PathFinder, that uses retrosynthetic analysis followed by combinatorial synthesis to generate novel compounds in synthetically accessible chemical space. Coupling PathFinder with active learning and cloud-based free energy calculations allows for large-scale potency predictions of compounds on a timescale that impacts drug discovery. The process is further accelerated by using a combination of population-based statistics and active learning techniques. Using this approach, we rapidly optimized R-groups and core hops for inhibitors of cyclin-dependent kinase 2. We explored more than 300 thousand ideas and identified 35 ligands with diverse commercially available R-groups and a predicted IC50 < 100 nM, and four unique cores with a predicted IC50 < 100 nM. The rapid turnaround time and scale of chemical exploration suggest that this is a useful approach to accelerate the discovery of novel chemical matter in drug discovery campaigns.

Reaction-based Enumeration, Active Learning, and Free Energy Calculations to Rapidly Explore Synthetically Tractable Chemical Space and Optimize Potency of Cyclin Dependent Kinase 2 Inhibitors

doi:10.26434/chemrxiv.7841270; 2019

Author(s): Kyle Konze; Pieter Bos; Markus Dahlgren; Karl Leswing; Ivan Tubert-Brohman; ...

Keyword(s): Free Energy; Drug Discovery; Active Learning; Large Scale; Chemical Space; Population Based; Free Energy Calculations; Computational Technique; Cyclin Dependent Kinase; Energy Calculations

Active learning and relevance vector machine in efficient estimate of basin stability for large-scale dynamic networks

Chaos: An Interdisciplinary Journal of Nonlinear Science; doi:10.1063/5.0044899; 2021; Vol 31 (5); pp. 053129

Author(s): Yiming Che; Changqing Cheng

Keyword(s): Active Learning; Large Scale; Dynamic Networks; Relevance Vector Machine; Efficient Estimate; Basin Stability

Active learning with uncertainty sampling for large scale activity recognition in smart homes

Journal of Ambient Intelligence and Smart Environments; doi:10.3233/ais-170427; 2017; Vol 9 (2); pp. 209-223; Cited By ~ 6

Author(s): Hande Alemdar; T.L.M. van Kasteren; Cem Ersoy

Keyword(s): Active Learning; Activity Recognition; Large Scale; Smart Homes; Uncertainty Sampling

Bayesian active learning of interatomic force field for molecular dynamics simulation of Pt/Ag(111)

doi:10.26434/chemrxiv-2021-sk6lf-v2; 2021

Author(s): Kai Xu; Lei Yan; Bingran You

Keyword(s): Molecular Dynamics; Active Learning; Force Field; Density Functional; Process Model; Large Scale; Computational Cost; Dynamics Simulation; Potential Energy Landscape; Three Body

A force field is a central requirement in molecular dynamics (MD) simulation for an accurate description of the potential energy landscape and the time evolution of individual atomic motions. Most energy models are limited by a fundamental tradeoff between accuracy and speed. Although ab initio MD based on density functional theory (DFT) has high accuracy, its high computational cost prevents its use for large-scale and long-timescale simulations. Here, we use Bayesian active learning to construct a Gaussian process model of interatomic forces to describe Pt deposited on Ag(111). An accurate model is obtained within one day of wall time after selecting only 126 atomic environments based on two- and three-body interactions, providing mean absolute errors of 52 and 142 meV/Å for Ag and Pt, respectively. Our work highlights automated and minimalistic training of machine-learning force fields with high fidelity to DFT, which would enable large-scale and long-timescale simulations of alloy surfaces at first-principles accuracy.

Combining Self-supervised Learning and Active Learning for Disfluency Detection

ACM Transactions on Asian and Low-Resource Language Information Processing; doi:10.1145/3487290; 2022; Vol 21 (3); pp. 1-25

Author(s): Shaolei Wang; Zhongyuan Wang; Wanxiang Che; Sendong Zhao; Ting Liu

Keyword(s): Neural Network; Active Learning; Supervised Learning; Large Scale; Training Data; Fine Tuning; Training Dataset; Performance Gap; Annotation Costs; Trained Neural Network

Spoken language is fundamentally different from written language in that it contains frequent disfluencies, or parts of an utterance that are corrected by the speaker. Disfluency detection (removing these disfluencies) is desirable to clean the input for use in downstream NLP tasks. Most existing approaches to disfluency detection rely heavily on human-annotated data, which is scarce and expensive to obtain in practice. To tackle the training data bottleneck, in this work, we investigate methods for combining self-supervised learning and active learning for disfluency detection. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled data and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noisy words and (ii) sentence classification to distinguish original sentences from grammatically incorrect sentences. We then combine these two tasks to jointly pre-train a neural network. The pre-trained neural network is then fine-tuned using human-annotated disfluency detection training data. The self-supervised learning method can capture task-specific knowledge for disfluency detection and achieve better performance when fine-tuning on a small annotated dataset compared to other supervised methods. However, because the pseudo training data are generated from simple heuristics and cannot fully cover all disfluency patterns, there is still a performance gap compared to the supervised models trained on the full training dataset. We further explore how to bridge the performance gap by integrating active learning during the fine-tuning process. Active learning strives to reduce annotation costs by choosing the most critical examples to label and can address the weakness of self-supervised learning with a small annotated dataset. We show that by combining self-supervised learning with active learning, our model is able to match state-of-the-art performance with just about 10% of the original training data on both the commonly used English Switchboard test set and a set of in-house annotated Chinese data.
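
The pseudo-data construction described in this abstract (randomly adding or deleting words) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `vocab` list, and the number of edits per sentence are assumptions made for illustration only. Insertion yields per-token tags for the tagging task (1 marks an added word); deletion yields a corrupted sentence that a sentence-level classifier could contrast with the original.

```python
import random


def corrupt_by_insertion(tokens, vocab, n_insert, rng):
    """Insert n_insert random words from vocab (hypothetical noise source).

    Returns (noisy_tokens, tags) where tags[i] == 1 marks an inserted word.
    """
    noisy = list(tokens)
    tags = [0] * len(tokens)
    for _ in range(n_insert):
        pos = rng.randint(0, len(noisy))  # inclusive bounds; len(noisy) appends
        noisy.insert(pos, rng.choice(vocab))
        tags.insert(pos, 1)
    return noisy, tags


def corrupt_by_deletion(tokens, n_delete, rng):
    """Delete n_delete randomly chosen words, preserving the order of the rest."""
    n_delete = min(n_delete, len(tokens))
    keep = sorted(rng.sample(range(len(tokens)), len(tokens) - n_delete))
    return [tokens[i] for i in keep]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
sentence = "i want a flight to boston".split()
vocab = ["uh", "well", "the", "to", "you"]  # illustrative vocabulary
noisy, tags = corrupt_by_insertion(sentence, vocab, 2, rng)
shortened = corrupt_by_deletion(sentence, 1, rng)
```

Dropping every token tagged 1 from `noisy` recovers the original sentence, which is what makes the tags usable as supervision for the tagging task.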

Active Learning Plus Deep Learning Can Establish Cost-Effective and Robust Model for Multichannel Image: A Case on Hyperspectral Image Classification

Sensors; doi:10.3390/s20174975; 2020; Vol 20 (17); pp. 4975

Author(s): Fangyu Shi; Zhaodi Wang; Menghan Hu; Guangtao Zhai

Keyword(s): Deep Learning; Active Learning; Image Classification; Large Scale; Hyperspectral Image; Image Annotation; Learning Algorithm; Magnetic Resonance Images; Biological Engineering; Hyperspectral Image Classification

Relying on large-scale labeled datasets, deep learning has achieved good performance in image classification tasks. In agricultural and biological engineering, image annotation is time-consuming and expensive. It also requires annotators to have technical skills in specific areas. Obtaining the ground truth is difficult because natural images are expensive. In addition, images in these areas are usually stored as multichannel images, such as computed tomography (CT) images, magnetic resonance images (MRI), and hyperspectral images (HSI). In this paper, we present a framework using active learning and deep learning for multichannel image classification. We use three active learning algorithms, including least confidence, margin sampling, and entropy, as the selection criteria. Based on this framework, we further introduce an “image pool” to take full advantage of images generated by data augmentation. To demonstrate the applicability of the proposed framework, we present a case study on agricultural hyperspectral image classification. The results show that the proposed framework achieves better performance compared with the deep learning model. Manual annotation of all the training sets achieves an encouraging accuracy. In comparison, using the entropy-based active learning algorithm together with the image pool achieves a similar accuracy with only part of the whole training set manually annotated. In practical application, the proposed framework can remarkably reduce labeling effort during the model development and updating processes, and can be applied to multichannel image classification in agricultural and biological engineering.
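
The three selection criteria named in this abstract (least confidence, margin sampling, and entropy) are standard uncertainty measures over a classifier's predicted class probabilities. The sketch below shows their usual textbook definitions in plain Python; it is not the paper's code, and `select_batch` and its parameters are hypothetical names chosen for illustration.

```python
import math


def least_confidence(probs):
    """1 minus the top class probability; higher means more uncertain."""
    return 1.0 - max(probs)


def margin(probs):
    """Gap between the two most probable classes; smaller means more uncertain."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def entropy(probs):
    """Shannon entropy in nats; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_batch(pool_probs, k, criterion="entropy"):
    """Return indices of the k most uncertain samples in the unlabeled pool."""
    if criterion == "least_confidence":
        scores = [least_confidence(p) for p in pool_probs]
    elif criterion == "margin":
        # Negate so that, as with the other criteria, larger = more uncertain.
        scores = [-margin(p) for p in pool_probs]
    elif criterion == "entropy":
        scores = [entropy(p) for p in pool_probs]
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    order = sorted(range(len(pool_probs)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Illustrative predicted probabilities for three unlabeled samples.
pool = [[0.90, 0.05, 0.05], [0.40, 0.35, 0.25], [0.60, 0.30, 0.10]]
to_label = select_batch(pool, k=2, criterion="entropy")
```

In an active learning loop, the selected indices would be sent for manual annotation and the model retrained on the enlarged labeled set; that loop is omitted here.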

Large-Scale Image Classification Using Active Learning

IEEE Geoscience and Remote Sensing Letters; doi:10.1109/lgrs.2013.2255258; 2014; Vol 11 (1); pp. 259-263; Cited By ~ 24

Author(s): Naif Alajlan; Edoardo Pasolli; Farid Melgani; Andrea Franzoso

Keyword(s): Active Learning; Image Classification; Large Scale

Deep Bayesian Active Learning for Natural Language Processing: Results of a Large-Scale Empirical Study

doi:10.18653/v1/d18-1318; 2018; Cited By ~ 8

Author(s): Aditya Siddhant; Zachary C. Lipton

Keyword(s): Natural Language Processing; Active Learning; Empirical Study; Natural Language; Language Processing; Large Scale

High Heels and High Expectations: Feminist Teaching in a Neoliberalist University

Kvinder Køn & Forskning; doi:10.7146/kkf.v25i1.97070; 2017

Author(s): Elina Penttinen; Marjut Jyrkinen

Keyword(s): Active Learning; Large Scale; Study Data; Pedagogical Practices; Leadership Roles; High Expectations; Course Content; Construction Of Knowledge; Mind Set

This study aims to examine the suitability of feminist student-centred active learning pedagogy in large-scale classroom settings in a contemporary neoliberalist university context. In the current individualist culture in academia, where students have implicitly adopted a customer-like mind-set, they need to be rational in terms of what they study and how they use their time. We argue that feminist values are what makes student-centred active learning successful and will enhance the academic expertise of students. However, the values of inclusiveness, low-hierarchy, co-construction of knowledge, and empowerment of feminist pedagogy need to be revisited in the contemporary context. Low-hierarchy may signal to students that they have the ‘upper’ hand. Instead of engaging actively in the classroom, they challenge the course content and pedagogical practices. On the basis of our case study data, we claim that this attitude is inherently gendered. Thus, paradoxically, teachers in feminist classrooms need to be careful about the role of ‘service provider’ and assume more assertive leadership roles in order to ensure successful learning outcomes.
