Active Learning with Statistical Models

1996 ◽  
Vol 4 ◽  
pp. 129-145 ◽  
Author(s):  
D. A. Cohn ◽  
Z. Ghahramani ◽  
M. I. Jordan

For many types of machine learning algorithms, one can compute the statistically "optimal" way to select training data. In this paper, we review how optimal data selection techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are computationally expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate. Empirically, we observe that the optimality criterion sharply decreases the number of training examples the learner needs in order to achieve good performance.
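The selection principle can be sketched with a toy pool-based loop (a hedged illustration, not the paper's actual variance estimators): bootstrap an ensemble of simple models and query the candidate input where their predictions disagree most, i.e. where the estimated predictive variance is largest.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (closed form)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:  # degenerate bootstrap sample: fall back to a flat line
        return 0.0, my
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def query_by_variance(train_x, train_y, pool, n_models=20, seed=0):
    """Pick the pool point where an ensemble of bootstrapped linear
    models disagrees most, i.e. where predictive variance is highest."""
    rng = random.Random(seed)
    idx = list(range(len(train_x)))
    models = []
    for _ in range(n_models):
        sample = [rng.choice(idx) for _ in idx]
        models.append(fit_line([train_x[i] for i in sample],
                               [train_y[i] for i in sample]))

    def variance(x):
        preds = [a * x + b for a, b in models]
        m = sum(preds) / len(preds)
        return sum((p - m) ** 2 for p in preds) / len(preds)

    return max(pool, key=variance)
```

With training data clustered near the origin, the loop prefers the far-away candidate, where the model is least certain.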

2022 ◽  
pp. 1559-1575
Author(s):  
Mário Pereira Véstias

Machine learning is the study of algorithms and models with which computing systems perform tasks based on pattern identification and inference. When it is difficult or infeasible to develop an algorithm for a particular task, machine learning algorithms can provide an output based on previous training data. A well-known machine learning model is deep learning. The most recent deep learning models are based on artificial neural networks (ANNs). There exist several types of artificial neural networks, including the feedforward neural network, the Kohonen self-organizing neural network, the recurrent neural network, the convolutional neural network, and the modular neural network, among others. This article focuses on convolutional neural networks, with a description of the model, its training and inference processes, and its applicability. It also gives an overview of the most widely used CNN models and of what to expect from the next generation of CNN models.
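The operation such models stack is the convolution; a minimal, illustrative version (single channel, valid padding, stride 1) might look like the sketch below. As in most deep learning libraries, this actually computes cross-correlation (no kernel flip).

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (no padding, stride 1), written out
    explicitly; this is the core operation of a convolutional layer.
    Note: like most deep learning libraries, no kernel flip is applied,
    so strictly speaking this is cross-correlation."""
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            s = 0.0
            for u in range(kh):
                for v in range(kw):
                    s += image[i + u][j + v] * kernel[u][v]
            out[i][j] = s
    return out
```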


2019 ◽  
Vol 35 (14) ◽  
pp. i269-i277 ◽  
Author(s):  
Ameni Trabelsi ◽  
Mohamed Chaabane ◽  
Asa Ben-Hur

Motivation: Deep learning architectures have recently demonstrated their power in predicting DNA- and RNA-binding specificity. Existing methods fall into three classes: some are based on convolutional neural networks (CNNs), others use recurrent neural networks (RNNs), and others rely on hybrid architectures combining CNNs and RNNs. However, based on existing studies, the relative merit of the various architectures remains unclear.
Results: In this study we present a systematic exploration of deep learning architectures for predicting DNA- and RNA-binding specificity. For this purpose, we present deepRAM, an end-to-end deep learning tool that provides an implementation of a wide selection of architectures; its fully automatic model selection procedure allows us to perform a fair and unbiased comparison of deep learning architectures. We find that deeper, more complex architectures provide a clear advantage with sufficient training data, and that hybrid CNN/RNN architectures outperform other methods in terms of accuracy. Our work provides guidelines that can assist the practitioner in choosing an appropriate network architecture, and provides insight on the difference between the models learned by convolutional and recurrent networks. In particular, we find that although recurrent networks improve model accuracy, this comes at the expense of a loss in the interpretability of the features learned by the model.
Availability and implementation: The source code for deepRAM is available at https://github.com/MedChaabane/deepRAM.
Supplementary information: Supplementary data are available at Bioinformatics online.
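Models of this kind typically consume one-hot-encoded sequences; a minimal sketch of that input representation follows (the exact encoding used by deepRAM may differ, e.g. in its handling of unknown bases, so treat this as an assumption).

```python
def one_hot_dna(seq):
    """One-hot encode a DNA sequence, the standard input representation
    for CNN/RNN binding-specificity models. Each position becomes a
    length-4 vector over A, C, G, T; unknown bases (e.g. 'N') map to
    an all-zero vector."""
    alphabet = "ACGT"
    return [[1.0 if base == a else 0.0 for a in alphabet]
            for base in seq.upper()]
```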


2020 ◽  
Author(s):  
Alisson Hayasi da Costa ◽  
Renato Augusto C. dos Santos ◽  
Ricardo Cerri

PIWI-interacting RNAs (piRNAs) form an important class of non-coding RNAs that play a key role in genome integrity through the silencing of transposable elements. However, despite their importance and the wide application of deep learning to classification tasks in computational biology, there are few studies of deep learning and neural networks for piRNA prediction. Therefore, this paper presents an investigation of deep feedforward network models for the classification of transposon-derived piRNAs. We analyze and compare the results of the neural networks under different hyperparameter choices, such as the number of layers, activation functions, and optimizers, clarifying the advantages and disadvantages of each configuration. From this analysis, we propose a model for human piRNA classification and compare our method with the state-of-the-art deep neural network for piRNA prediction in the literature, as well as with traditional machine learning algorithms such as Support Vector Machines and Random Forests, showing that our model achieves strong performance, with an F-measure of 0.872, outperforming the state-of-the-art method.
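The hyperparameters compared in such a study (depth and activation function in particular) can be made concrete with a minimal forward pass; this is an illustrative sketch, not the paper's architecture.

```python
import math

def dense(x, weights, bias, activation):
    """One fully connected layer: activation(W x + b)."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b
         for row, b in zip(weights, bias)]
    if activation == "relu":
        return [max(0.0, v) for v in z]
    if activation == "tanh":
        return [math.tanh(v) for v in z]
    return z  # linear output layer

def forward(x, layers):
    """Stack layers; the hyperparameters varied in such experiments
    (depth, activation choice) are just the length and tags of this list."""
    for weights, bias, act in layers:
        x = dense(x, weights, bias, act)
    return x
```

Swapping `"relu"` for `"tanh"`, or appending layers to the list, reproduces the kind of configuration sweep the abstract describes.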


Author(s):  
David E. Hess ◽  
William E. Faller ◽  
Robert F. Roddy ◽  
Anne M. Pence ◽  
Thomas C. Fu

The Maneuvering and Control Division of the Naval Surface Warfare Center, Carderock Division (NSWCCD), along with Applied Simulation Technologies, has been developing and applying feedforward neural networks (FFNNs) to problems of naval interest in ocean engineering. A selection of these will be discussed; together, they show the power of the nonlinear method as well as its utility in diverse applications. Experimental data describing a subset of the B-Screw series of propellers operating in all four quadrants have been reported by MARIN in the Netherlands. The data contain varying pitch-to-diameter ratios, expanded area ratios, numbers of blades, and advance angles. These four variables were used to train an FFNN to predict the four-quadrant thrust and torque characteristics for the entire B-Screw series over a range of beta from 0 to 360 deg. The results show excellent agreement with the existing data and provide a means of estimating four-quadrant performance for the entire series. For submarine simulation and design, knowledge of the total forces and moments acting on the hull as a function of angle of attack, sideslip angle, and dimensionless turning rate across a large parameter space is required. These data are acquired experimentally and/or numerically and can be used to train an FFNN to act as a virtual tow tank or virtual CFD code. The network not only recovers the training data but also serves as a very fast, nonlinear, six-degree-of-freedom look-up table of the forces and moments acting on the hull throughout the parameter space described by the vehicle dynamics. Example solutions demonstrating this approach will be presented. Wave impact loads pose continuing problems for vessels in high sea states, causing damage to hatches and appendages and suggesting that these loads may be greater than current design guidelines assume. Such forcing is complex and often difficult to estimate numerically. Experimental data were acquired at NSWC to measure the hydrodynamic loads of regular, nonbreaking waves on a plate and a cylinder while varying incident wave height, wavelength, wave steepness, plate angle, and immersion level of the plate/cylinder. Predictions of wave impact forces from an FFNN trained on the experimental data will be presented.


Author(s):  
Stylianos Chatzidakis ◽  
Miltiadis Alamaniotis ◽  
Lefteri H. Tsoukalas

Creep rupture is increasingly one of the most important problems affecting the behavior and performance of power production systems operating in high-temperature environments, potentially under irradiation, as is the case in nuclear reactors. Creep rupture forecasting and estimation of the useful life are required to avoid unanticipated component failure and cost-ineffective operation. Despite rigorous investigation of creep mechanisms and their effect on component lifetime, experimental data are sparse, rendering time-to-rupture prediction a rather difficult problem. An approach for performing creep rupture forecasting that exploits the unique characteristics of machine learning algorithms is proposed herein. The approach seeks to introduce a mechanism that synergistically combines recent findings in creep rupture with the state-of-the-art computational paradigm of machine learning. In this study, three machine learning algorithms, namely General Regression Neural Networks, Artificial Neural Networks, and Gaussian Processes, were employed to capture the underlying trends and provide creep rupture forecasting. The current implementation is demonstrated and evaluated on actual experimental creep rupture data. Results show that the Gaussian process model based on the Matérn kernel achieved the best overall prediction performance (56.38%). Significant dependencies exist on the number of training data, neural network size, kernel selection, and whether interpolation or extrapolation is performed.
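The Matérn family referred to here is a standard Gaussian process covariance. A minimal sketch of the commonly used nu = 3/2 member follows; the abstract does not state which order was used, so this choice (and the 1-D inputs) is an assumption for illustration.

```python
import math

def matern32(x1, x2, variance=1.0, length_scale=1.0):
    """Matérn covariance with nu = 3/2 for scalar inputs:
    k(r) = s^2 * (1 + sqrt(3) r / l) * exp(-sqrt(3) r / l),
    where r = |x1 - x2|. Nearby inputs covary strongly; the
    covariance decays smoothly with distance."""
    r = abs(x1 - x2)
    a = math.sqrt(3.0) * r / length_scale
    return variance * (1.0 + a) * math.exp(-a)
```

In a GP regression, this function fills the covariance matrix between all pairs of training and test inputs, from which the predictive mean and uncertainty are derived.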


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-26 ◽  
Author(s):  
Shaowu Pan ◽  
Karthik Duraisamy

We study the use of feedforward neural networks (FNNs) to develop models of nonlinear dynamical systems from data. Emphasis is placed on predictions at long times, with limited data availability. Inspired by global stability analysis, and by the observation of a strong correlation between the local error and the maximal singular value of the Jacobian of the network, we introduce Jacobian regularization in the loss function. This regularization suppresses the sensitivity of the prediction to the local error and is shown to improve accuracy and robustness. Comparisons between the proposed approach and sparse polynomial regression are presented in numerical examples ranging from simple ODE systems to nonlinear PDE systems, including vortex shedding behind a cylinder and instability-driven buoyant mixing flow. Furthermore, limitations of feedforward neural networks are highlighted, especially when the training data do not include a low-dimensional attractor. Strategies of data augmentation are presented as remedies that address these issues to a certain extent.
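The idea can be sketched as follows. For simplicity, this toy version penalizes the squared Frobenius norm of a finite-difference Jacobian, which upper-bounds the squared maximal singular value the paper targets; both the finite differences and the Frobenius proxy are simplifying assumptions, not the paper's implementation.

```python
def jacobian_fd(f, x, eps=1e-6):
    """Finite-difference Jacobian of f: R^n -> R^m at the point x."""
    fx = f(x)
    J = []
    for i in range(len(fx)):
        row = []
        for j in range(len(x)):
            xp = list(x)
            xp[j] += eps
            row.append((f(xp)[i] - fx[i]) / eps)
        J.append(row)
    return J

def regularized_loss(f, x, target, lam=0.1):
    """Mean squared error plus a Jacobian penalty. The squared
    Frobenius norm used here upper-bounds the squared maximal
    singular value, so shrinking it also shrinks the sensitivity
    of the prediction to local perturbations."""
    fx = f(x)
    mse = sum((a - b) ** 2 for a, b in zip(fx, target)) / len(fx)
    J = jacobian_fd(f, x)
    frob = sum(v * v for row in J for v in row)
    return mse + lam * frob
```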


2018 ◽  
Vol 210 ◽  
pp. 04019 ◽  
Author(s):  
Hyontai SUG

Recent Go matches between human players and the artificial intelligence AlphaGo showed the great advances in machine learning technologies. While AlphaGo was trained using real-world data, AlphaGo Zero was trained using massive amounts of generated data, and the fact that AlphaGo Zero defeated AlphaGo decisively revealed that the diversity and size of training data are important for the performance of machine learning algorithms, especially deep learning algorithms based on neural networks. On the other hand, artificial neural networks and decision trees are widely accepted machine learning algorithms because of their robustness to errors and their comprehensibility, respectively. In this paper, in order to show empirically that diversity and size of data are important factors for the performance of machine learning algorithms, these two representative algorithms are used for the experiment. A real-world data set called breast tissue was chosen because it consists of real numbers, which is a very convenient property for generating artificial random data. The result of the experiment supports the claim that the diversity and size of data are very important factors for better performance.


2001 ◽  
Vol 11 (02) ◽  
pp. 167-177 ◽  
Author(s):  
I. M. GALVÁN ◽  
P. ISASI ◽  
R. ALER ◽  
J. M. VALLS

Multilayer feedforward neural networks with the backpropagation algorithm have been used successfully in many applications. However, the level of generalization is heavily dependent on the quality of the training data; that is, some of the training patterns can be redundant or irrelevant. It has been shown that with careful dynamic selection of training patterns, better generalization performance may be obtained. Nevertheless, generalization is carried out independently of the novel patterns to be approximated. In this paper, we present a learning method that automatically selects the training patterns most appropriate to the new sample to be predicted. This training method follows a lazy learning strategy, in the sense that it builds approximations centered around the novel sample. The proposed method has been applied to three different domains: two artificial approximation problems and a real time-series prediction problem. Results have been compared to standard backpropagation using the complete training data set, and the new method shows better generalization abilities.
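A minimal caricature of such a lazy strategy: select the training patterns nearest the novel sample and combine them with inverse-distance weights. The paper's actual selection and approximation scheme is more elaborate; this sketch only illustrates the "build the approximation around the query" idea.

```python
def lazy_predict(train_x, train_y, query, k=3):
    """Lazy learning sketch: pick the k training patterns nearest the
    novel sample and return their inverse-distance-weighted mean."""
    # rank training patterns by distance to the query
    ranked = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - query))
    neighbors = ranked[:k]
    # inverse-distance weights (small floor avoids division by zero
    # when the query coincides with a training pattern)
    weights = [1.0 / (abs(x - query) + 1e-9) for x, _ in neighbors]
    total = sum(weights)
    return sum(w * y for w, (_, y) in zip(weights, neighbors)) / total
```

Nothing is fit until a query arrives, which is the defining trait of lazy learning: the model is rebuilt, centered on each new sample.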


2020 ◽  
Vol 34 (04) ◽  
pp. 4485-4492
Author(s):  
Kun Kuang ◽  
Ruoxuan Xiong ◽  
Peng Cui ◽  
Susan Athey ◽  
Bo Li

For many machine learning algorithms, two main assumptions are required to guarantee performance. One is that the test data are drawn from the same distribution as the training data, and the other is that the model is correctly specified. In real applications, however, we often have little prior knowledge of the test data and of the underlying true model. Under model misspecification, an agnostic distribution shift between training and test data leads to inaccurate parameter estimation and unstable prediction across unknown test data. To address these problems, we propose a novel Decorrelated Weighting Regression (DWR) algorithm which jointly optimizes a variable decorrelation regularizer and a weighted regression model. The variable decorrelation regularizer estimates a weight for each sample such that variables are decorrelated on the weighted training data. These weights are then used in the weighted regression to improve the accuracy of estimating the effect of each variable, thus helping to improve the stability of prediction across unknown test data. Extensive experiments clearly demonstrate that our DWR algorithm can significantly improve the accuracy of parameter estimation and the stability of prediction under model misspecification and agnostic distribution shift.
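The decorrelation idea (learning one weight per sample so that variables become uncorrelated on the weighted training data) can be caricatured for a single pair of variables. This toy finite-difference version is an assumption-laden sketch, not the DWR algorithm itself.

```python
import math

def weighted_cov(w, x1, x2):
    """Covariance of x1 and x2 under (normalized) sample weights w."""
    s = sum(w)
    p = [wi / s for wi in w]
    m1 = sum(pi * a for pi, a in zip(p, x1))
    m2 = sum(pi * b for pi, b in zip(p, x2))
    return sum(pi * (a - m1) * (b - m2) for pi, a, b in zip(p, x1, x2))

def decorrelate_weights(x1, x2, steps=300, lr=0.1, eps=1e-5):
    """Toy decorrelation regularizer: gradient descent (via finite
    differences) on the squared weighted covariance, with weights kept
    positive through an exponential parametrization. Returns the best
    weights found."""
    theta = [0.0] * len(x1)  # start from uniform weights exp(0) = 1

    def objective(t):
        return weighted_cov([math.exp(v) for v in t], x1, x2) ** 2

    best, best_val = list(theta), objective(theta)
    for _ in range(steps):
        base = objective(theta)
        if base < best_val:
            best, best_val = list(theta), base
        grad = []
        for j in range(len(theta)):
            tp = list(theta)
            tp[j] += eps
            grad.append((objective(tp) - base) / eps)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return [math.exp(t) for t in best]
```

On correlated inputs, the learned weights downweight the samples driving the correlation, shrinking the weighted covariance toward zero; DWR does this jointly over all variable pairs while also fitting the weighted regression.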


2021 ◽  
Author(s):  
Sandaru Seneviratne ◽  
Artem Lenskiy ◽  
Christopher Nolan ◽  
Eleni Daskalaki ◽  
Hanna Suominen

Complexity and domain-specificity make medical text hard to understand for patients and their next of kin. To simplify such text, this paper explored how word- and character-level information can be leveraged to identify medical terms when training data are limited. We created a dataset of medical and general terms using the Human Disease Ontology from BioPortal and Wikipedia pages. Our results from 10-fold cross-validation indicated that convolutional neural networks (CNNs) and transformers perform competitively. The best F score of 93.9% was achieved by a CNN trained on both word- and character-level embeddings. Statistical significance tests demonstrated that general word embeddings provide rich word representations for medical term identification. Consequently, focusing on words is favorable for medical term identification if using deep learning architectures.
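Character-level information of the kind leveraged here is often captured with character n-grams; the hypothetical sketch below illustrates that signal, though the paper itself uses learned character embeddings rather than explicit n-gram features.

```python
def char_ngrams(term, n=3, pad="#"):
    """Character n-grams with boundary padding, a simple stand-in for
    the character-level signal a CNN can learn. Padding characters mark
    word boundaries so prefixes and suffixes get their own grams."""
    s = pad * (n - 1) + term.lower() + pad * (n - 1)
    return [s[i:i + n] for i in range(len(s) - n + 1)]
```

Morphological cues common in medical vocabulary (e.g. the suffix gram "itis#" under n=5) become explicit features under this representation.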

