Global optimization for neural network training

Artificial neural networks (ANNs) have gained popularity in recent years because of their ability to approximate nonlinear functions. Training a neural network involves minimizing the mean squared error between the target and the network output. The error surface is nonconvex and highly multimodal. Finding the global minimum of a multimodal function is an NP-complete problem and cannot, in general, be solved exactly. Applying heuristic global optimization algorithms, which compute a good approximation to the global minimum, to neural network training is therefore of interest. This paper reviews the heuristic global optimization algorithms used for training feedforward and recurrent neural networks. The training algorithms are compared in terms of learning rate, convergence speed, and accuracy of the output produced by the neural network. The paper concludes by suggesting directions for novel ANN training algorithms based on recent advances in global optimization.
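As a rough illustration of what training with a heuristic global optimizer can look like, the sketch below uses particle swarm optimization (PSO) to minimize the mean squared error of a very small feedforward network on XOR. It is not taken from the reviewed paper: the 2-2-1 architecture, the XOR data, the function names (`predict`, `mse`, `pso_train`) and the inertia and acceleration coefficients (0.729 and 1.494, commonly quoted PSO defaults) are all assumptions chosen for brevity.

```python
import math
import random

# XOR is an assumed toy problem; its error surface is already nonconvex.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
        ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
N_WEIGHTS = 9  # 2-2-1 network: (2*2 weights + 2 biases) + (2 weights + 1 bias)


def predict(w, x):
    """Forward pass: two tanh hidden units, one sigmoid output."""
    h0 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[2])
    h1 = math.tanh(w[3] * x[0] + w[4] * x[1] + w[5])
    z = w[6] * h0 + w[7] * h1 + w[8]
    # Sigmoid written via tanh so that large |z| cannot overflow.
    return 0.5 * (1.0 + math.tanh(0.5 * z))


def mse(w):
    """Mean squared error between targets and network outputs."""
    return sum((predict(w, x) - t) ** 2 for x, t in DATA) / len(DATA)


def pso_train(n_particles=20, n_iters=200, seed=0):
    """Global-best PSO over the weight vector; no gradients are used."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(N_WEIGHTS)]
           for _ in range(n_particles)]
    vel = [[0.0] * N_WEIGHTS for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # each particle's best position
    best_err = [mse(p) for p in pos]        # and the error found there
    g = min(range(n_particles), key=best_err.__getitem__)
    g_pos, g_err = best_pos[g][:], best_err[g]  # swarm-wide best
    history = [g_err]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(N_WEIGHTS):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (0.729 * vel[i][d]
                             + 1.494 * r1 * (best_pos[i][d] - pos[i][d])
                             + 1.494 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            err = mse(pos[i])
            if err < best_err[i]:
                best_pos[i], best_err[i] = pos[i][:], err
                if err < g_err:
                    g_pos, g_err = pos[i][:], err
        history.append(g_err)
    return g_pos, history
```

Calling `weights, history = pso_train()` returns the best weight vector found and the best error after each iteration. The recorded error can only stay the same or decrease, but nothing guarantees that the swarm reaches the global minimum, which is the usual trade-off with heuristic methods of this kind.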


Dynamic learning rate neural network training and composite structural damage detection

AIAA Journal ◽ 10.2514/3.13701 ◽ 1997 ◽ Vol 35 ◽ pp. 1522-1527

Author(s): H. Luo ◽ S. Hanagud

Keyword(s): Neural Network ◽ Damage Detection ◽ Structural Damage ◽ Learning Rate ◽ Neural Network Training ◽ Structural Damage Detection ◽ Dynamic Learning ◽ Network Training


Weight regularisation in particle swarm optimisation neural network training

2014 IEEE Symposium on Swarm Intelligence ◽ 10.1109/sis.2014.7011773 ◽ 2014 ◽ Cited By ~ 7

Author(s): Anna Rakitianskaia ◽ Andries Engelbrecht

Keyword(s): Neural Network ◽ Particle Swarm ◽ Particle Swarm Optimisation ◽ Neural Network Training ◽ Network Training


A Geometric Perspective on Information Plane Analysis

Entropy ◽ 10.3390/e23060711 ◽ 2021 ◽ Vol 23 (6) ◽ pp. 711

Author(s): Mina Basirat ◽ Bernhard C. Geiger ◽ Peter M. Roth

Keyword(s): Neural Network ◽ Mutual Information ◽ Geometric Interpretation ◽ Neural Network Training ◽ Neural Network Learning ◽ Network Learning ◽ Plane Analysis ◽ Network Training ◽ Hidden Layer ◽ The Impact

Information plane analysis, which tracks the mutual information between the input and a hidden layer and between a hidden layer and the target over time, has recently been proposed to analyze the training of neural networks. Since the activations of a hidden layer are typically continuous-valued, this mutual information cannot be computed analytically and must be estimated instead, which has led to apparently inconsistent or even contradictory results in the literature. The goal of this paper is to demonstrate how information plane analysis can still be a valuable tool for analyzing neural network training. To this end, we complement the prevailing binning estimator for mutual information with a geometric interpretation. With this geometric interpretation in mind, we evaluate the impact of regularization and interpret phenomena such as underfitting and overfitting. In addition, we investigate neural network learning in the presence of noisy data and noisy labels.
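To make the binning estimator mentioned in the abstract more concrete, here is a minimal sketch of a plug-in mutual information estimate after equal-width discretisation. It is not the authors' implementation: the function names (`bin_index`, `binned_mutual_information`), the default of 10 bins, and the restriction to two scalar-valued samples are assumptions made to keep the example short. In information plane analysis, one of the two variables would typically be a whole vector of hidden-layer activations, with each neuron binned separately.

```python
import math
from collections import Counter


def bin_index(value, lo, hi, n_bins):
    """Map a value in [lo, hi] to one of n_bins equal-width bins."""
    if hi == lo:
        return 0  # degenerate range: everything falls into a single bin
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # value == hi belongs to the last bin


def binned_mutual_information(xs, ys, n_bins=10):
    """Plug-in estimate of I(X;Y) in bits from paired scalar samples."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    bx = [bin_index(x, x_lo, x_hi, n_bins) for x in xs]
    by = [bin_index(y, y_lo, y_hi, n_bins) for y in ys]
    joint = Counter(zip(bx, by))  # counts of (x-bin, y-bin) pairs
    px = Counter(bx)              # marginal counts for X
    py = Counter(by)              # marginal counts for Y
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log2( p(i,j) / (p(i) * p(j)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[i] * py[j]))
    return mi
```

The estimate depends on the bin width: with very fine bins every sample tends to land in its own cell and the estimate saturates near the logarithm of the sample size, while very coarse bins hide real dependence. This sensitivity is one plausible source of the inconsistencies that the abstract refers to.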
