Average Contrastive Divergence for Training Restricted Boltzmann Machines

Abstract: This paper introduces a new iterative training algorithm for restricted Boltzmann machines. The proposed learning procedure is inspired by the online sequential extreme learning machine, a variant of the extreme learning machine that handles data accumulated over time in sequences of fixed or varying size. Recursive least squares rules are integrated for weight adaptation to avoid learning rate tuning and local minimum issues. The proposed approach is compared with contrastive divergence, a well-known training algorithm for Boltzmann machines, in terms of time, accuracy, and algorithmic complexity under the same conditions. The results strongly favor the new rules for data reconstruction.
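The abstract names recursive least squares (RLS) as the weight adaptation rule but does not give its update equations. As a rough illustration only, the following is the textbook RLS step for a single linear output; it is not the paper's algorithm, and the names `rls_update`, `P`, and `lam` are hypothetical. It shows why no learning rate is needed: the gain vector is computed from the running inverse-correlation matrix `P`.

```python
def rls_update(w, P, x, y, lam=1.0):
    """One textbook RLS step for the linear model y ~ w . x.

    w   : current weight vector (list of floats)
    P   : current inverse-correlation matrix (assumed symmetric)
    x   : input vector, y : scalar target
    lam : forgetting factor in (0, 1]; 1.0 means no forgetting
    """
    n = len(x)
    # P x, reused for both the gain and the update of P.
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    gain = [Px[i] / denom for i in range(n)]
    # Prediction error made with the weights before the update.
    err = y - sum(w[i] * x[i] for i in range(n))
    w_new = [w[i] + gain[i] * err for i in range(n)]
    # P <- (P - gain * x^T P) / lam; x^T P equals (P x)^T because P is symmetric.
    P_new = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(n)]
             for i in range(n)]
    return w_new, P_new


# Usage: recover y = 2*x0 - x1 from three samples, starting from P = delta * I.
w = [0.0, 0.0]
P = [[1e6, 0.0], [0.0, 1e6]]
for x, y in [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]:
    w, P = rls_update(w, P, x, y)
```

Starting `P` as a large multiple of the identity is a common initialization that makes the first updates behave like an ordinary least squares fit.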


Dynamical analysis of contrastive divergence learning: Restricted Boltzmann machines with Gaussian visible units

Neural Networks ◽ 10.1016/j.neunet.2016.03.013 ◽ 2016 ◽ Vol 79 ◽ pp. 78-87 ◽ Cited By ~ 18

Author(s): Ryo Karakida ◽ Masato Okada ◽ Shun-ichi Amari

Keyword(s): Dynamical Analysis ◽ Restricted Boltzmann Machines ◽ Boltzmann Machines ◽ Contrastive Divergence


Bounding the Bias of Contrastive Divergence Learning

Neural Computation ◽ 10.1162/neco_a_00085 ◽ 2011 ◽ Vol 23 (3) ◽ pp. 664-673 ◽ Cited By ~ 25

Author(s): Asja Fischer ◽ Christian Igel

Keyword(s): Gibbs Sampling ◽ Upper Bound ◽ Single Variable ◽ Restricted Boltzmann Machines ◽ Boltzmann Machines ◽ Maximum Change ◽ Log Likelihood ◽ Contrastive Divergence ◽ The Absolute

Optimization based on k-step contrastive divergence (CD) has become a common way to train restricted Boltzmann machines (RBMs). The k-step CD is a biased estimator of the log-likelihood gradient relying on Gibbs sampling. We derive a new upper bound for this bias. Its magnitude depends on k, the number of variables in the RBM, and the maximum change in energy that can be produced by changing a single variable. The last reflects the dependence on the absolute values of the RBM parameters. The magnitude of the bias is also affected by the distance in variation between the modeled distribution and the starting distribution of the Gibbs chain.
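For orientation, here is a minimal sketch of the k-step CD estimate discussed in this abstract, assuming a binary RBM with weight matrix `W`, visible biases `b`, and hidden biases `c`. The function names are hypothetical and only the weight gradient for a single training example is shown. The Gibbs chain starts at the data vector and is truncated after `k` steps, which is the source of the bias the paper bounds.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_hidden(v, W, c, rng):
    """Return (p(h_j = 1 | v), binary sample) for every hidden unit."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, b, rng):
    """Return (p(v_i = 1 | h), binary sample) for every visible unit."""
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def cd_k_weight_gradient(v0, W, b, c, k, rng):
    """CD-k estimate of the log-likelihood gradient w.r.t. W for one example."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Positive phase: hidden probabilities given the data vector.
    ph0, h = sample_hidden(v0, W, c, rng)
    v = v0
    # Negative phase: k steps of block Gibbs sampling started at the data.
    for _ in range(k):
        _, v = sample_visible(h, W, b, rng)
        phk, h = sample_hidden(v, W, c, rng)
    # Data statistics minus statistics after k Gibbs steps.
    return [[v0[i] * ph0[j] - v[i] * phk[j] for j in range(len(c))]
            for i in range(len(b))]


# Usage: 3 visible units, 2 hidden units, all parameters zero, k = 2.
rng = random.Random(0)
W = [[0.0, 0.0] for _ in range(3)]
grad = cd_k_weight_gradient([1, 0, 1], W, [0.0] * 3, [0.0] * 2, 2, rng)
```

Using hidden probabilities rather than sampled hidden states in the two outer products is a common variance-reduction choice, not something stated in the abstract.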


Convergence Analysis of Contrastive Divergence Algorithm Based on Gradient Method with Errors

Mathematical Problems in Engineering ◽ 10.1155/2015/350102 ◽ 2015 ◽ Vol 2015 ◽ pp. 1-9 ◽ Cited By ~ 2

Author(s): Xuesi Ma ◽ Xiaojie Wang

Keyword(s): Finite Number ◽ Gibbs Sampling ◽ Gradient Method ◽ Convergence Theorem ◽ Learning Algorithm ◽ Restricted Boltzmann Machines ◽ Convergence Conditions ◽ Boltzmann Machines ◽ Contrastive Divergence ◽ Step Number

Contrastive Divergence has become a common way to train Restricted Boltzmann Machines; however, its convergence is not yet well understood. This paper studies the convergence of the Contrastive Divergence algorithm. We relate the Contrastive Divergence algorithm to the gradient method with errors and derive convergence conditions for it using the convergence theorem of the gradient method with errors. We give specific convergence conditions for the Contrastive Divergence learning algorithm for Restricted Boltzmann Machines in which both visible units and hidden units can take only a finite number of values. Two new convergence conditions are obtained by specifying the learning rate. Finally, we give specific conditions that the number of Gibbs sampling steps must satisfy to guarantee that the Contrastive Divergence algorithm converges.


Analysis on Noisy Boltzmann Machines and Noisy Restricted Boltzmann Machines

IEEE Access ◽ 10.1109/access.2021.3102275 ◽ 2021 ◽ pp. 1-1

Author(s): Wenhao Lu ◽ Chi-Sing Leung ◽ John Sum

Keyword(s): Restricted Boltzmann Machines ◽ Boltzmann Machines


Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers

Scientific Reports ◽ 10.1038/s41598-021-82197-1 ◽ 2021 ◽ Vol 11 (1)

Author(s): Guanglei Xu ◽ William S. Oates

Keyword(s): Neural Network ◽ Maximum Likelihood ◽ Image Reconstruction ◽ Image Recognition ◽ Shannon Entropy ◽ Reconstruction Error ◽ Likelihood Method ◽ Restricted Boltzmann Machines ◽ Boltzmann Machines ◽ D Wave

Restricted Boltzmann Machines (RBMs) have been proposed for developing neural networks for a variety of unsupervised machine learning applications such as image recognition, drug discovery, and materials design. The Boltzmann probability distribution is used as a model to identify network parameters by optimizing the likelihood of predicting an output given hidden states trained on available data. Training such networks often requires sampling over a large probability space that must be approximated during gradient-based optimization. Quantum annealing has been proposed as a means to search this space more efficiently, and this has been experimentally investigated on D-Wave hardware. D-Wave implementation requires selection of an effective inverse temperature or hyperparameter (β) within the Boltzmann distribution, which can strongly influence optimization. Here, we show how this parameter can be estimated as a hyperparameter applied to D-Wave hardware during neural network training by maximizing the likelihood or minimizing the Shannon entropy. We find that both methods improve the training of RBMs, based on experimental validation on D-Wave hardware for an image recognition problem. Neural network image reconstruction errors are evaluated using Bayesian uncertainty analysis, which shows more than an order of magnitude lower image reconstruction error with the maximum likelihood method than with manual optimization of the hyperparameter. The maximum likelihood method is also shown to outperform minimizing the Shannon entropy for image reconstruction.
