# An Efficient Learning Procedure for Deep Boltzmann Machines

```bibtex
@article{Salakhutdinov2012AnEL,
  title   = {An Efficient Learning Procedure for Deep Boltzmann Machines},
  author  = {Ruslan Salakhutdinov and Geoffrey E. Hinton},
  journal = {Neural Computation},
  year    = {2012},
  volume  = {24},
  pages   = {1967-2006}
}
```

We present a new learning algorithm for Boltzmann machines that contain many layers of hidden variables. Data-dependent statistics are estimated using a variational approximation that tends to focus on a single mode, and data-independent statistics are estimated using persistent Markov chains. The use of two quite different techniques for estimating the two types of statistic that enter into the gradient of the log likelihood makes it practical to learn Boltzmann machines with multiple hidden…
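The two estimators described above can be sketched in a few lines of NumPy. The following is a minimal, illustrative sketch for a two-layer DBM with toy sizes and biases omitted; all names (`mean_field`, `pcd_update`, etc.) are hypothetical, not from the paper's code. Mean-field fixed-point updates supply the data-dependent statistics, while persistent Gibbs chains supply the data-independent ones.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Toy layer sizes (illustrative only); biases omitted for brevity
n_v, n_h1, n_h2 = 6, 4, 3
W1 = 0.01 * rng.standard_normal((n_v, n_h1))
W2 = 0.01 * rng.standard_normal((n_h1, n_h2))

def mean_field(v, n_iters=10):
    """Data-dependent statistics: fixed-point mean-field updates for h1, h2."""
    h1 = sigmoid(v @ W1)
    for _ in range(n_iters):
        h2 = sigmoid(h1 @ W2)
        h1 = sigmoid(v @ W1 + h2 @ W2.T)
    return h1, h2

# Persistent Markov chains for the data-independent statistics
n_chains = 8
v_f = rng.integers(0, 2, (n_chains, n_v)).astype(float)
h1_f = rng.integers(0, 2, (n_chains, n_h1)).astype(float)
h2_f = rng.integers(0, 2, (n_chains, n_h2)).astype(float)

def gibbs_step(v, h1, h2):
    """One sweep of alternating Gibbs sampling over the DBM layers."""
    h1 = (rng.random(h1.shape) < sigmoid(v @ W1 + h2 @ W2.T)).astype(float)
    v = (rng.random(v.shape) < sigmoid(h1 @ W1.T)).astype(float)
    h2 = (rng.random(h2.shape) < sigmoid(h1 @ W2)).astype(float)
    return v, h1, h2

def pcd_update(v_data, lr=0.05):
    """One stochastic gradient step on the variational bound."""
    global v_f, h1_f, h2_f, W1, W2
    h1_p, h2_p = mean_field(v_data)              # positive (data-dependent) phase
    v_f, h1_f, h2_f = gibbs_step(v_f, h1_f, h2_f)  # negative (model) phase
    W1 += lr * (v_data.T @ h1_p / len(v_data) - v_f.T @ h1_f / n_chains)
    W2 += lr * (h1_p.T @ h2_p / len(v_data) - h1_f.T @ h2_f / n_chains)

v_data = rng.integers(0, 2, (16, n_v)).astype(float)
for _ in range(100):
    pcd_update(v_data)
```

The key design point is that the two phases never share machinery: the mean-field pass conditions on observed data and is deterministic, while the persistent chains run free of the data and carry state across updates.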

#### 394 Citations

An Infinite Deep Boltzmann Machine

- Computer Science
- ICCDA 2018
- 2018

Experimental results indicate that iDBM can learn a generative and discriminative model as good as the original DBM, and has successfully eliminated the requirement of model selection for hidden layer sizes of DBMs.

A Two-Stage Pretraining Algorithm for Deep Boltzmann Machines

- Computer Science
- ICANN
- 2013

This paper shows empirically that the proposed method overcomes the difficulty in training DBMs from randomly initialized parameters and results in a better, or comparable, generative model when compared to the conventional pretraining algorithm.

Where Do Features Come From?

- Computer Science, Medicine
- Cogn. Sci.
- 2014

Using a stack of RBMs to initialize the weights of a feedforward neural network allows backpropagation to work effectively in much deeper networks and leads to much better generalization.

On the Training Algorithms for Restricted Boltzmann Machines

- Computer Science
- Anais Estendidos da Conference on Graphics, Patterns and Images (SIBGRAPI)
- 2019

The validation of the model is presented in the context of image reconstruction and unsupervised feature learning, and its main contributions are: temperature parameter introduction in DBM formulation, DBM using adaptive temperature, and DBM meta-parameter optimization through meta-heuristic techniques.

Deep Restricted Boltzmann Networks

- Computer Science
- ArXiv
- 2016

A new method to compose RBMs into a multi-layer, network-style architecture, together with a training method that trains all layers jointly; the resulting model can generate decent images and significantly outperforms the normal RBM in terms of image quality and feature quality, without losing much training efficiency.

Notes on Boltzmann Machines

- 2012

Boltzmann machines are probability distributions on high dimensional binary vectors which are analogous to Gaussian Markov Random Fields in that they are fully determined by first and…

On the propriety of restricted Boltzmann machines

- Mathematics, Computer Science
- ArXiv
- 2016

The relationship between RBM parameter specification in the binary case and the tendency to undesirable model properties such as degeneracy, instability and uninterpretability are discussed.

How to Pretrain Deep Boltzmann Machines in Two Stages

- Computer Science
- 2015

This paper shows empirically that the proposed method overcomes the difficulty in training DBMs from randomly initialized parameters and results in a better, or comparable, generative model when compared to the conventional pretraining algorithm.

Relationship between PreTraining and Maximum Likelihood Estimation in Deep Boltzmann Machines

- Computer Science
- AISTATS
- 2016

A pretraining algorithm, which is a layer-by-layer greedy learning algorithm, for a deep Boltzmann machine (DBM) is presented, and it is shown that the pretraining improves the variational bound of the true log-likelihood function of the DBM.

Properties and Bayesian fitting of restricted Boltzmann machines

- Computer Science, Mathematics
- Stat. Anal. Data Min.
- 2019

The relationship between RBM parameter specification in the binary case and model properties such as degeneracy, instability and uninterpretability is discussed, along with the potential Bayesian fitting of such (highly flexible) models.

#### References

Showing 1-10 of 72 references

Deep Boltzmann Machines

- Computer Science
- AISTATS
- 2009

A new learning algorithm for Boltzmann machines that contain many layers of hidden variables, made more efficient by a layer-by-layer "pre-training" phase that allows variational inference to be initialized with a single bottom-up pass.

Efficient Learning of Deep Boltzmann Machines

- Mathematics, Computer Science
- AISTATS
- 2010

We present a new approximate inference algorithm for Deep Boltzmann Machines (DBM's), a generative model with many layers of hidden variables. The algorithm learns a separate "recognition" model that…

A Fast Learning Algorithm for Deep Belief Nets

- Mathematics, Computer Science
- Neural Computation
- 2006

A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.

On the quantitative analysis of deep belief networks

- Mathematics, Computer Science
- ICML '08
- 2008

It is shown that Annealed Importance Sampling (AIS) can be used to efficiently estimate the partition function of an RBM, and a novel AIS scheme for comparing RBMs with different architectures is presented.
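The idea behind AIS for an RBM can be sketched compactly: interpolate from a tractable base distribution (inverse temperature 0) to the target RBM (inverse temperature 1), accumulating importance weights along a sequence of tempered Gibbs transitions. The sketch below is an assumed minimal illustration, not the paper's scheme; the RBM is kept tiny so the estimate can be checked against brute-force enumeration of the partition function.

```python
import numpy as np
from itertools import product

def softplus(x):
    return np.logaddexp(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)

# A small RBM whose exact partition function we can check by enumeration
n_v, n_h = 5, 4
W = 0.1 * rng.standard_normal((n_v, n_h))
b_v = 0.1 * rng.standard_normal(n_v)
b_h = 0.1 * rng.standard_normal(n_h)

def log_p_star(v, beta):
    """Unnormalized log-probability with hidden units summed out,
    at inverse temperature beta (weights and hidden biases scaled)."""
    return v @ b_v + softplus(beta * (v @ W + b_h)).sum(axis=1)

def ais_log_z(n_runs=200, n_betas=500):
    betas = np.linspace(0.0, 1.0, n_betas)
    # Base distribution (beta = 0): independent visibles, free hiddens
    log_z0 = n_h * np.log(2.0) + softplus(b_v).sum()
    v = (rng.random((n_runs, n_v)) < sigmoid(b_v)).astype(float)
    log_w = np.zeros(n_runs)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Accumulate the importance weight for this temperature step
        log_w += log_p_star(v, b) - log_p_star(v, b_prev)
        # One Gibbs transition that leaves the beta-tempered RBM invariant
        h = (rng.random((n_runs, n_h)) < sigmoid(b * (v @ W + b_h))).astype(float)
        v = (rng.random((n_runs, n_v)) < sigmoid(b * (h @ W.T) + b_v)).astype(float)
    # log Z estimate: log Z0 + log of the mean importance weight
    m = log_w.max()
    return log_z0 + m + np.log(np.exp(log_w - m).mean())

# Exact log Z by enumerating all 2^n_v visible configurations
all_v = np.array(list(product([0.0, 1.0], repeat=n_v)))
exact = np.logaddexp.reduce(log_p_star(all_v, 1.0))
est = ais_log_z()
```

With weights this small the intermediate distributions are all close to uniform, so the estimate lands very near the exact value; for trained RBMs, many more intermediate temperatures and runs are typically needed.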

Implicit Mixtures of Restricted Boltzmann Machines

- Computer Science, Mathematics
- NIPS
- 2008

Results for the MNIST and NORB datasets are presented showing that the implicit mixture of RBMs learns clusters that reflect the class structure in the data.

Greedy Layer-Wise Training of Deep Networks

- Computer Science
- NIPS
- 2006

These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.

Connectionist Learning of Belief Networks

- Computer Science
- Artif. Intell.
- 1992

The "Gibbs sampling" simulation procedure for "sigmoid" and "noisy-OR" varieties of probabilistic belief networks can support maximum-likelihood learning from empirical data through local gradient ascent.

Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine

- Computer Science
- NIPS
- 2010

This work uses the mean-covariance restricted Boltzmann machine (mcRBM) to learn features of speech data that serve as input into a standard DBN, and achieves a phone error rate superior to all published results on speaker-independent TIMIT to date.

Learning Deep Boltzmann Machines using Adaptive MCMC

- Computer Science
- ICML
- 2010

This paper first shows a close connection between Fast PCD and adaptive MCMC, and develops a Coupled Adaptive Simulated Tempering algorithm that can be used to better explore a highly multimodal energy landscape.

Learning Deep Architectures for AI

- Computer Science
- Found. Trends Mach. Learn.
- 2007

The motivations and principles regarding learning algorithms for deep architectures are discussed, in particular those exploiting as building blocks unsupervised learning of single-layer models, such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.