Warning

Sorry, this note is under construction. Feel free to take a look at what I’ve got so far, and please come back later!

Restricted Boltzmann Machines (RBMs) are latent 🤖 Boltzmann Machines with a restricted set of weight connections. Rather than connecting all possible pairs of units, we only connect visible units $v$ to latent units $h$, like in the figure below.

Note

Much like the layers of a 🕸️ Multilayer Perceptron, this structure can be stacked with additional layers of latent units to model increasingly complex probability distributions. This gives us the Deep Boltzmann Machine (DBM) and, with a slight modification, the 🕋 Deep Belief Network.

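Our energy function is defined as

$$
E(v, h) = -b^\top v - c^\top h - v^\top W h,
$$

where $b$ and $c$ are the bias vectors of the visible and latent units and $W$ is the weight matrix connecting them, and the probability distribution it models is

$$
P(v, h) = \frac{1}{Z} \exp\left(-E(v, h)\right), \qquad Z = \sum_{v, h} \exp\left(-E(v, h)\right).
$$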
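As a sanity check on these definitions, here is a minimal sketch in plain Python that evaluates the energy and normalizes it by brute force. It assumes the standard binary RBM energy $E(v, h) = -b^\top v - c^\top h - v^\top W h$ with units taking values in $\{0, 1\}$; the function names and the toy parameters are hypothetical and chosen only for illustration.

```python
import itertools
import math


def energy(v, h, W, b, c):
    """Assumed binary-RBM energy: E(v, h) = -b.v - c.h - v^T W h."""
    visible_term = sum(b_i * v_i for b_i, v_i in zip(b, v))
    hidden_term = sum(c_j * h_j for c_j, h_j in zip(c, h))
    interaction = sum(v[i] * W[i][j] * h[j]
                      for i in range(len(v)) for j in range(len(h)))
    return -(visible_term + hidden_term + interaction)


def all_states(n):
    """Every binary vector of length n, as tuples of 0s and 1s."""
    return itertools.product([0, 1], repeat=n)


def partition_function(W, b, c):
    """Brute-force Z: sum exp(-E) over all 2^(n_v + n_h) joint states."""
    return sum(math.exp(-energy(v, h, W, b, c))
               for v in all_states(len(b))
               for h in all_states(len(c)))


def joint_probability(v, h, W, b, c, Z):
    """P(v, h) = exp(-E(v, h)) / Z."""
    return math.exp(-energy(v, h, W, b, c)) / Z


# Hypothetical toy parameters: 3 visible units, 2 hidden units.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
b = [0.1, -0.1, 0.2]
c = [0.0, 0.3]

Z = partition_function(W, b, c)
total = sum(joint_probability(v, h, W, b, c, Z)
            for v in all_states(3) for h in all_states(2))
print(f"Z = {Z:.4f}, sum of P(v, h) = {total:.6f}")
```

The enumeration visits $2^{n_v + n_h}$ states, so it only works for a handful of units. That exponential cost is the reason $Z$ makes the joint intractable for models of realistic size.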

Conditional Distributions

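Although the joint is intractable due to the partition term $Z$, the conditionals can be found. First,

$$
\begin{aligned}
P(h \mid v) &= \frac{P(v, h)}{P(v)} = \frac{1}{P(v)} \frac{1}{Z} \exp\left(b^\top v + c^\top h + v^\top W h\right) \\
&= \frac{1}{Z'} \exp\left(c^\top h + v^\top W h\right) = \frac{1}{Z'} \prod_j \exp\left(c_j h_j + v^\top W_{:,j} h_j\right),
\end{aligned}
$$

where $Z'$ collects every factor that does not depend on $h$.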

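Note that this factors into a product of terms, one for each hidden unit $h_j$, so for a single $h_j$ we have the probability

$$
P(h_j = 1 \mid v) = \frac{\exp\left(c_j + v^\top W_{:,j}\right)}{\exp(0) + \exp\left(c_j + v^\top W_{:,j}\right)} = \sigma\left(c_j + v^\top W_{:,j}\right),
$$

where $W_{:,j}$ is the $j$th column of $W$, $c_j$ is the bias of $h_j$, and $\sigma(x) = 1 / (1 + e^{-x})$ is the logistic sigmoid.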

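Thus, the full conditional is

$$
P(h \mid v) = \prod_j \sigma\left((2h_j - 1)\left(c_j + v^\top W_{:,j}\right)\right),
$$

where we use $2h_j - 1$ to evaluate to $1$ or $-1$, depending on whether $h_j$ is $1$ or $0$. Since $\sigma(-x) = 1 - \sigma(x)$, each factor is the probability of the value that $h_j$ actually takes.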
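The factorized conditional can be checked numerically. The sketch below assumes binary units and the standard RBM energy $E(v, h) = -b^\top v - c^\top h - v^\top W h$, and compares the product-of-sigmoids form against a direct normalization of $\exp(-E(v, h'))$ over every hidden configuration $h'$. All function names and parameter values are hypothetical.

```python
import itertools
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pre_activation(v, W, c, j):
    """c_j + v . W[:, j], the total input to hidden unit j."""
    return c[j] + sum(v[i] * W[i][j] for i in range(len(v)))


def p_h_given_v(h, v, W, c):
    """Factorized form: product over j of sigmoid((2 h_j - 1) * input_j)."""
    prob = 1.0
    for j, h_j in enumerate(h):
        prob *= sigmoid((2 * h_j - 1) * pre_activation(v, W, c, j))
    return prob


def p_h_given_v_brute_force(h, v, W, b, c):
    """Same conditional, by normalizing exp(-E(v, h')) over all h'."""
    def unnormalized(h_prime):
        neg_energy = (sum(b_i * v_i for b_i, v_i in zip(b, v))
                      + sum(c_j * h_j for c_j, h_j in zip(c, h_prime))
                      + sum(v[i] * W[i][j] * h_prime[j]
                            for i in range(len(v))
                            for j in range(len(h_prime))))
        return math.exp(neg_energy)

    normalizer = sum(unnormalized(h_prime)
                     for h_prime in itertools.product([0, 1], repeat=len(c)))
    return unnormalized(h) / normalizer


# Hypothetical toy parameters: 3 visible units, 2 hidden units.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
b = [0.1, -0.1, 0.2]
c = [0.0, 0.3]

v = (1, 0, 1)
for h in itertools.product([0, 1], repeat=2):
    factorized = p_h_given_v(h, v, W, c)
    direct = p_h_given_v_brute_force(h, v, W, b, c)
    print(h, round(factorized, 6), round(direct, 6))
```

The two columns should agree up to floating-point error. The visible bias $b$ appears only in the brute-force version, where it cancels in the normalization, which matches its absence from the closed form.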

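A similar derivation also gives us the reverse,

$$
P(v \mid h) = \prod_i \sigma\left((2v_i - 1)\left(b_i + W_{i,:} h\right)\right),
$$

where $W_{i,:}$ is the $i$th row of $W$ and $b_i$ is the bias of visible unit $v_i$.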

Since we have the conditionals, with each visible unit conditionally independent of the others given the hidden units (and likewise for the hidden units given the visible ones), we can sample each layer in a single step and train the model using 🖖 Contrastive Divergence.
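To make the role of the conditionals concrete, here is a rough sketch of one common variant, CD-1, which runs a single block Gibbs step per update. It is an illustration under assumptions rather than the procedure from the linked note: it assumes binary units, $P(h_j = 1 \mid v) = \sigma(c_j + v^\top W_{:,j})$ and $P(v_i = 1 \mid h) = \sigma(b_i + W_{i,:} h)$, and it uses hidden probabilities instead of samples in the update statistics. The names `sample_hidden`, `sample_visible`, and `cd1_update`, the toy data, and the hyperparameters are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, c, rng):
    """Return P(h_j = 1 | v) for every j and one binary sample of h."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, b, rng):
    """Return P(v_i = 1 | h) for every i and one binary sample of v."""
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def cd1_update(v0, W, b, c, lr, rng):
    """One CD-1 step on a single example; updates W, b, c in place."""
    ph0, h0 = sample_hidden(v0, W, c, rng)   # positive phase, driven by data
    _, v1 = sample_visible(h0, W, b, rng)    # one Gibbs step: reconstruction
    ph1, _ = sample_hidden(v1, W, c, rng)    # negative phase
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])


# Hypothetical toy setup: 4 visible units, 2 hidden units, two patterns.
rng = random.Random(0)
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
W = [[rng.gauss(0.0, 0.01) for _ in range(2)] for _ in range(4)]
b = [0.0] * 4
c = [0.0] * 2

for epoch in range(200):
    for v in data:
        cd1_update(v, W, b, c, lr=0.1, rng=rng)

for v in data:
    hidden_probs, _ = sample_hidden(v, W, c, rng)
    print(v, [round(p, 3) for p in hidden_probs])
```

Because each layer factorizes given the other, every call to `sample_hidden` or `sample_visible` draws a whole layer at once, with no inner loop of dependent draws within the layer. That is what makes the alternating Gibbs step cheap.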