Warning
Sorry, this note is under construction. Feel free to take a look at what Iโve got so far, and please come back later!
Restricted Boltzmann Machines (RBMs) are latent ๐ค Boltzmann Machines with slightly different weight connections. Rather than connecting between all possible pairs of unit, we only connect between visible units
Note
This structure, much like a ๐ธ๏ธ Multilayer Perceptron, can be stacked on top of each other with additional latent unit layers to model increasingly complex probability distributions. This gives us Deep Boltzmann Machine (DBM), and with a slight modification, the ๐ Deep Belief Network.
Our energy function is defined as
and the probability distribution it models is
Conditional Distributions
Although the joint is intractable due to the partition term
Note that this can be factored into a product of probabilities, one for each hidden unit
where
This, the full conditional is
where we use
A similar derivation also gives us the reverse,
Since we have the conditionals, with each visible unit independent of the rest (and same for hidden units), we can optimize our distribution using ๐ Contrastive Divergence.