EM (expectation-maximization) is a general algorithm for training latent variable models.
For each iteration, we first estimate the hidden variables given the model, then estimate the model given the hidden variables. The formal procedure is as follows.
Given training data $\{x^{(1)}, \dots, x^{(n)}\}$ with hidden variables $z^{(i)}$ and model parameters $\theta$:
- E-step estimates $Q_i(z^{(i)}) = p(z^{(i)} \mid x^{(i)}; \theta)$, the probability of the hidden variables taking a specific value, to find the expected value of each hidden variable $z^{(i)}$.
- M-step estimates $\theta$ using the probabilities of $z^{(i)}$ calculated above and similarly finds the MLE or MAP of the parameters (a concrete sketch follows this list).
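
To make the two steps concrete, here is a minimal sketch of EM for a one-dimensional mixture of two Gaussians. The mixture model, variable names, initialization, and convergence check are illustrative assumptions, not something fixed by the notes above.

```python
# Minimal EM sketch for a 1-D mixture of two Gaussians (illustrative assumptions).
import numpy as np

def em_gmm(x, n_iter=100, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    # Parameters theta = (weights, means, std devs) for two components.
    w = np.array([0.5, 0.5])
    mu = rng.choice(x, size=2, replace=False)
    sigma = np.array([x.std(), x.std()])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: Q_i(z) = p(z | x_i; theta), the posterior probability
        # ("responsibility") of each component z for each point x_i.
        dens = np.stack(
            [w[k] * np.exp(-0.5 * ((x - mu[k]) / sigma[k]) ** 2)
                  / (sigma[k] * np.sqrt(2 * np.pi)) for k in range(2)],
            axis=1,
        )  # shape (n, 2)
        q = dens / dens.sum(axis=1, keepdims=True)

        # M-step: maximize the expected complete-data log-likelihood under q,
        # which for a Gaussian mixture gives these closed-form MLE updates.
        nk = q.sum(axis=0)
        w = nk / len(x)
        mu = (q * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((q * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

        # The marginal log-likelihood never decreases across EM iterations.
        ll = np.log(dens.sum(axis=1)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return w, mu, sigma

# Usage: recover the two components from synthetic data.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])
print(em_gmm(data))
```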
At its core, EM is an instance of variational inference using the 🧬 Evidence Lower Bound (ELBO):
- E-step estimates $q(z)$, an approximation of the actual posterior $p(z \mid x; \theta)$, by setting $q(z) = p(z \mid x; \theta)$. This makes the ELBO tight to the true log-likelihood.
- M-step maximizes $\mathrm{ELBO}(q, \theta)$ with respect to $\theta$, holding $q$ fixed (see the decomposition below).
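
A short way to see why this works is the standard decomposition of the log-likelihood (written here in the same notation as above):

$$
\log p(x; \theta) = \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p(x, z; \theta)}{q(z)}\right]}_{\mathrm{ELBO}(q,\, \theta)} + \mathrm{KL}\!\left(q(z) \,\|\, p(z \mid x; \theta)\right).
$$

Since the KL term is non-negative, the ELBO lower-bounds $\log p(x; \theta)$. The E-step drives the KL term to zero by setting $q(z) = p(z \mid x; \theta)$, making the bound tight, and the M-step then increases the ELBO, and hence the log-likelihood, in $\theta$.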