Note on EM algorithm and likelihood lower bound
Motivation How to understand EM algorithm from a theoretical perspective? This post tries to understand EM as a form of alternative ascent of a lower bound of likelihood.
The Key Trick of EM The key trick we need to remember is the usage of Jensen Inequality on logarithm. So we could swap Expectation and logarithm and obtain a lower bound on likelihood. Generally, we have such inequality, given a positive function $q(z)$ that sums to $1$ (probability density),
May 13, 2022