In recent weeks I have been studying the classical Latent Dirichlet Allocation (LDA) algorithm by David Blei and colleagues (2003), and the LDA variant based on Gibbs sampling introduced by Tom Griffiths. What are the main differences between the two methods?

Variational inference and Gibbs sampling are quite different approaches to posterior inference. Variational inference is an optimization method (usually coordinate ascent), while Gibbs sampling is based on statistical simulation (Markov chain Monte Carlo). In variational inference, we define a surrogate posterior distribution with a much simpler form (e.g. fully factorized) than the true posterior, then optimize the surrogate by minimizing the KL divergence between it and the true posterior. Unfortunately, if the model is nonconjugate, it is difficult to obtain closed-form update rules. In practice, each iteration of variational inference takes much longer than an iteration of Gibbs sampling, but variational inference needs far fewer iterations in total.

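As a toy illustration of what those coordinate-ascent updates look like (this is not LDA itself, and the numbers below are made up), here is mean-field variational inference for a 2-D Gaussian posterior, a case where minimizing the KL divergence gives closed-form updates (each factor's mean depends on the other factor's current mean):

```python
import numpy as np

# Toy sketch: approximate a correlated 2-D Gaussian posterior
# p(z) = N(mu, Lambda^{-1}) with a fully factorized q(z) = q1(z1) q2(z2),
# using coordinate-ascent updates that minimize KL(q || p).
# For this model each q_i is Gaussian with variance 1/Lambda_ii and mean
#   m_i = mu_i - (Lambda_ij / Lambda_ii) * (m_j - mu_j).

mu = np.array([1.0, -1.0])       # true posterior mean (illustrative values)
Lam = np.array([[2.0, 0.8],      # true posterior precision matrix
                [0.8, 2.0]])

m = np.zeros(2)                  # variational means, initialized at zero
for _ in range(50):              # coordinate ascent sweeps
    m[0] = mu[0] - Lam[0, 1] * (m[1] - mu[1]) / Lam[0, 0]
    m[1] = mu[1] - Lam[1, 0] * (m[0] - mu[0]) / Lam[1, 1]

print(m)                         # converges to the true mean [1, -1]
```

Note that each sweep is cheap here only because the model is conjugate; the point of the passage above is that without conjugacy these closed-form updates are not available.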
OK, so the main difference is that the first paper uses variational inference and the second uses Gibbs sampling. What does this mean? Both refer to the inference process (the optimization or sampling process) used to actually find the values of the parameters. The common wisdom is that variational inference is faster and gives you usable results sooner, while Gibbs sampling takes longer to produce sensible results. If well implemented, the two should give similar results. On the implementation side, variational inference methods are usually easier to debug. To be specific, the original paper uses a mean-field variational approximation. The other paper uses a collapsed Gibbs sampler, in which the topic and word distributions are integrated out and only the per-token topic assignments are sampled. This latter implementation is actually quite fast.
(Nov 15 '13 at 23:10)
zaxtax ♦
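A minimal sketch of such a collapsed Gibbs sampler for LDA (the tiny corpus, the number of topics `K`, and the symmetric priors `alpha`/`beta` below are illustrative choices, not values from either paper):

```python
import numpy as np

# Collapsed Gibbs sampling for LDA: theta and phi are integrated out,
# and we resample only each token's topic assignment z from its full
# conditional, which depends on a few count tables.
rng = np.random.default_rng(0)

docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 2, 4, 5, 3]]  # word ids
V, K = 6, 2                      # vocabulary size, number of topics
alpha, beta = 0.1, 0.01          # symmetric Dirichlet hyperparameters

ndk = np.zeros((len(docs), K))   # doc-topic counts
nkw = np.zeros((K, V))           # topic-word counts
nk = np.zeros(K)                 # per-topic token totals
z = []                           # random initial assignments
for d, doc in enumerate(docs):
    zd = rng.integers(K, size=len(doc))
    z.append(zd)
    for w, k in zip(doc, zd):
        ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

for _ in range(200):             # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]          # remove current assignment from counts
            ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
            # full conditional p(z_i = k | z_-i, w), up to a constant
            p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
            k = rng.choice(K, p=p / p.sum())
            z[d][i] = k          # add the new assignment back
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

# point estimate of the topic-word distributions from the counts
phi = (nkw + beta) / (nk[:, None] + V * beta)
print(phi.round(2))
```

Each sweep only touches small count arrays, which is why this sampler is fast per iteration even though it may need many sweeps to mix.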