Recently I read the paper "On Smoothing and Inference for Topic Models". In it, the authors use perplexity to assess different parameter estimation algorithms, e.g. CGS, CVB, ML, MAP and VB. Is perplexity the standard, or the only, criterion for such comparisons? Why not speed of convergence, time complexity, etc.?

asked Aug 25 '11 at 10:20


yinwenpeng


One Answer:

Perplexity measures how well the model learned by LDA predicts future data drawn from the same distribution as the training data; lower is better. In doing so, it captures an interesting characteristic of an inference algorithm: with the model held fixed, the algorithm that learns the best result will achieve lower perplexity than the others.
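To make the definition concrete, here is a minimal sketch of the usual formula: perplexity is the exponential of the negative average log-likelihood per held-out token. The function name and the toy probabilities are my own assumptions for illustration. How you obtain per-token probabilities for held-out documents under LDA is a separate question and is not shown here.

```python
import math

def perplexity(token_log_probs):
    """Perplexity from natural-log probabilities of each held-out token.

    Assumes the per-token log-probabilities have already been computed
    under the fitted model (hypothetical input, not produced here).
    """
    if not token_log_probs:
        raise ValueError("need at least one held-out token")
    avg_neg_log_lik = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_neg_log_lik)

# Toy example: made-up probabilities two models assign to the same four tokens.
model_a = [math.log(p) for p in (0.25, 0.25, 0.25, 0.25)]
model_b = [math.log(p) for p in (0.5, 0.5, 0.5, 0.5)]

# The model that puts more mass on the observed tokens has lower perplexity.
print(perplexity(model_a), perplexity(model_b))
```

A model that assigns probability 1/k to every held-out token has perplexity k, so you can read the number as an effective vocabulary size the model is "choosing among" per token.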

However, perplexity is known to be a bad measure for comparing different topic models if what you want is interpretability by humans. See "Reading Tea Leaves" by Chang et al. for a discussion of this based on user experiments, and see "Optimizing semantic coherence in topic models" by Mimno et al. for a discussion of another metric that is shown to correlate better with human judgments of quality. Perplexity is also not the ideal metric if what you want is to look at the quality of the learned topics (as in, how well they fit the data), since this can be done more elegantly, and in a way that is easier to interpret, with Bayesian posterior checking, as in "Bayesian checking for topic models" by Mimno and Blei. Perplexity also completely ignores computation time and the number of iterations until convergence, which is why the paper you cited provides detailed plots of convergence per iteration and discusses the expected cost per iteration for the different methods.
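For a rough idea of what a coherence-style metric looks like, here is a small sketch of a document co-occurrence score in the spirit of the one in Mimno et al., as I remember it: for each pair of a topic's top words, add log((D(w_m, w_l) + 1) / D(w_l)), where D counts the documents containing the given words. Treat the exact form, the function name, and the toy corpus as assumptions and check the paper before relying on it.

```python
import math

def doc_cooccurrence_coherence(top_words, documents):
    """Sum of log((D(w_m, w_l) + 1) / D(w_l)) over ordered pairs of top words.

    top_words: a topic's most probable words, most probable first.
    documents: list of tokenized documents (lists of strings).
    Pairs whose conditioning word never occurs are skipped (an assumption).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                continue
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / denom)
    return score

# Toy corpus: two "pet" documents and two "finance" documents.
docs = [["cat", "dog"], ["cat", "dog"], ["stock", "market"], ["stock", "market"]]
coherent = doc_cooccurrence_coherence(["cat", "dog"], docs)
mixed = doc_cooccurrence_coherence(["cat", "stock"], docs)
```

Words that tend to appear in the same documents score higher than words that never co-occur, which is the intuition behind preferring such a score when human interpretability is the goal.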

So, to summarize, perplexity is usually the first or second metric used to judge the quality of a statistical model (other popular choices being test-set likelihood or even the marginal probability of the data given the model), but it is too coarse, which is why the topic modelling community has recently been moving towards more accurate metrics. Even though these more accurate metrics carry a lot more weight and show you all sorts of interesting information, bear in mind that test-set perplexity is probably correlated with all of them.

answered Aug 25 '11 at 13:02


Alexandre Passos ♦


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.