I'm studying LDA and have little background in Bayesian inference. Could you explain exactly how a topic is assigned to a word in a corpus? Is it obtained as argmax over z of P(z|w), where z is a topic and w is a word? I didn't find an explanation in any of the papers. Perhaps this is too trivial.

asked Oct 16 '10 at 08:04

Killua

One Answer:

It depends fundamentally on the inference method you're using.

  • For variational inference (as described in the original Blei, Ng, and Jordan paper), each word in the corpus has a pseudo-distribution over topics, given by the mean-field parameters for that word. So technically you never get a single topic assignment for a word, but you can use those parameters to compute the most likely topic for that word and similar summaries.
  • If you're using Gibbs sampling, as per the Griffiths and Steyvers paper (or any more recent, faster formulation, such as the implementation in MALLET), then at every iteration you have a topic assignment for every word in the corpus. You can then extract "the topic" for a word in a few different ways:
      • just use the last sample's topics
      • count, over all samples, the most frequent topic for that word
      • use the empirical distribution over topics for that word (computed from all past samples) to get a finer-grained representation, for example distinguishing words whose mass sits mostly in a single topic from words whose mass is spread over several topics
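
To make the three Gibbs options concrete, here is a minimal sketch in plain Python. It assumes you have already saved a handful of samples from your sampler; the `samples` array, its values, and the function names are made up for illustration and do not correspond to the output format of MALLET or any other tool.

```python
from collections import Counter
import math

# Hypothetical Gibbs output: samples[s][i] is the topic assigned to token i
# in saved sample s (3 topics, 4 tokens, 5 saved samples).
samples = [
    [0, 2, 1, 0],
    [0, 2, 1, 1],
    [0, 1, 1, 2],
    [0, 2, 1, 0],
    [0, 2, 0, 0],
]
num_topics = 3


def last_sample_topics(samples):
    """Option 1: take the assignments from the final saved sample."""
    return list(samples[-1])


def most_frequent_topics(samples):
    """Option 2: per token, the topic that occurs most often across samples."""
    per_token = zip(*samples)  # transpose: one tuple of topics per token
    return [Counter(topics).most_common(1)[0][0] for topics in per_token]


def empirical_distributions(samples, num_topics):
    """Option 3: per token, the fraction of samples assigning each topic."""
    n = len(samples)
    dists = []
    for topics in zip(*samples):
        counts = Counter(topics)
        dists.append([counts[k] / n for k in range(num_topics)])
    return dists


def entropy(dist):
    """Shannon entropy in nats; 0 means all mass sits on one topic."""
    return -sum(p * math.log(p) for p in dist if p > 0)


dists = empirical_distributions(samples, num_topics)
for i, dist in enumerate(dists):
    print(i, dist, round(entropy(dist), 3))
```

In this toy data the last sample and the majority vote disagree on the third token, which is the usual argument for not relying on a single sample. The entropy column is one possible way to separate words concentrated in one topic from words spread across several; other spread measures would work as well.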

Most deterministic inference methods (collapsed variational, EM, etc.) behave like the variational case above, and most stochastic ones (non-collapsed Gibbs sampling, sampling for HDP-LDA, etc.) behave like the Gibbs sampling case.
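
For the deterministic family, where the output is a per-word distribution rather than a sample, picking "the topic" comes down to an argmax over that distribution, which is close to the argmax P(z|w) idea in the question. A minimal sketch, assuming the mean-field parameters are available as a list of rows; the name `phi` and the numbers are hypothetical, not taken from any particular implementation:

```python
# Hypothetical mean-field parameters: phi[i][k] approximates the posterior
# probability that token i was generated by topic k. Each row sums to 1.
phi = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.10, 0.80],
]


def most_likely_topics(phi):
    """Return, for each token, the index of the largest entry in its phi row."""
    return [max(range(len(row)), key=row.__getitem__) for row in phi]


def confidence(phi):
    """Return the largest phi entry per token, i.e. how peaked the row is."""
    return [max(row) for row in phi]


print(most_likely_topics(phi))
print(confidence(phi))
```

The second token shows why the argmax alone can mislead: its top topic holds only 0.40 of the mass, so keeping the full row is often more informative than the single label.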

answered Oct 16 '10 at 08:24

Alexandre Passos ♦

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.