|
Hello, I feel like I understand Gibbs Sampling as it relates to topic models such as LDA quite well -- as far as learning the initial model goes. We maintain a set of counts, and for each word we use them to build a distribution p(z_i=k | ...); on each iteration we sample from that distribution to get a new value for the current word's topic assignment. I.e. let's say there are K=3 topics and we have p(z_i=1) = 0.3, p(z_i=2) = 0.5, p(z_i=3) = 0.2; we randomly sample from this distribution and this time happen to get z_i = 1, so we set that word's topic to 1. Over time things average out, and at the end we can compute our distributions theta and phi along the lines of phi_{w,k} = p(w="word" | k=1) = (n_{w,k} + beta) / (n_{k} + beta*W). So far so good. Now, however, I have two questions about performing inference on a new, unseen document given the model that we've just learned.
Thanks so much and I hope I was clear with my questions. |
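For reference, the two training-side pieces described above (drawing a topic from the per-word distribution, and turning the final counts into phi) can be sketched roughly as follows. This is only an illustration: the names `sample_topic` and `estimate_phi` and the layout of the count table `n_wk` (rows are words, columns are topics) are assumptions made for the example, and topics are 0-based here.

```python
import random


def sample_topic(probs, rng):
    """Draw a 0-based topic index from a discrete distribution over topics."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def estimate_phi(n_wk, beta):
    """Compute phi[w][k] = (n_wk[w][k] + beta) / (n_k[k] + beta * W).

    n_wk[w][k] is the number of times word w is assigned to topic k
    (hypothetical layout), and W is the vocabulary size.
    """
    W = len(n_wk)
    K = len(n_wk[0])
    # n_k[k]: total number of word tokens assigned to topic k
    n_k = [sum(n_wk[w][k] for w in range(W)) for k in range(K)]
    return [
        [(n_wk[w][k] + beta) / (n_k[k] + beta * W) for k in range(K)]
        for w in range(W)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # The K=3 example from the question: p(z_i) = (0.3, 0.5, 0.2)
    print(sample_topic([0.3, 0.5, 0.2], rng))
    print(estimate_phi([[2, 0], [1, 3]], beta=0.5))
```

For each topic k, the values phi[w][k] sum to 1 over the vocabulary, which is a quick sanity check on the count tables.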
|
For 1, just don't increment/decrement the phi counts; instead, treat those as fixed probabilities. For 2, there is not necessarily a single topic that each word is assigned to in the posterior distribution; most often the posterior over z for each word has almost all of its mass on a small number of topics, but rarely on just one. Hence, there is no single true topic for each word. Most people, however, if they need such a thing, just take the topic assigned to that word in the last model sample.

Original poster should also see your more detailed answer to essentially the same question as #2, given here: http://metaoptimize.com/qa/questions/2960/how-can-i-get-topic-assignment-for-each-word-in-lda#2961 I think the last 2 approaches from that answer (an averaging approach) are much more sound than simply using the last topic assignment.
(Mar 29 '11 at 23:06)
Will Darling
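
A minimal sketch of the advice above, for anyone who wants something concrete: phi is held fixed, only the new document's assignments are resampled, and per-word topic frequencies are averaged over the samples kept after burn-in. The function name `infer_document`, the parameters `alpha`, `n_iters`, and `burn_in`, and the `phi[w][k]` layout are all assumptions for this example. The averaging shown is one plausible reading of the "averaging approach" mentioned in the comment and has not been checked against the linked answer.

```python
import random


def infer_document(doc, phi, alpha, n_iters=200, burn_in=50, seed=0):
    """Gibbs-sample topic assignments for one unseen document with phi fixed.

    doc: list of word ids; phi[w][k]: learned p(word w | topic k), never updated.
    Returns (theta, word_post), where word_post[i][k] is the fraction of kept
    samples in which word i was assigned to topic k.
    """
    rng = random.Random(seed)
    K = len(phi[0])
    z = [rng.randrange(K) for _ in doc]      # random initial assignments
    n_dk = [0] * K                           # document-topic counts
    for k in z:
        n_dk[k] += 1
    z_counts = [[0] * K for _ in doc]
    kept = 0
    for it in range(n_iters):
        for i, w in enumerate(doc):
            n_dk[z[i]] -= 1                  # remove the current assignment
            # Only the document-side counts change; phi acts as a fixed probability.
            weights = [phi[w][k] * (n_dk[k] + alpha) for k in range(K)]
            z[i] = rng.choices(range(K), weights=weights, k=1)[0]
            n_dk[z[i]] += 1
        if it >= burn_in:                    # accumulate samples after burn-in
            kept += 1
            for i, k in enumerate(z):
                z_counts[i][k] += 1
    word_post = [[c / kept for c in row] for row in z_counts]
    # theta from the averaged document-topic counts, smoothed by alpha
    theta = [sum(row[k] for row in word_post) + alpha for k in range(K)]
    total = sum(theta)
    return [t / total for t in theta], word_post
```

Taking the most frequent topic in each row of `word_post` gives a single label per word if one is needed, while keeping the whole row preserves the fact that the posterior is usually spread over a few topics.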
|