In Gregor Heinrich's report "Parameter estimation for text analysis", when Gibbs sampling is used, the topic z is sampled from a multinomial distribution, i.e., the full conditional p(z_i = k | z_{-i}, w). As far as I know, p(w|z) and p(z|d) are multinomial distributions. Why does the full conditional also follow a multinomial distribution? I would much appreciate it if someone could explain.

asked Jul 04 '13 at 22:27

leeshenli


One Answer:

z is a discrete variable that takes only a finite, usually small, number of values, so sampling from it is easy: evaluate the unnormalized score of each possible value, normalize, and sample from the normalized distribution. This has nothing to do with whether p(w|z) is a multinomial or not; it could be a Gaussian, for example, and you'd still sample z the same way.
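The score-normalize-sample recipe can be sketched in a few lines of Python. This is only a minimal illustration, not Heinrich's sampler: the helper names (`normalize`, `sample_index`, `gaussian_pdf`) and the toy prior, means, and observation are all made up for the example. The likelihood is deliberately Gaussian to show that z is still drawn from a discrete distribution.

```python
import math
import random


def normalize(scores):
    """Turn non-negative unnormalized scores into probabilities."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return [s / total for s in scores]


def sample_index(probs, rng):
    """Draw index k with probability probs[k] by inverting the CDF."""
    u = rng.random()
    cumulative = 0.0
    for k, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return k
    return len(probs) - 1  # guard against floating-point round-off


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical toy model: 3 values of z, Gaussian p(x | z), discrete prior p(z).
prior = [0.5, 0.3, 0.2]
means = [-2.0, 0.0, 3.0]
x = 0.4

# Unnormalized score of each value of z: p(z = k) * p(x | z = k).
scores = [p * gaussian_pdf(x, m, 1.0) for p, m in zip(prior, means)]
probs = normalize(scores)

rng = random.Random(0)
z = sample_index(probs, rng)
```

In a Gibbs sampler for a topic model, the scores would instead come from the count-based full conditional, but the normalize-and-draw step would stay the same.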

Continuous variables are harder because in that case it is often not possible to compute the normalizing constant analytically (whereas for z it is easy, since normalizing a discrete vector is just a finite sum), and hence you need to invoke conjugacy to argue that sampling is efficient.
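Written out, the contrast looks roughly like this. The symbols are assumptions made for illustration: K is the number of values z can take, \tilde{p} is the unnormalized score, and \theta stands for some continuous variable in the model.

```latex
% Discrete z: the normalizer is a finite sum over K values.
p(z = k \mid \text{rest}) = \frac{\tilde{p}(k)}{\sum_{k'=1}^{K} \tilde{p}(k')}

% Continuous \theta: the normalizer is an integral, often without a closed form.
p(\theta \mid \text{rest}) = \frac{\tilde{p}(\theta)}{\int \tilde{p}(\theta')\, d\theta'}
```

The sum costs only K evaluations, while the integral is available in closed form only in special cases such as conjugate prior-likelihood pairs.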

answered Jul 05 '13 at 07:29

Alexandre Passos ♦

Alexandre, thank you so much.

(Jul 05 '13 at 09:52) leeshenli

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.