In these notes on Markov Chain Monte Carlo Sampling for Dirichlet Process Mixture Models, page 4, about the middle of the page, they mention that a clustering structure of data phi_1:N is represented by:

p(phi_1, ..., phi_N) = p(phi_1) p(phi_2 | phi_1) ... p(phi_N | phi_1:N-1)

I'm kind of missing how this represents a clustering structure.

Thanks

asked Apr 26 '11 at 07:24


Leon Palafox

edited Apr 26 '11 at 07:49


ogrisel


One Answer:

This just says that the phi variables have a joint distribution, written out with the chain rule, so that the distribution of each phi is conditioned on the values of all the phis that came before. Any joint distribution can be factored this way, so, as you noticed, the factorization by itself does not force a clustering structure on the phis (they could even be independent). However, if you assume the functional form for the conditional presented right above it (which says that phi_i given phi_1:i-1 is either a fresh draw from g0 or equal to one of the previously drawn phis), then you do get a clustering structure: phis that share the same value belong to the same cluster. The equation is just saying that all the phi variables obey that structure.
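
To make the "fresh draw or copy of an earlier value" idea concrete, here is a minimal sketch in Python. It is not taken from the notes: the function name polya_urn_sample, the choice of a standard normal for g0, and the concentration value alpha = 1.0 are all assumptions for illustration. The mixing probability alpha / (alpha + i - 1) is the usual Blackwell-MacQueen urn form, which I am assuming matches the conditional in the notes.

```python
import random


def polya_urn_sample(n, alpha, base_draw, rng):
    """Draw phi_1..phi_n one at a time.

    Each new phi is either a fresh draw from the base distribution g0
    (with probability alpha / (alpha + number of earlier draws)) or a
    copy of one of the earlier phis chosen uniformly at random.
    Assumes alpha > 0. All names here are hypothetical.
    """
    phis = []
    for i in range(n):
        # i values have been drawn so far; for i == 0 this probability is 1,
        # so the first phi always comes from g0.
        if rng.random() < alpha / (alpha + i):
            phis.append(base_draw(rng))
        else:
            phis.append(rng.choice(phis))
    return phis


rng = random.Random(0)
# Assumption: g0 is a standard normal, purely for illustration.
phis = polya_urn_sample(50, alpha=1.0, base_draw=lambda r: r.gauss(0.0, 1.0), rng=rng)

# Group indices by shared value: each distinct value is one cluster.
clusters = {}
for idx, phi in enumerate(phis):
    clusters.setdefault(phi, []).append(idx)

print(len(phis), "draws,", len(clusters), "distinct values")
```

Because copies repeat earlier values exactly, many of the phis coincide, and the groups of equal values are the clusters. If every phi were drawn independently from a continuous g0 instead, all values would be distinct and no clustering would appear, even though the same chain-rule factorization would still hold.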

answered Apr 26 '11 at 07:54


Alexandre Passos ♦

Ahh ok, I did understand that part, I just was a bit puzzled, since in the text it seems as if that distribution is enough to ensure a clustering structure. Thanks

(Apr 26 '11 at 07:58) Leon Palafox

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.