|
Hello everyone, In the book Probabilistic Graphical Models by Koller and Friedman, page 513, they present a way to obtain samples from a Markov network using the Markov blanket and Gibbs sampling over it. They define the different steps of the sampling as kernels. My question is more pragmatic: in Eq. 12.23 they define the conditional probability P(x'|x_-i) = P(x',x_-i) / [sum over x'' of P(x'',x_-i)]. Why do they define the marginal probability (the denominator) over x'', and what is x''? Is it the initial value (before starting the sampling iteration)? Thanks, Leon |
|
With input from Oscar Täckström. The denominator is there to make sure the distribution sums to one. Since it is exactly the same for all x', you shouldn't recompute it for each x' as written above; instead, first compute P(x',x_-i) for all x' and then normalize (or, more likely, compute log P(x',x_-i) and then exp-normalize). Normalizing effectively divides everything by the sum over x'' that you mentioned.
Alexandre, I'm having some issues rendering MO in Chrome; it shows up as plain text, without images or formatting.
(Oct 17 '11 at 12:11)
Leon Palafox
@Leon: Me too. It's happened before, but then the problem went away after a couple of hours.
(Oct 17 '11 at 14:26)
Oscar Täckström
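For anyone who wants to see the "compute log P(x',x_-i) for all x', then exp-normalize" advice from the answer in code, here is a minimal Python sketch. The names `gibbs_conditional` and `log_joint` are made up for illustration and are not from the book; `log_joint(v)` is assumed to return the log of the (possibly unnormalized) joint with x_i set to v and x_-i held fixed.

```python
import math
import random


def gibbs_conditional(log_joint, values):
    """Return P(x' | x_-i) for every candidate value x' in `values`.

    `log_joint` is a hypothetical callable: log_joint(v) gives
    log P(x_i = v, x_-i), possibly up to an additive constant.
    """
    logs = [log_joint(v) for v in values]
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(logs)
    weights = [math.exp(l - m) for l in logs]
    # This sum plays the role of the "sum over x''" in the denominator.
    total = sum(weights)
    return [w / total for w in weights]


def sample_value(log_joint, values, rng=random):
    """Draw one new value for x_i from its conditional distribution."""
    probs = gibbs_conditional(log_joint, values)
    return rng.choices(values, weights=probs, k=1)[0]


# Toy example: a binary variable whose joint scores are 2.0 and 6.0.
toy_scores = {0: math.log(2.0), 1: math.log(6.0)}
toy_probs = gibbs_conditional(lambda v: toy_scores[v], [0, 1])
```

In the toy example the two scores 2.0 and 6.0 normalize to roughly 0.25 and 0.75. The denominator is computed once and shared by all x', which is the point made in the answer.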
|
I don't have the book in front of me, but isn't the x'' simply used so that it's not confused with the x' in the numerator? x'' then would take all possible values of the variable.
Oscar, this is the right answer, so you should write it as an answer and Leon should accept it. Rephrasing, the denominator is there to make sure the distribution defined sums to one. As it is exactly the same for all x', you shouldn't compute it as above, but rather first compute P(x'|x_-i) for all x' and then normalize (or, more likely, compute log P(x'|rest) and then exp-normalize it).
Ok, it seemed odd since some books use the same notation for both the marginal and the unnormalized posterior
@Alexandre, you mean to first compute P(x',x_-i) for all x', right? Perhaps it's this confusion of P(x|y) vs P(x,y) that Leon is referring to? Your answer is better than my original one though, so why don't you put it in the box? :)
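To make the P(x|y) versus P(x,y) distinction concrete, the conditional from the question can be written out as below. The symbols \tilde{P} (the unnormalized product of factors) and Z (the partition function) are assumptions added here for illustration and do not appear in the thread. The joint goes in the numerator, x'' ranges over all values of the variable, and for a Markov network Z cancels, so only unnormalized scores are needed.

```latex
\begin{align*}
P(x' \mid x_{-i})
  &= \frac{P(x', x_{-i})}{\sum_{x''} P(x'', x_{-i})} \\
  &= \frac{\tilde{P}(x', x_{-i}) / Z}{\sum_{x''} \tilde{P}(x'', x_{-i}) / Z}
   = \frac{\tilde{P}(x', x_{-i})}{\sum_{x''} \tilde{P}(x'', x_{-i})}
\end{align*}
```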