More than a question, this is a positive reinforcement for this particular topic.

As far as I understand, Belief Propagation is a technique to do inference on Graphical Models.

Doing this inference, we are able to assign probability distributions to previously unseen nodes.

  1. If we have a fully observed data set (and its conditional probability tables), BP allows us only to get the marginal distribution?

  2. If our set is fully observable (with missing CPT), BP allows us to calculate those missing probability tables?

  3. And in the case we have hidden variables, do we just treat them as unseen nodes and apply BP, and then calculate the optimal values for those unseen values? (Kinda like an EM)

Any feedback or further explanation will be greatly appreciated.

Leon

asked Oct 18 '11 at 04:03

Leon%20Palafox's gravatar image

Leon Palafox
31265471107

edited Oct 19 '11 at 04:59


2 Answers:

Adding to sbos's answers:

  1. Yes, exactly. BP computes the marginal distribution for each node.
  2. Not exactly. This would be done by some form of parameter estimation (i.e., maximum likelihood, sampling, regularized maximum likelihood, maximum margin, etc). Most of these will require that you run BP to get the normalizing constant for the gradient (but maximum margin will only require that you run max-product, not sum-product)
  3. I'm not sure what you mean by CPD, but to compute the likelihood in the case of latent variables you have to run BP twice: first over the latent variables to compute the expected score of the observed configuration and then over all variables to get the normalization constant. EM is an approximation to this that maximizes a lower-bound on this likelihood which is often easier to compute.

Of course, to be correct you'd have to run actual belief propagation over junction trees, not loopy BP, in the cases above.

answered Oct 18 '11 at 22:10

Alexandre%20Passos's gravatar image

Alexandre Passos ♦
1899744214335

CPD is BP (don't know want went wrong between my mind and my hands) sorry for the confusion, I'll change it

(Oct 19 '11 at 04:58) Leon Palafox

At first, BP does inference only in models without cycles (i.e. trees, chains).

  1. I'm not sure I understand your all of your questions, but BP get us not only the marginals, but also optimal configuration of hidden variables.
  2. What is conditional probability tables? Do you mean all the $p(y|x)$ probabilities?
  3. I also misunderstand what's the question about. BP is very similar to Viterbi algorithm, except that Viterbi handles only sequences and BP is for more general cases, so it might be useful for you to examine it first.

answered Oct 18 '11 at 11:46

Sergey%20Bartunov's gravatar image

Sergey Bartunov
317711

By optimal configuration you mean by scoring the different configurations? Yes the CPT are the probabilities p(y|x), is the way they define them in Scholler and Friedman

(Oct 19 '11 at 05:01) Leon Palafox
Your answer
toggle preview

powered by OSQA

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.