Suppose we have a simple Bayesian network with one child and multiple parents, where each node is Bernoulli-distributed given its parents.

The network is defined by a complete CPT (conditional probability table), basically a binary table. Since it is complete, we have no unobserved values. (I'm assuming a small number of parents.)

How do we define the prior over the parents?

Is it fair to assume a maximum-likelihood prior, that is, P(Pa_i = 1) = Sum(I{Pa_i = 1}) / N, where I is the indicator function and N is the number of observations (in essence, the fraction of observations in which Pa_i was 1)?

Or, if we want to work in a fully Bayesian setting for prediction, should we disregard these priors and use conjugate priors over the Bernoulli parameters (symmetric Dirichlets of order 2)?
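
For reference, a sketch of the conjugate update this second option refers to. It assumes a symmetric Beta(alpha, alpha) prior (the order-2 symmetric Dirichlet) on theta_i = P(Pa_i = 1), with n_1 and n_0 the observed counts of 1s and 0s for that parent; these symbols are introduced here only for illustration.

```latex
\theta_i \sim \mathrm{Beta}(\alpha, \alpha), \qquad
\theta_i \mid \text{data} \sim \mathrm{Beta}(\alpha + n_1,\ \alpha + n_0), \qquad
\mathbb{E}[\theta_i \mid \text{data}] = \frac{\alpha + n_1}{2\alpha + n_1 + n_0}
```

With alpha = 1 this reduces to add-one smoothing of the empirical fraction.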

Thanks a lot

asked Aug 21 '11 at 00:19

Leon Palafox


One Answer:

What do you mean by priors here? Is it each entry of the CPT or a prior over all entries of the CPT?

Re 1: as a rule, you shouldn't set the prior to maximize the likelihood of the data, as this leads to all sorts of failure modes (such as zero probabilities in your case). Re 2: yes, you should use conjugate priors if you want ease of inference, but you don't need conjugate priors to be fully Bayesian (for example, you can always use slice sampling with a nonconjugate prior).
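
To make the zero-probability failure mode concrete, here is a minimal Python sketch. It compares the empirical-fraction estimate with the posterior mean under a Beta prior for one binary parent. The function names, the Beta(1, 1) default, and the toy observations are illustrative assumptions, not something taken from the question.

```python
def empirical_fraction(observations):
    """Fraction of 1s in the data (the maximum-likelihood estimate)."""
    return sum(observations) / len(observations)


def beta_posterior_mean(observations, alpha=1.0, beta=1.0):
    """Posterior mean of a Bernoulli parameter under a Beta(alpha, beta) prior.

    alpha and beta act as pseudo-counts for the values 1 and 0.
    """
    ones = sum(observations)
    zeros = len(observations) - ones
    return (alpha + ones) / (alpha + beta + ones + zeros)


# Hypothetical data: a parent that was never observed to take the value 1.
parent_obs = [0, 0, 0, 0, 0]

p_mle = empirical_fraction(parent_obs)    # exactly 0: the value 1 becomes impossible
p_bayes = beta_posterior_mean(parent_obs)  # (1 + 0) / (1 + 1 + 5) = 1/7

print(p_mle, p_bayes)
```

Under the empirical estimate, any configuration in which this parent equals 1 gets probability zero, while the Beta posterior mean keeps it small but positive. Prior knowledge about a parent could be encoded by choosing larger or asymmetric pseudo-counts.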

answered Aug 21 '11 at 08:59

Alexandre Passos ♦

I meant the priors over each of the parents, as in the model we have when we use naive Bayes: N parents for a single child. So even if I have prior information about the parents, I should not set the priors using that information?

(Aug 21 '11 at 11:48) Leon Palafox

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.