|
How to compare graphical models in terms of complexity? The question is actually not exactly "complexity" because I don't know whether it is a right term. Let me start from an example. You can two models, 1) p(x|z)p(y|z) 2) p(x|z)p(z|y) If z is hidden variable, x and y are two variables of interests. Besides empirical comparisons on real data, what can we say something about two models? For instance, we wish to compute E[x|y,z] and E[y|x,z], which one is easier to do so? Is there any literature for this kind of questions? |
|
I'm assuming you are talking about statistical complexity. Statistical complexity of the model tells you how many different observations can be explained by the model. A model that fits many possible worlds will also be sensitive to noise and need many observations. Simplest measure of statistical complexity is the number of adjustable parameters. More parameters = more complexity. In your example, your two models have equal number of parameters, and furthermore, they are identical. This is because both Bayesian Networks corresponding to your two models are moral, so they are equivalent to an undirected chain over 3 variables. On other hand, consider p(a|b,c)p(b)p(c)p(d) and p(a|b)p(b|c)p(c|d) For binary variables they have the same number of parameters, yet the models are not identical. To compare models in such situations you need a refinement of the "number of parameters" criterion, I recommend Dmitry Rusakov's thesis for more information on this problem Update: from computational standpoint, graphical model is computationally easy if it is "tree-like", ie
If you have special restrictions on potentials, planar graphical models are sometimes easy. Testing if a model is "tree-like" is NP-hard, so in general you can't tell when the model is not easy computationally. @Yaroslav, thanks!
(Dec 30 '10 at 10:54)
Liangjie Hong
|
|
In number 1 you have a directed tree, and in the second you have a directed chain. Basically the inference is the same in both cases (most likely the sum-product algorithm). In the sum-product algorithm you can calculate the max probability and expected value of all the members of your model quite easily. So basically there is no problem from the complexity point of view, since they are solved pretty much in the same way. Remember that a chain is basically a tree but with single parents/childs Thanks @Leon. Although in this case, it's simple to compare. We all know that real-world models are much more complicated. So, in general, what kind of measure do we use? Computational cost? Model complexity? Or something else?
(Dec 30 '10 at 02:19)
Liangjie Hong
A good way to compare graphical models would be using their computational cost, but using the sum-product algorithm it is basically up to how many nodes do you have. The most nodes the higher the cost. The complexity would be related with the inference algorithm which may be the same (depending on the model). Most algorithms offer you the possibility to get the probabilities of every node with one large computation. Try looking into Bishop's chapter on graphical models to get a better grasp on them.
(Dec 30 '10 at 02:27)
Leon Palafox ♦
@Leon, thanks a lot!
(Dec 30 '10 at 10:54)
Liangjie Hong
|
Do you mean computationally easier or statistically easier?
@Ryan, my question is computational easier or not.