|
In many papers on graphical models, directed graphical models and undirected graphical models are usually presented, and algorithms that operate on both, etc. But factor graphs are usually just mentioned in passing. Why is this so? It slightly bothers me that there are many different factor graphs for a given Markov random field, and that, to compute which factors in the MRF will be in the energy function is generally an NP-complete problem (listing all maximal cliques in a graph). Why not just directly specifing these models as factor graphs, and omit this onerous step of computing the terms of the energy function? Is it because most papers only deal with a specific model, and then can afford to precompute the factors, or am I missing something? |
|
I think two factors help explain why Markov random fields (MRF) are preferred over factor graphs:
So it's mostly due to history, then, plus simplicity (since an MRF will have a lot less nodes than its factor graph by definition, and a lot of the time the factors are obvious).
(Jul 08 '10 at 13:57)
Alexandre Passos ♦
2
This is my best judgment, anyways. As an aside, it was interesting to note that on the linked page advocating factor graphs, it seems that there may be more reason to use a variant of factor graphs in terms of their greater generality: B. J. Frey 2003 Extending factor graphs so as to unify directed and undirected graphical models, in Proceedings of Conference on Uncertainty in Artificial Intelligence 19 (UAI 03), Morgan Kaufmann, CA, Acapulco, Mexico, August
(Jul 08 '10 at 14:07)
John L Taylor
1
Yes, factor graphs are clearly more general. The simplest example I can think of is the triangle graph (a->b, b->c, c->a). It admits two different factor graph representations: (a->f, b->f, c->f) and (a->f1, a->f2, b->f1, b->f3, c->f2, c->f3), the first with a three-way interaction and the second with three pairwise interactions. These are clearly two different models. I don't think anybody advocating just specifying the MRF in this case, though.
(Jul 08 '10 at 14:14)
Alexandre Passos ♦
I meant to point-out the less trivial generality of Frey's notation for representing both directed and undirected graphs as factor graphs, effectively subsuming MRF and Bayes Nets (BN).
(Jul 08 '10 at 14:42)
John L Taylor
|
|
Markov Network and Factor Graph are just different levels of granularity when describing your probability distribution. Markov Network gives you all the independencies between variables, Factor Graph specifies which kinds of interactions are allowed within each maximal clique, and "set of features" gives you parameterization of entries in each of the factors. In practice your log-linear distribution family is specified by the set of features, and "Markov Network/Factor Graph" are higher level overviews. Given a set of features, you get a corresponding factor graph by creating a factor for any every set of variables that are simultaneous inputs to some feature function. Given a factor graph you find a corresponding Markov Network by connecting any two nodes that share a factor. As to why people even bother talking about their distribution family at the Markov Network level when the actual distribution they use is defined by some set of parameterized probability tables or feature functions, treewidth of Markov Network gives you the complexity of inference, and exact inference (Junction Tree algorithm) is formulated in terms of cliques/separators on the corresponding Markov Network |
I agree it's bothersome. I usually think about my models in terms of factor graphs. The one thing that they don't convey, though, are the normalization constraints, if you have a directed graphical models.
Yes, agreed. This is why the question was specifically about undirected graphical models vs factor graphs. Generally, I also prefer dealing with directed graphical models as-is, instead of converting to undirected (which also the Wainwright & Jordan papers do before inference).
I think that one reason is that it is (at least for me) generally simpler to intuitively grasp the model through the MRF representation. Then you can list feature templates etc to provide a more detailed description.