I'm trying to understand the general CRFs and I don't understand how should we obtain the factors (factor functions f(x_A, y_A)). Currently I have the next ideas:

1) we do it manually, because it's our task to describe the model and CRF can just optimise parameters of the model

2) it's should be done by some algorithm

3) we think that we have all possible factor functions and if some of them are unimportant then their weights will be estimated as zero or around

asked Jul 08 '10 at 04:59

Sergey%20Bartunov's gravatar image

Sergey Bartunov
81111116

edited Jul 10 '10 at 16:29

Joseph%20Turian's gravatar image

Joseph Turian ♦♦
579051125146


3 Answers:

It depends on how the CRF is represented. If it's as a factor graph then you must define the factors when specifying the graph, so (1) is correct. If it's as a random field (undirected graphical model), then it's a mixture of (1) and (2). You do it manually, by specifying the topology of the conditional random field. But this doesn't specify necessarily all the factors you have, and an algorithm will then search for all the cliques in the graph and generate one factor for each clique. At least this is how it works for general Markov random fields.

This answer is marked "community wiki".

answered Jul 08 '10 at 07:11

Alexandre%20Passos's gravatar image

Alexandre Passos ♦
2554154278421

edited Jul 08 '10 at 08:14

Does it mean that I have to define "clique templates" only?

(Jul 08 '10 at 07:51) Sergey Bartunov

I'm not sure what you mean by clique templates. What you should do is define the edges of the CRF, and the cliques are sets of nodes that have edges from anyone to any other one.

(Jul 08 '10 at 07:52) Alexandre Passos ♦

Seems that I misunderstand something. Are you speaking about edges in the context of the CRF's factor graph? As I thought there is no edges in a factor graph other than edges that connect variable node to a factor node, so to define edges is to define a factors. Am I wrong?

(Jul 08 '10 at 08:10) Sergey Bartunov

Yes, you're correct. I thought you were talking about the random field representation of the CRF, not the factor graph. If you're representing it as a factor graph, then, yes, you have to define the factors manually. I'll edit the answer to reflect that.

(Jul 08 '10 at 08:12) Alexandre Passos ♦

For a slightly different way of thinking about this, you might be interested in FACTORIE by the McCallum group at UMass. The twist is that the factor graph is defined imperatively. That is, when a new state is being considered for inference, you supply code that says which factors need to be recomputed, which other variables are involved in the computation, etc. Of course this implicitly corresponds to some actual graph topology, but in some cases the imperative/procedural definition may be much easier to work with. For one thing, your procedural definition can easily prevent "illegal" variable configurations.

answered Jul 08 '10 at 13:14

David%20Andrzejewski's gravatar image

David Andrzejewski
20635

Are you basically asking: What factor graph topology should I used in a general CRF? It's a good question. It is, unfortunately, typically done by hand. Just think of topologies (sets of factors) that make sense and evaluate them.

Often there is a particular topology that makes sense, like the factor graphs used in skip-chain CRFs: The point here is to add certain factors to the usual trigram factors: These added factors connect the output variables for input variables that represent the same input tokens, so that the tag for "Obama" in the beginning of the sentence can influence the tag for "Obama" at the end of the sentence, for example.

It would have been hard to find such kind of factor graph topology automatically. But in general, one can imagine an algorithm that starts with few factors and adds more and more factors greedily. I would be very interested if someone has a reference for this.

answered Jul 08 '10 at 09:09

Frank's gravatar image

Frank
1349274453

edited Jul 08 '10 at 23:59

Your answer
toggle preview

powered by OSQA

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.