
What is the current state of the art for structured learning tasks with latent variables? I know of "Learning Structural SVMs with Latent Variables" by Yu and Joachims, but that is already three years old. Is there any more recent work in this field?

Also, I don't know of many works on learning in CRFs (meaning maximum conditional likelihood optimization) with hidden variables. Is anyone working on this?

EDIT: I am particularly interested in models with cycles.

Thanks, Andy

asked Apr 23 '12 at 10:01


Andreas Mueller

edited Apr 23 '12 at 11:06


4 Answers:

One nice paper is Laurens van der Maaten's on hidden-unit conditional random fields, which learns CRFs with a few strategically placed latent variables. If you have a structure on which you'd like to induce latent variables, I like the paper on sentiment analysis with dependency trees, which parses a sentence, adds a latent variable at each node in the tree and an observable variable at the top, and learns with maximum likelihood. For parsing there's also Slav Petrov's work on generative and discriminative CRFs. Finally, on the less tried-and-true side, the conditional herding framework and the max-margin min-entropy models also look interesting.

For practical advice: if you can do inference while maximizing over both the latent variables and the output variables, you should be able to train your latent-variable model with the perceptron algorithm (and if you can do two-best inference, you can use passive-aggressive perceptrons or SGD for the structured SVM). You only have to be careful with the initialization of the latent variables to get meaningful results, as this optimization problem is usually nonconvex.
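As a rough illustration of the perceptron advice above, here is a minimal sketch of a latent-variable perceptron. It assumes a tiny problem where joint inference over the label y and a single latent variable h can be done by brute-force enumeration. The feature map `joint_features`, the helper `argmax_yh`, and the toy data are hypothetical and chosen only for illustration; a real structured model would replace the enumeration with its own inference routine.

```python
import itertools
import numpy as np

def joint_features(x, y, h, n_labels, n_latent):
    """Hypothetical joint feature map: copy x into the block for (y, h)."""
    phi = np.zeros((n_labels, n_latent, x.shape[0]))
    phi[y, h] = x
    return phi.ravel()

def argmax_yh(w, x, n_labels, n_latent, fixed_y=None):
    """Brute-force joint maximization over (y, h); y can be clamped."""
    labels = range(n_labels) if fixed_y is None else [fixed_y]
    return max(
        itertools.product(labels, range(n_latent)),
        key=lambda yh: w @ joint_features(x, yh[0], yh[1], n_labels, n_latent),
    )

def train_latent_perceptron(X, Y, n_labels, n_latent, epochs=20, seed=0):
    rng = np.random.default_rng(seed)
    # Small random initialization: the problem is nonconvex, so the
    # starting point influences which latent assignment is learned.
    w = 0.01 * rng.standard_normal(n_labels * n_latent * X.shape[1])
    for _ in range(epochs):
        for x, y in zip(X, Y):
            y_hat, h_hat = argmax_yh(w, x, n_labels, n_latent)
            if y_hat != y:
                # Complete the true label with its best latent value.
                _, h_star = argmax_yh(w, x, n_labels, n_latent, fixed_y=y)
                w += joint_features(x, y, h_star, n_labels, n_latent)
                w -= joint_features(x, y_hat, h_hat, n_labels, n_latent)
    return w

# Toy usage: two one-hot inputs, two labels, two latent states per label.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
Y = [0, 1]
w = train_latent_perceptron(X, Y, n_labels=2, n_latent=2)
predictions = [argmax_yh(w, x, 2, 2)[0] for x in X]
```

The update only fires on mistakes, and the "true" side of the update uses the latent value that the current weights prefer for the correct label.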

answered Apr 23 '12 at 10:53


Alexandre Passos ♦

Thanks for the pointers, Alexandre!

I probably should have mentioned that I am particularly interested in models with cycles. Do you know of a reference for SGD in such a model?

(Apr 23 '12 at 11:06) Andreas Mueller

Unfortunately, models with cycles are harder. If you believe you can do two-best max-product inference, then the SGD update adds one of three terms to w:
(1) -regularizer, if best = true best and score(best) > score(2nd best) + 1;
(2) features(best) - features(2nd best) - regularizer, if best = true best and score(best) <= score(2nd best) + 1;
(3) features(true best) - features(best) - regularizer, otherwise.
This requires that you can do inference, however, which is generally hard in cyclic models.
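A minimal sketch of that three-case update, under some loud assumptions: the candidate outputs are enumerated explicitly as rows of a feature matrix (in a cyclic model this would be replaced by approximate two-best inference), the regularizer is taken to be the L2 gradient `reg * w`, and `true_idx` marks the "true best" candidate, meaning the best latent completion of the correct label. The names `two_best`, `sgd_step`, `lr`, and `reg` are hypothetical.

```python
import numpy as np

def two_best(w, candidates):
    """Indices of the best and second-best candidates, plus all scores."""
    scores = candidates @ w
    order = np.argsort(-scores, kind="stable")
    return order[0], order[1], scores

def sgd_step(w, candidates, true_idx, lr=0.1, reg=0.01):
    """One update following the three cases described above."""
    best, second, scores = two_best(w, candidates)
    step = -reg * w  # case (1): only the regularizer term
    if best == true_idx:
        if scores[best] <= scores[second] + 1.0:
            # case (2): correct, but the margin of 1 is violated
            step = step + candidates[best] - candidates[second]
    else:
        # case (3): the top-scoring candidate is wrong
        step = step + candidates[true_idx] - candidates[best]
    return w + lr * step

# Toy usage: two candidates with one-hot features; candidate 0 is correct.
candidates = np.array([[1.0, 0.0], [0.0, 1.0]])
w_wrong = sgd_step(np.array([0.0, 0.5]), candidates, true_idx=0)
w_margin = sgd_step(np.array([0.5, 0.0]), candidates, true_idx=0)
w_ok = sgd_step(np.array([5.0, 0.0]), candidates, true_idx=0)
```

In the toy calls, the first hits case (3), the second case (2), and the third case (1), where only the weight decay is applied.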

Another approach is something like contrastive divergence, but that requires sampling and is hard to get right.

(Apr 23 '12 at 11:09) Alexandre Passos ♦

Yeah, inference is usually hard. I would probably do something like QPBO and/or fusion moves for inference.

I was hoping there was something in the literature about that together with SGD.

(Apr 23 '12 at 11:15) Andreas Mueller

In addition to Alexandre's answer, there is also this paper by Hugo Larochelle and co-workers that develops latent-variable models for, among others, structured outputs.

answered Apr 25 '12 at 04:29


Laurens van der Maaten

edited Apr 25 '12 at 04:30

"Structured Learning from Partial Annotations" is sort of a generalization of structured learning with latent variables. One of the improvements is the accelerated optimization strategy that reuses lower bounds across CCCP iterations. I also found a toolbox for the methods described therein. The toolbox was written in Matlab: http://mloss.org/software/view/423/

answered Apr 19 '13 at 21:22


leoharrison2001

There are several papers on this topic at ICML 2012 (http://icml.cc/2012/papers/), for example "Efficient Structured Prediction with Latent Variables for General Graphical Models" and "Modeling Latent Variable Uncertainty for Loss-based Learning". You can have a look at those papers. There is also a paper called "Structured Learning from Partial Annotations". I do not know whether it is also a kind of structured learning with latent variables. Any ideas?

answered Jul 05 '12 at 13:30


Jun


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.