
Hello,

I'm interested in optimizing complex loss functions for structured prediction problems like image segmentation. I came across the following paper: Optimizing Complex Loss Functions in Structured Prediction. The authors optimize complex loss functions such as the intersection over union for image segmentation, but they use several approximations, and the experimental section only reports results for some classes on the Pascal VOC dataset, so it's hard to judge how well the method performs.
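
To be concrete, the sort of loss I have in mind is one minus the intersection over union between the predicted and ground-truth masks, which doesn't decompose over pixels. A minimal numpy sketch (the binary-mask encoding is just my assumption):

    import numpy as np

    def iou_loss(pred_mask, true_mask):
        # 1 - IoU between two binary masks (boolean numpy arrays)
        intersection = np.logical_and(pred_mask, true_mask).sum()
        union = np.logical_or(pred_mask, true_mask).sum()
        if union == 0:
            return 0.0  # both masks empty: define the loss as zero
        return 1.0 - float(intersection) / union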

Another paper (Direct Loss Minimization for Structured Prediction) directly optimizes the loss in structured prediction, which works fine for the speech recognition task presented in the paper but is not really applicable to image segmentation, because the method still requires a loss-augmented inference step.
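
If I understood the direct loss minimization paper correctly, one variant of its update looks roughly like the sketch below, with the argmax done by brute force over a small candidate set; features, loss and labelings are placeholder names of mine, and for segmentation the loss-augmented argmax is exactly the intractable part:

    import numpy as np

    def direct_loss_step(w, x, y_true, features, loss, labelings, eps, lr):
        # Standard prediction: argmax_y w . phi(x, y)
        y_pred = max(labelings, key=lambda y: w.dot(features(x, y)))
        # Loss-augmented ("towards-loss") prediction:
        # argmax_y w . phi(x, y) + eps * loss(y, y_true)
        y_aug = max(labelings,
                    key=lambda y: w.dot(features(x, y)) + eps * loss(y, y_true))
        # Finite-difference estimate of the gradient of the expected loss
        grad = (features(x, y_aug) - features(x, y_pred)) / eps
        return w - lr * grad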

I was wondering if anybody knew of any other work trying to optimize complex loss functions?

asked Dec 12 '11 at 14:20

Aurelien Lucchi


4 Answers:

This workshop paper takes the approach of treating complex loss functions as high-order potentials, then develops algorithms so that these potentials can be used generically in a message-passing formulation. Under this view, the loss-augmented inference problem is approximately solved by message passing in a factor graph that includes both model and loss terms. There aren't many specific details in this paper, but it looks like a longer conference version will appear at AISTATS 2012.
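
I can't vouch for the exact formulation in the paper, but the generic picture is this: if the loss decomposes over variables (e.g. Hamming loss), it can simply be absorbed into the unary factors before running standard max-product, whereas a high-order loss like 1 - IoU needs its own factor and the specialized messages the paper develops. A toy chain sketch of the decomposable case (all names are mine):

    import numpy as np

    def loss_augmented_viterbi(unary, pairwise, y_true):
        # MAP inference on a chain with Hamming loss folded into the unaries.
        # unary: (T, K) scores, pairwise: (K, K) transition scores,
        # y_true: length-T ground-truth labels.
        T, K = unary.shape
        aug = unary + 1.0                  # loss of 1 for every label...
        aug[np.arange(T), y_true] -= 1.0   # ...except the true one
        score = aug[0].copy()
        back = np.zeros((T, K), dtype=int)
        for t in range(1, T):
            cand = score[:, None] + pairwise     # prev label x next label
            back[t] = cand.argmax(axis=0)
            score = cand.max(axis=0) + aug[t]
        y = [int(score.argmax())]
        for t in range(T - 1, 0, -1):
            y.append(int(back[t, y[-1]]))
        return y[::-1]

With the loss folded in like this, the same Viterbi call returns the most violating labeling instead of the MAP one.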

answered Dec 13 '11 at 18:21

dt_

Another alternative, if sampling is a good inference method for your problem, is SampleRank. The idea behind SampleRank is that, as you sample new states during inference, you make your learned model match the preferences of your loss function on those states: if the state before a move has higher loss than the state after it, you update the weights so that the before state gets a lower model score than the after state. These updates are ridiculously efficient because you only have to look at the variables changed by the move and the factors that touch those variables. It also works for essentially any loss function you can write code to evaluate (and if your loss doesn't decompose easily, you only pay a speed penalty, not an accuracy penalty).
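
To make that concrete, one SampleRank step looks roughly like this (a simplified sketch under my own naming, without the margin/MIRA machinery of the actual paper; in a real implementation the feature difference is computed locally from just the factors touched by the move):

    import numpy as np

    def samplerank_step(w, x, y_cur, y_prop, y_true, features, loss, lr):
        # Compare the proposed state against the current one.
        d_feat = features(x, y_prop) - features(x, y_cur)
        d_loss = loss(y_prop, y_true) - loss(y_cur, y_true)
        d_score = w.dot(d_feat)
        # Update only when the model disagrees with the loss function.
        if d_loss < 0 and d_score <= 0:    # proposal is better, model says worse
            w = w + lr * d_feat
        elif d_loss > 0 and d_score >= 0:  # proposal is worse, model says better
            w = w - lr * d_feat
        return w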

answered Dec 13 '11 at 03:11

Alexandre Passos ♦

I think the most commonly used method is Thorsten Joachims' SVM^struct.

At the bottom of the page you will find references to the papers describing various methods that are used in the software.
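
To see where the loss enters, here is a sketch of the margin-rescaling separation oracle that the cutting-plane algorithm calls at each iteration (not the actual SVM^struct API, which is C; the brute-force argmax over labelings is only feasible for toy label spaces):

    import numpy as np

    def most_violated_constraint(w, x, y_true, features, loss, labelings):
        # Loss-augmented inference: argmax_y w . phi(x, y) + loss(y, y_true)
        y_hat = max(labelings,
                    key=lambda y: w.dot(features(x, y)) + loss(y, y_true))
        # Subgradient of the margin-rescaled structured hinge loss
        margin = (w.dot(features(x, y_hat)) + loss(y_hat, y_true)
                  - w.dot(features(x, y_true)))
        if margin <= 0:
            return np.zeros_like(w)
        return features(x, y_hat) - features(x, y_true)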


answered Dec 15 '11 at 11:45

Andreas Mueller

The downside of the SVM^struct approach is that it relies on solving a loss-augmented inference problem at each iteration of the cutting-plane algorithm. If the loss function is complex, this loss-augmented inference might be intractable, so one has to resort to alternatives like sampling. The SampleRank paper seems to be a good option. I had a look at the workshop paper, but some details were a bit unclear, so I'll wait for the AISTATS paper.

answered Dec 15 '11 at 16:20

Aurelien Lucchi

The SampleRank paper I linked to in my answer is from ICML 2011 and, as far as I know, is the state of the art in SampleRank. It is quite formal about the objective function behind it, the convergence guarantees, the complexity, etc.

(Dec 15 '11 at 16:22) Alexandre Passos ♦

I'd like to reimplement the SampleRank paper in C/C++. Does anybody know a good library for Gibbs sampling? I'm looking for something fast and with as few dependencies as possible.

(Dec 16 '11 at 03:06) Aurelien Lucchi
