Trying to compile a resource for a class I'm teaching. I can come up with 5 or so objectives for linear models, and 3 or 4 regularization losses. Someone must have compiled a more thorough list. Can anyone give me a pointer?

Downer

asked Mar 30 '12 at 14:47


downer

It'd be nice if you included your list as an answer to this question, as then people can avoid repeating what you know and the answers become more useful to future readers.

(Mar 30 '12 at 15:14) Alexandre Passos ♦

I'm not quite sure about your terminology.

Afaik: objective = loss + regularizer

So I'm not sure what regularization losses are....

(Mar 30 '12 at 16:44) Andreas Mueller

One Answer:

Some popular loss functions are:

  • absolute loss, |x - y|
  • epsilon-insensitive loss, max(0, |x-y| - epsilon)
  • squared loss, (x - y)^2
  • squared epsilon-insensitive loss, max(0, |x-y| - epsilon)^2
  • 0-1 loss (for y discrete), I{x ≠ y}
  • hinge loss (for y = +-1), max(0, 1 - yx)
  • squared hinge loss (for y = +-1), max(0, 1-yx)^2
  • multiclass or structured generalizations of the hinge or squared hinge loss
  • exponential loss (for y = +-1), exp(-yx)
  • logistic (or log) loss (for y = +-1), log(1 + exp(-yx))
  • the softmax generalization of the logistic loss, -log( exp(f(x,y)) / sum_i exp(f(x,y_i)) )
  • ramp loss (for y=+-1) , min(1, max(0, 1-yx))
  • Huber loss, (x-y)^2 / 2 if |x-y| <= delta, and delta(|x-y| - delta/2) otherwise
  • the KL divergence (for x and y positive summing to one), sum_i x_i log(x_i / y_i)
  • in general, the Bregman divergence of any convex function
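A few of the losses above are easy to write down as plain functions; here is a quick sketch (the function names and default parameters are mine, not part of the answer), with x the real-valued prediction and y the target:

```python
import math

def squared_loss(x, y):
    return (x - y) ** 2

def epsilon_insensitive(x, y, eps=0.1):
    # zero inside the epsilon tube, absolute loss outside it
    return max(0.0, abs(x - y) - eps)

def hinge(x, y):
    # y in {-1, +1}; zero once the margin y*x exceeds 1
    return max(0.0, 1.0 - y * x)

def logistic(x, y):
    # y in {-1, +1}; smooth, never exactly zero
    return math.log(1.0 + math.exp(-y * x))

def huber(x, y, delta=1.0):
    # quadratic near the target, linear in the tails
    r = abs(x - y)
    return 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
```

Plotting these against the margin y*x (or the residual x - y) is a nice way to see how they trade off robustness and smoothness.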

Some regularizers:

  • any p-norm (squared l2 norm, l1 norm, the max norm, etc)
  • the "l_0" norm (number of nonzero components of the vector)
  • combinations of the above (square norm + l1 norm = elastic net, compositions of norms over norms lead to structured lasso, etc)
  • the entropy, - sum_i x_i log x_i
  • again, generally, any Bregman divergence with regard to a fixed base vector
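For concreteness, the first few regularizers can be sketched like this (names and the alpha/l1_ratio parametrization of the elastic net are mine, loosely following common library conventions):

```python
import math

def lp_norm(w, p=2):
    return sum(abs(wi) ** p for wi in w) ** (1.0 / p)

def l0(w):
    # "l_0 norm": number of nonzero components (not a true norm)
    return sum(1 for wi in w if wi != 0)

def elastic_net(w, alpha=1.0, l1_ratio=0.5):
    # convex combination of l1 and squared l2 penalties
    l1 = sum(abs(wi) for wi in w)
    l2sq = sum(wi * wi for wi in w)
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2sq)

def neg_entropy(w):
    # negative entropy, for w with positive entries summing to one
    return sum(wi * math.log(wi) for wi in w if wi > 0)
```

Note that l0 is nonconvex (and NP-hard to optimize directly), which is exactly why the l1 norm is so popular as its convex surrogate.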

Of course, this list is incomplete, but I hope I've covered most used in practice.

answered Mar 30 '12 at 15:28


Alexandre Passos ♦

That's a nice list :)

Are there any losses that are commonly used that are not Bregman divergences? If not, your list is complete ;)

(Mar 30 '12 at 16:43) Andreas Mueller

The ramp loss (which I cited) is not a Bregman divergence, nor is the 0-1 loss (which I also cited). I think if you look in papers with "robust" in the title you'll find more nonconvex things which are used.

(Mar 30 '12 at 17:11) Alexandre Passos ♦

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.