Typically, kernel SVMs are penalized with an L2 norm (I'm thinking of the primal representation here).

Is there any point to L1 penalization here? Unlike in non-kernelized algorithms (e.g. the lasso), because of the kernel you wouldn't really be penalizing the feature weights but rather the weights associated with individual samples, i.e. the coefficients of the kernel expansion.
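
To make what I mean concrete, here is a rough sketch (assuming scikit-learn). The L1 case is only approximated: instead of the standard SVM formulation, it fits an L1-penalized linear SVM directly on the kernel matrix, so the learned weights play the role of per-sample coefficients. This is just an illustration of the idea, not the canonical algorithm:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Standard kernel SVM: L2 penalty on the function norm (primal / RKHS view).
svc = SVC(kernel="rbf", gamma=0.5, C=1.0).fit(X, y)

# "L1 on sample weights" (approximation): each column of K is the kernel
# evaluated against one training sample, so the linear weights learned on K
# correspond to per-sample coefficients, and the L1 penalty makes them sparse.
K = rbf_kernel(X, X, gamma=0.5)
l1_model = LinearSVC(penalty="l1", dual=False, C=1.0, max_iter=10000).fit(K, y)

n_sv = svc.support_.shape[0]                  # samples kept by the L2 model
n_nonzero = np.count_nonzero(l1_model.coef_)  # samples kept by the L1 model
print("L2 kernel SVM support vectors:", n_sv)
print("L1-penalized nonzero sample weights:", n_nonzero)
```

In both cases the sparsity (if any) is over samples rather than features, which is why I'm unsure what an L1 penalty buys you here.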

asked Feb 05 '14 at 20:45

digdug
