
Does anyone have experience with explicit feature maps for SVM training? There has been some recent progress that looks very promising, but I haven't seen the techniques used much. I am implementing and trying them out at the moment, and I was wondering whether anyone else has experimented with them.

Also, I am interested in their relation to "encoding" techniques like locally linear embedding and Fisher vectors. Has anyone compared these different approaches to transforming the feature space?

There is an interesting survey, but it does not compare against explicit feature maps.

This question is marked "community wiki".

asked Nov 11 '11 at 06:31


Andreas Mueller

Thanks for sharing this, I never heard of the Fourier series approach to kernels before. That looks pretty cool!

(Nov 13 '11 at 05:17) Oscar Täckström

If you are interested, there is Matlab code by Fuxin, the author of the "skewed chi2" paper, here: http://sminchisescu.ins.uni-bonn.de/code/randfeat/ I am currently playing with it in Python and want to include it in scikit-learn. You can find my branch here: https://github.com/amueller/scikit-learn/tree/kernel_approximation and the code here: https://github.com/amueller/scikit-learn/blob/kernel_approximation/sklearn/feature_extraction/kernel_approximation.py It currently implements the RBF and skewed chi-squared kernels.
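For anyone curious what the RBF approximation boils down to: it's random Fourier features (Rahimi & Recht). Here is a rough self-contained sketch of the idea — illustrative only, not the actual scikit-learn or Fuxin code, and the function name is mine:

```python
import numpy as np

def random_fourier_features(X, gamma=1.0, n_components=100, rng=None):
    """Explicit map z(x) such that z(x) . z(y) approximates the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2).  Sketch of random Fourier features;
    name and signature are illustrative, not a library API."""
    rng = np.random.default_rng(rng)
    n_features = X.shape[1]
    # Frequencies drawn from the Fourier transform of the RBF kernel,
    # which for this parameterization is a Gaussian with variance 2 * gamma.
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(n_features, n_components))
    b = rng.uniform(0, 2 * np.pi, size=n_components)
    # Averaging cos(w.x + b) over random (w, b) recovers the kernel in expectation.
    return np.sqrt(2.0 / n_components) * np.cos(X @ W + b)
```

After mapping, you can train an ordinary linear SVM on the transformed data, which is the whole point of the explicit-map approach.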

(Nov 13 '11 at 05:29) Andreas Mueller

Check out Liefeng Bo's page: http://www.cs.washington.edu/homes/lfb/

He has a similar Fourier approximation scheme, meant for vision applications, where large feature vectors tend to limit us to linear SVMs.

(Nov 14 '11 at 00:09) Jacob Jensen

Thanks for the link. Which paper are you referring to?

(Nov 14 '11 at 03:30) Andreas Mueller

Have you looked at the random kitchen sinks http://berkeley.intel-research.net/arahimi/papers/basis2.pdf ? Have you looked at hash kernels for structured learning http://hunch.net/~jl/projects/hash_reps/hash_kernels/hashkernel.pdf ?

Both these things are very similar to explicit feature maps and perform reasonably.
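The hash-kernel idea, in particular, is essentially the hashing trick with a pseudo-random sign. A rough sketch (illustrative only, not the paper's implementation; the function name is mine):

```python
import hashlib
import numpy as np

def hashed_features(tokens, n_bins=1024):
    """Feature hashing ("hash kernels"): map arbitrary string features into a
    fixed-size vector.  The sign bit makes the inner product of two hashed
    vectors an unbiased estimate of the original sparse inner product.
    Illustrative sketch, not the code from the linked paper."""
    x = np.zeros(n_bins)
    for t in tokens:
        # A stable hash (unlike Python's builtin hash, which is per-process).
        h = int(hashlib.md5(t.encode("utf8")).hexdigest(), 16)
        idx = h % n_bins                        # which bin the feature lands in
        sign = 1.0 if (h >> 64) % 2 == 0 else -1.0  # pseudo-random sign
        x[idx] += sign
    return x
```

Like the Fourier maps, this turns a potentially huge (even unbounded) feature space into a fixed-size dense vector a linear solver can handle.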

There are also more efficient ways of training SVMs if one is allowed to use these maps. See for example http://arxiv.org/abs/1111.0432 , which doesn't sound like it is about explicit feature maps but actually uses them in its experiments.

(Nov 14 '11 at 09:36) Alexandre Passos ♦

I know the random kitchen sinks but not the other two. Did you use any of these techniques yourself?

(Nov 14 '11 at 09:42) Andreas Mueller

No, not so far. I'd be interested in seeing these things applied, however.

(Nov 14 '11 at 09:50) Alexandre Passos ♦

In http://arxiv.org/abs/1111.0432 it seems to me the time comparisons are unfair, as they use a different stopping criterion for their method than for the others. What do you think about that?

(Nov 14 '11 at 11:03) Andreas Mueller

I don't have much experience tuning stopping criteria for the solvers compared there.

In section 4.2 they ran their solvers until they achieved accuracy comparable to the other solvers. If running the other solvers an order of magnitude longer did not improve test error, I'd say their default stopping criteria are pretty bad. In section 4.3 they only compared against LASVM, and only for a single pass.

I'm not terribly bothered by the results because the margin is huge, so it does not seem like they won on a technicality.

(Nov 14 '11 at 14:53) Alexandre Passos ♦
