
What SVM packages are capable of handling large datasets? In particular, I'd like to use the RBF kernel to perform nonlinear classification. My "large dataset" is on the order of hundreds of thousands of data points. I've found LIBSVM and SVMlight, but I'm sure there are others. What would people recommend?

asked Mar 29 '13 at 15:24 by C I (edited Mar 29 '13 at 15:25)


3 Answers:

For large data sets, I tend to default to LIBLINEAR, but if you really need the properties of an RBF kernel, that is not going to work for you. A colleague of mine who was training discriminative models for speech recognition said that he could never get LibSVM to converge on his data, but switching to Core Vector Machines drastically reduced training time and gave him a large bump in performance.
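As a rough sketch of the linear route: scikit-learn's LinearSVC wraps LIBLINEAR, so training on a few hundred thousand points might look something like this (the synthetic data and parameter values are only illustrative):

```python
# Minimal sketch: a linear SVM on a large dataset via scikit-learn's
# LinearSVC, which is backed by LIBLINEAR. The synthetic data below is
# only a stand-in for a real corpus of a few hundred thousand points.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200_000, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# dual=False is usually faster when n_samples >> n_features
clf = LinearSVC(C=1.0, dual=False)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```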

answered Mar 29 '13 at 17:19 by leebecker

Third option: approximate the RBF kernel and use a linear learner such as LIBLINEAR. In my experience, 100k samples will kill LibSVM.
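One way to sketch this combination is with random Fourier features: scikit-learn's RBFSampler approximates the RBF kernel's feature map, and a LIBLINEAR-backed LinearSVC is then trained on the transformed data (gamma, n_components, and the synthetic data below are only illustrative values):

```python
# Minimal sketch: approximate the RBF kernel with random Fourier features
# (RBFSampler), then train a linear SVM (LIBLINEAR via LinearSVC) on the
# explicit approximate feature map instead of using a kernelized solver.
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import RBFSampler
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200_000, n_features=50, random_state=0)

model = make_pipeline(
    RBFSampler(gamma=0.1, n_components=500, random_state=0),  # approximate RBF feature map
    LinearSVC(C=1.0, dual=False),                             # linear learner on the mapped data
)
model.fit(X, y)
print("training accuracy:", model.score(X, y))
```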

answered Apr 04 '13 at 09:12 by larsmans

Another efficient option is LaSVM. SVMlight should work with hundreds of thousands of points, though.
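For what it's worth, LaSVM, SVMlight, and LIBSVM all read the same sparse text format (one "label index:value ..." line per sample), so a dataset prepared once can be tried with each package; a minimal sketch using scikit-learn's dump_svmlight_file (the file name is just an example):

```python
# Minimal sketch: write a dataset in the sparse svmlight/libsvm text format,
# which SVMlight, LIBSVM, and LaSVM all accept as training input.
from sklearn.datasets import make_classification, dump_svmlight_file

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)
dump_svmlight_file(X, y, "train.svmlight")  # one "label idx:val idx:val ..." line per sample
```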

answered Mar 29 '13 at 20:55 by Alexandre Passos ♦

Do you have any experience with LIBSVM? How much data is too much data for each of these packages? (roughly)

(Mar 31 '13 at 14:48) C I

