Hi,

I read that normalizing the data yields better results with libsvm. I wanted to check whether the libsvm implementation in scikit-learn normalizes the data internally.

Thank you

asked Apr 22 '13 at 15:40

Tez

edited Apr 27 '13 at 09:07

larsmans

You have a better chance of getting an answer if you post directly to the sklearn mailing list.

(Apr 22 '13 at 17:07) Leon Palafox ♦

2 Answers:

The section "3.2.5. Tips on Practical Use" on this page, http://scikit-learn.org/stable/modules/svm.html, recommends scaling the data before use. The document doesn't say anything about internal scaling, and I don't think scikit-learn scales or normalizes the data internally. Here is the document that the above page refers to: http://scikit-learn.org/stable/modules/preprocessing.html#preprocessing
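
One way to check this empirically is sketched below. The reasoning is an assumption of this sketch, not something stated in the docs: if SVC standardized or rescaled features internally, multiplying one feature by a constant should leave the fitted decision values unchanged. The toy data, the fixed gamma and C, and the helper name decision_values are all made up for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Made-up toy data: the class depends only on the first feature.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 0, 1, 1])

# Same data with the second feature blown up by a factor of 1000.
X_big = X * np.array([1.0, 1000.0])

def decision_values(data):
    # gamma and C are fixed so that only the feature scale differs between runs.
    model = SVC(kernel="rbf", gamma=1.0, C=1.0).fit(data, y)
    return model.decision_function(data)

d_small = decision_values(X)
d_big = decision_values(X_big)

# If SVC rescaled features internally, these would match.
same = np.allclose(d_small, d_big)
print(d_small)
print(d_big)
print("identical decision values:", same)
```

If the two sets of decision values differ, the model is sensitive to the raw feature scale, which is consistent with no scaling being applied for you.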

answered Apr 25 '13 at 04:47

phoxis

Thank you. Yes, I saw those too. But when I dug into the libsvm implementation, I thought I saw normalization, so I wasn't sure whether another normalization step is required.

(Apr 26 '13 at 14:56) Tez

There is a separate tool, svm-scale, to scale data.

(Apr 26 '13 at 16:21) phoxis

No need to use a separate tool (that isn't shipped with scikit-learn). The functionality is all there.

(Apr 27 '13 at 09:03) larsmans

You have to do scaling or normalization separately. The preprocessing module has various classes and functions for this:

  • MinMaxScaler scales individual features so that the min and max in the training set map to the ends of a given range: 0 and 1 by default, or -1 and 1 with feature_range=(-1, 1)
  • StandardScaler removes the mean, then scales to unit variance (per feature)
  • Normalizer normalizes entire samples to unit length according to the L1 or L2 norm

Normalization is appropriate when handling things like term frequencies. For simple Gaussian or nearly-Gaussian features, use scaling.
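
To make the difference between the three concrete, here is a minimal sketch on a made-up 3x2 array. The data values are invented, and feature_range=(-1, 1) is passed explicitly rather than relying on the default.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler, Normalizer

# Made-up data: two features on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0]])

# Per-feature scaling to a fixed range, here set explicitly to [-1, 1].
X_minmax = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)

# Per-feature standardization: zero mean, unit variance.
X_std = StandardScaler().fit_transform(X)

# Per-sample normalization: each row is rescaled to unit L2 norm.
X_norm = Normalizer(norm="l2").fit_transform(X)

print(X_minmax.min(axis=0), X_minmax.max(axis=0))  # column-wise range
print(X_std.mean(axis=0), X_std.std(axis=0))       # column-wise mean and std
print(np.linalg.norm(X_norm, axis=1))              # row-wise L2 norms
```

Note that the two scalers work column by column, while Normalizer works row by row, so they are not interchangeable.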

Pipeline objects make these a lot easier to use:

svm = Pipeline([('scale', MinMaxScaler()),
                ('svm', SVC(kernel="rbf"))])
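
For completeness, a self-contained version of the same idea with imports and a fit/predict round trip might look like the sketch below. The training and test arrays are made up, and the variable names clf, X_train, y_train and X_new are only illustrative.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Made-up training data: the second feature is on a much larger scale.
X_train = np.array([[0.0, 1000.0],
                    [1.0, 1200.0],
                    [4.0, 5000.0],
                    [5.0, 5200.0]])
y_train = np.array([0, 0, 1, 1])

clf = Pipeline([("scale", MinMaxScaler()),
                ("svm", SVC(kernel="rbf"))])

# fit() learns the scaling from the training data,
# then trains the SVM on the scaled features.
clf.fit(X_train, y_train)

# predict() reapplies the scaling learned during fit() before calling the SVM.
X_new = np.array([[0.5, 1100.0],
                  [4.5, 5100.0]])
pred = clf.predict(X_new)
print(pred)
```

The point of the pipeline is that the scaler is fitted on the training data only and then reused unchanged at prediction time, so there is no separate transform step to forget.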

answered Apr 27 '13 at 09:06

larsmans

edited May 02 '13 at 12:37


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.