Hi,

How can I control overfitting in the logistic, nearest-neighbors, and SVM classifiers?

If I were using a neural network, I could use early stopping to avoid overfitting. In the case of the SVM, I think the procedure is to control the value of C: a small value of C would help generalization and avoid overfitting (is this correct?).
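As a minimal sketch of the C-controls-regularization idea (assuming scikit-learn; the dataset here is synthetic and purely illustrative):

```python
# Sketch (assumes scikit-learn): smaller C = stronger regularization in an SVM.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="rbf", C=C).fit(X_tr, y_tr)
    # A large gap between train and test accuracy suggests overfitting;
    # a small C allows more slack (wider margin) and usually shrinks that
    # gap at the cost of some training accuracy.
    print(f"C={C:>6}: train={clf.score(X_tr, y_tr):.2f} "
          f"test={clf.score(X_te, y_te):.2f}")
```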

But suppose I have a one-nearest-neighbor (1-NN) classifier and one dataset. How can I control the overfitting? There are no parameters to choose. The same question goes for the logistic classifier. I can try different partitions/folds in cross-validation to verify the performance and check whether overfitting occurred, but how can I avoid it?
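The cross-validation check described above can be sketched as follows (assuming scikit-learn; the dataset is synthetic and illustrative only):

```python
# Sketch (assumes scikit-learn): compare training accuracy with
# cross-validated accuracy to detect overfitting. 1-NN always scores
# perfectly on its own training data (each point is its own nearest
# neighbour), so only the CV score is informative.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

knn = KNeighborsClassifier(n_neighbors=1)
train_acc = knn.fit(X, y).score(X, y)           # 1.0 by construction
cv_acc = cross_val_score(knn, X, y, cv=5).mean()
print(f"train={train_acc:.2f}  5-fold CV={cv_acc:.2f}")
```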

asked Jul 03 '13 at 12:59


Jorge Amaral

You are correct about C in the context of support vector machines. Here, C controls the penalization of the non-negative slack variables; therefore, for small values of C the amount of regularization is high, and vice versa.

(Jul 03 '13 at 16:48) Michael Riis Andersen

One Answer:

For 1-NN, use more neighbours, i.e. k-nearest neighbours with k > 1; larger k smooths the decision boundary.
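A short sketch of picking k by cross-validation (assuming scikit-learn; data is synthetic, with some label noise added so that smoothing matters):

```python
# Sketch (assumes scikit-learn): k in k-NN acts as a smoothing /
# regularization knob. Larger k averages over more neighbours, reducing
# the variance that makes 1-NN overfit. Choose k by cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=10, flip_y=0.1,
                           random_state=0)

scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k),
                             X, y, cv=5).mean()
          for k in (1, 3, 5, 11, 21)}
best_k = max(scores, key=scores.get)
print(scores, "best k:", best_k)
```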

Logistic regression can/should have a C value too, i.e. a penalty on the norm of the weights - see e.g. scikit-learn's logistic regression, where you can use an L1 or L2 norm penalty.
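For example, a minimal sketch of regularized logistic regression in scikit-learn (synthetic data; the parameter values are illustrative, not recommendations):

```python
# Sketch (assumes scikit-learn): LogisticRegression penalizes the weight
# norm via C, with the same convention as the SVM (smaller C = stronger
# penalty). penalty="l1" needs a solver that supports it, e.g. "liblinear".
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=30, n_informative=5,
                           random_state=0)

l2 = LogisticRegression(penalty="l2", C=0.1).fit(X, y)
l1 = LogisticRegression(penalty="l1", C=0.1, solver="liblinear").fit(X, y)

# L1 drives many weights exactly to zero (a sparse model); L2 only shrinks them.
print("nonzero weights  L2:", (l2.coef_ != 0).sum(),
      " L1:", (l1.coef_ != 0).sum())
```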

answered Jul 03 '13 at 13:39



edited Jul 03 '13 at 13:40



User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.