Let's say I implement a method for an object classification task, e.g. classifying images as Foo or non-Foo. I extract HOG and some other features from the training data, apply PCA to reduce the dimension, and then use a linear SVM classifier. Different PCA dimensions give me different classification accuracies on the test set. So I pick the optimal dimension and report that test result as the state of the art.
My question: is it really fair to use the test set to choose the PCA dimension (or any other parameters of the HOG features or the SVM)? Shouldn't I instead use a validation set to determine the optimal parameters, then run the classifier with those parameters on the test set and report that result?
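The validation-based protocol described above can be sketched as follows. This is a minimal illustration with scikit-learn on synthetic data, not the asker's actual HOG setup: the PCA dimension is chosen by cross-validation within the training set only, and the test set is touched exactly once at the end.

```python
# Sketch: tune the PCA dimension on the training data via cross-validation,
# then evaluate a single time on the held-out test set.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Synthetic stand-in for the HOG feature matrix.
X, y = make_classification(n_samples=600, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

pipe = Pipeline([("pca", PCA()), ("svm", LinearSVC(dual=False))])
# Model selection sees only the training folds, never the test set.
search = GridSearchCV(pipe, {"pca__n_components": [5, 10, 20, 30]}, cv=5)
search.fit(X_train, y_train)

print("best PCA dimension:", search.best_params_["pca__n_components"])
print("test accuracy:", search.score(X_test, y_test))  # reported once
```

Note that the PCA step is inside the `Pipeline`, so it is refit on each training fold; fitting PCA on the full data before splitting would itself leak information.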

asked Jun 10 '14 at 04:22


Ng0323
