I'd like to train a multi-class classifier and have a question about the procedure for training, validating and testing it.

Here's my understanding of what should be done:

  1. Divide the dataset in a ratio of 80:20 to create train and test sets.
  2. Apply k-fold stratified cross validation on the training dataset (80% of original data) in the following manner:
  3. Divide the training set into k parts/folds, each having the same class distribution as the original data set.
  4. Train the classifier using k-1 folds and validate it (assess its performance) on the remaining fold.
  5. Repeat the previous step k times, until each fold has been used as the validation set once.
  6. Average the error rate from each iteration in the previous step, and use it to fine tune classifier parameters.
  7. Assess the performance of the best model configuration on the test set (20% of the original data).

Firstly, is this procedure correct? If it is, which model should be used to assess the classifier's performance on the test set (step 7)? The output of step 6 would be k different models (one from each iteration of the cross-validation process), so which one should I use for step 7?

-A

PS: I wanted to make steps 3-6 look like substeps of Step 2, but the weird formatting options wouldn't let me do that!

asked Sep 01 '13 at 01:34

A W

edited Sep 01 '13 at 05:30


One Answer:

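6) Calculate the average and the standard error of the average (i.e. the standard deviation of the k validation errors divided by sqrt(k-1), where the standard deviation uses the 1/k normalisation; this equals the 1/(k-1) sample standard deviation divided by sqrt(k)), and use them to fine-tune the classifier parameters. The standard error gives you error bars, which give you an idea of whether your error-versus-parameter profile is trustworthy or just noise.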

7) Retrain a new model on the whole training set (80%) with the best parameters as identified in step 6.

8) Assess the performance of the best model configuration on the test set (20% of the original data).
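
The steps above could be sketched roughly as follows. This is only an illustration under several assumptions that are not part of the question: the data are synthetic Gaussian blobs, the classifier is a toy k-nearest-neighbour vote whose single tunable parameter is `n_neighbors`, the 80:20 split is made stratified, and the helper names (`stratified_folds`, `cv_error`, and so on) are made up for this example. It uses only the Python standard library; in practice a library such as scikit-learn provides equivalent tools.

```python
import math
import random
import statistics
from collections import Counter, defaultdict

def stratified_folds(labels, n_folds, rng):
    """Split indices into n_folds parts with (roughly) equal class proportions."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(n_folds)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % n_folds].append(i)  # deal each class out round-robin
    return folds

def knn_predict(train_X, train_y, x, n_neighbors):
    """Toy classifier: majority vote among the nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:n_neighbors])
    return votes.most_common(1)[0][0]

def error_rate(train_X, train_y, eval_X, eval_y, n_neighbors):
    wrong = sum(knn_predict(train_X, train_y, x, n_neighbors) != label
                for x, label in zip(eval_X, eval_y))
    return wrong / len(eval_y)

def cv_error(X, y, folds, n_neighbors):
    """Steps 2-6: mean validation error over the folds and its standard error."""
    errors = []
    for held_out in folds:
        held = set(held_out)
        rest = [i for i in range(len(y)) if i not in held]
        errors.append(error_rate([X[i] for i in rest], [y[i] for i in rest],
                                 [X[i] for i in held_out],
                                 [y[i] for i in held_out], n_neighbors))
    mean = statistics.mean(errors)
    # pstdev normalises by k, so dividing by sqrt(k - 1) gives the standard error
    std_err = statistics.pstdev(errors) / math.sqrt(len(folds) - 1)
    return mean, std_err

rng = random.Random(0)

# Synthetic 3-class data, 50 points per class (assumption, illustration only)
centers = {0: (0.0, 0.0), 1: (3.0, 0.0), 2: (0.0, 3.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(50):
        X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
        y.append(label)

# Step 1: 80:20 split, made stratified here by holding out one of five parts
parts = stratified_folds(y, 5, rng)
test_idx = parts[0]
train_idx = [i for part in parts[1:] for i in part]
train_X, train_y = [X[i] for i in train_idx], [y[i] for i in train_idx]
test_X, test_y = [X[i] for i in test_idx], [y[i] for i in test_idx]

# Steps 2-6: compare candidate settings on the training data only,
# reusing the same folds for every candidate so the comparison is fair
folds = stratified_folds(train_y, 5, rng)
results = {n: cv_error(train_X, train_y, folds, n) for n in (1, 5, 15)}
best = min(results, key=lambda n: results[n][0])

# Step 7: "retrain" on the whole training set with the best setting
# (for k-NN this just means using all 120 training points as references).
# Step 8: evaluate once on the held-out test set.
test_error = error_rate(train_X, train_y, test_X, test_y, best)

for n, (mean, std_err) in results.items():
    print(f"n_neighbors={n}: CV error {mean:.3f} +/- {std_err:.3f}")
print(f"best n_neighbors={best}, test error {test_error:.3f}")
```

If the error bars of two settings overlap heavily, the difference between them may well be noise, which is the point of reporting the standard error in step 6. The k models fitted during cross-validation are discarded; only the chosen setting is carried over to the final fit.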

answered Sep 01 '13 at 15:05

SeanV

How do I use early stopping when training an MLP? Each fold will reach its minimum validation error at a different epoch. For the final training (step 7), when do I stop?

(Sep 09 '13 at 11:57) Ng0323

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.