For binary classification tasks, I've noticed that many studies applying machine learning never mention the distribution of the positive and negative classes. This seems very problematic because the classes are often imbalanced. Instead, they report a classifier's performance against its performance on permuted labels. Doesn't this permutation test only show a particular classifier's ability to learn the label distribution of the entire training set? To my mind, this gives no information about the usefulness of the classifier, since it hasn't been compared to the most trivial classifier: the one that always outputs whichever class is the majority in the training set. Is this a valid complaint?
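To make the complaint concrete, here is a minimal sketch of the two comparisons (assuming scikit-learn; the dataset is synthetic and purely illustrative, not the data from any of these studies):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score, permutation_test_score
from sklearn.svm import SVC

# Hypothetical imbalanced problem: ~90% negatives, ~10% positives.
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

# What the papers report: accuracy vs. a permuted-label null distribution.
svm = SVC()
score, _, pvalue = permutation_test_score(svm, X, y, cv=5,
                                          n_permutations=100, random_state=0)
print("SVM accuracy %.2f, permutation p-value %.3f" % (score, pvalue))

# What I'd like to see as well: the trivial majority-class baseline.
baseline = cross_val_score(DummyClassifier(strategy="most_frequent"), X, y, cv=5)
print("Majority-class baseline accuracy %.2f" % baseline.mean())
```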
I'm not sure this is true. Most recent work on binary classification either explicitly compares against a random baseline or uses a performance metric, such as AUC or precision/recall, that is robust to highly asymmetric classes. Or are you talking about old papers, or papers published in low-tier conferences?

This particular work was published in PLoS One in 2011. They do provide a baseline, namely their classifier's performance on a dataset with permuted labels, and they show their performance to be statistically significant against it. But I don't think you can assume that beating the permuted-label baseline means their classifier (an SVM) is better than the trivial majority-class classifier.
(Mar 06 '12 at 13:11)
crdrn
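To illustrate the point about metrics, here is a quick sketch (again assuming scikit-learn and made-up labels) of why accuracy can look impressive under class imbalance while AUC stays at chance level for an uninformative classifier:

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Hypothetical 90/10 imbalanced test labels.
y_true = np.array([0] * 900 + [1] * 100)

# An uninformative classifier that always predicts the majority class.
y_pred = np.zeros(1000, dtype=int)   # hard predictions
y_score = np.full(1000, 0.5)         # constant scores: no ranking ability

print(accuracy_score(y_true, y_pred))   # 0.90 -- misleadingly high
print(roc_auc_score(y_true, y_score))   # 0.50 -- chance level
```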
It's often trivial to beat such a classifier (one whose every prediction is the largest class); its accuracy is sometimes called the chance rate. A better trivial baseline is k-nearest neighbours with k=1, i.e. predict the label of the closest matching instance from the training set.
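A sketch of both baselines side by side (assuming scikit-learn; the dataset is synthetic):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic imbalanced problem, for illustration only.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

# Chance rate: the accuracy of always predicting the largest class.
chance_rate = max(np.bincount(y)) / len(y)

# 1-NN: predict the label of the closest training instance.
knn_scores = cross_val_score(KNeighborsClassifier(n_neighbors=1), X, y, cv=5)

print("chance rate %.2f vs 1-NN %.2f" % (chance_rate, knn_scores.mean()))
```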
On the other point, the true positive and false negative rates are often given for classifiers evaluated on highly unbalanced datasets, sometimes along with a confusion matrix. You are right, though, that this is not always done.
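For example, both rates can be read straight off a confusion matrix (a sketch with hypothetical predictions, assuming scikit-learn):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical predictions on an imbalanced test set.
y_true = np.array([0] * 90 + [1] * 10)
y_pred = np.array([0] * 95 + [1] * 5)   # classifier misses half the positives

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
tpr = tp / (tp + fn)   # true positive rate (sensitivity)
fnr = fn / (tp + fn)   # false negative rate
print("TPR %.2f, FNR %.2f" % (tpr, fnr))
```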
Thanks for the response. In many medical applications of ML, demonstrating an effect beyond the chance rate isn't easy, which is why I'm particularly bothered when publications don't provide the information needed to show it.
They argue that their permutation of the labels shows that their classifier (in this case an SVM) is learning something from the data, which is true. But they haven't shown that it is a good or useful classifier in general.
Can you name names? Which papers are you talking about? Often you will see naive attempts at using ML, and they should not be taken seriously.