Revision history
Revision n. 1

Jul 01 '10 at 14:04

Paul Mineiro

subsampling and AUC

consider a binary classification task where negative labels are more frequent than positive labels, e.g., negative labels are 10 times more likely apriori. i have this belief that if i subsample the negative labeled instances so that i have a test set that is approximately balanced, and compute an AUC on the subsampled test set, that in expectation i will get the same AUC as if i computed AUC on the complete test set (i.e., with many more negatives than positives).

is this known to be true?

Revision n. 2

Jul 02 '10 at 15:39

Joseph Turian

Does subsampling bias the AUC?

Consider a binary classification task where negative labels are more frequent than positive labels, e.g., negative labels are 10 times more likely a priori. I have this belief that if I subsample the negative-labeled instances so that I have a test set that is approximately balanced, and compute an AUC on the subsampled test set, then in expectation I will get the same AUC as if I computed AUC on the complete test set (i.e., with many more negatives than positives).

Is this known to be true?
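
One way to probe the belief empirically is a small simulation. The sketch below is not from the original post: the Gaussian score distributions, the 100/1000 class sizes, and the helper names `auc` and `mean_subsampled_auc` are all assumptions made for illustration. It computes AUC as the fraction of (positive, negative) pairs that are ranked correctly, then compares the AUC on the full imbalanced test set with the average AUC over many uniformly random, balanced subsamples of the negatives.

```python
import random


def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def mean_subsampled_auc(pos_scores, neg_scores, n_keep, n_trials, rng):
    """Average AUC over repeated uniform subsamples of the negatives.

    Hypothetical helper: each trial keeps all positives and draws n_keep
    negatives without replacement.
    """
    total = 0.0
    for _ in range(n_trials):
        kept = rng.sample(neg_scores, n_keep)
        total += auc(pos_scores, kept)
    return total / n_trials


# Assumed toy data: negatives are 10 times more frequent than positives,
# and the classifier's scores are Gaussian with shifted means.
rng = random.Random(0)
pos = [rng.gauss(1.0, 1.0) for _ in range(100)]
neg = [rng.gauss(0.0, 1.0) for _ in range(1000)]

full_auc = auc(pos, neg)
sub_auc = mean_subsampled_auc(pos, neg, n_keep=100, n_trials=200, rng=rng)

print("AUC on full test set:          ", full_auc)
print("mean AUC on balanced subsamples:", sub_auc)
```

Under these assumptions the two printed numbers should agree closely, which is consistent with the pairwise view: when negatives are dropped uniformly at random, every negative is equally likely to be kept, so the average over the surviving pairs has the same expectation as the average over all pairs. A single subsample will still be noisier than the full test set, and the argument does not cover subsampling that depends on the scores or features.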

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.