|
I have a classifier that may answer "Don't know" for an instance and leave it unclassified. What would be an appropriate evaluation measure for comparing this classifier with other classifiers? I was thinking of the F-measure, with the unclassified instances counted as false negatives. Does that make sense? TIA |
|
Are the unclassified instances actually false negatives? I'd say this depends on the application. One thing you can do is treat this as a three-class problem (positive, negative, unclassified) in which no example truly belongs to the unclassified class, but mistaking a positive or a negative for an unclassified is cheaper than mistaking a positive for a negative or vice versa. Of course, you'd have to quantify how much cheaper one kind of mistake is than the other, and these numbers should ideally be grounded in actual user experiments (for example: what is the probability that a false positive goes unnoticed? What if the instance is left unclassified instead? How much time is wasted, on average, checking the unclassified instances?). |
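As a rough illustration of the cost-based idea above, here is a minimal Python sketch that scores a classifier by its average cost per instance. The labels `"pos"`, `"neg"`, `"unk"`, the function name `average_cost`, and every number in `COSTS` are assumptions made up for the example; in practice the costs would come from the kind of user experiments described above.

```python
from collections import Counter

# Hypothetical costs, keyed by (true label, predicted label).
# These values are placeholders, not measurements: an abstention ("unk")
# is assumed to cost 0.3 of what an outright misclassification costs.
COSTS = {
    ("pos", "pos"): 0.0,
    ("neg", "neg"): 0.0,
    ("pos", "neg"): 1.0,
    ("neg", "pos"): 1.0,
    ("pos", "unk"): 0.3,
    ("neg", "unk"): 0.3,
}


def average_cost(y_true, y_pred, costs=COSTS):
    """Return the mean cost per instance under the given cost table."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one instance")
    # Count each (true, predicted) pair, then weight the counts by cost.
    counts = Counter(zip(y_true, y_pred))
    total = sum(costs[pair] * n for pair, n in counts.items())
    return total / len(y_true)


y_true = ["pos", "pos", "neg", "neg", "pos"]
always_answers = ["pos", "neg", "neg", "pos", "pos"]  # two outright errors
abstainer = ["pos", "unk", "neg", "unk", "pos"]       # two abstentions

print(average_cost(y_true, always_answers))
print(average_cost(y_true, abstainer))
```

Under these made-up costs the abstaining classifier scores lower (better) than the one that always answers. Setting the abstention cost to 1.0 would treat every "Don't know" as a full error, which is close in spirit to counting unclassified instances as false negatives.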