Clearly, a "dumb" classifier that always predicts the majority class can beat some sophisticated, elaborate models; an introductory discussion can be found at http://www.ml-class.org/course/video/list, course IX.3.

How can one deal with such datasets? For example, would using the F1 score to evaluate the model's performance help? Are there any other approaches?

Disclaimer: the same question exists on quora.com, but no answer was received so far.

asked Feb 17 '12 at 02:34


Lucian Sasu


As an aside, the performance of these 'dumb' models depends highly on your evaluation method. The F-score would favor the single-class method, but other metrics such as V-measure and Adjusted Mutual Information might give you other insights into how other models perform, since they typically score methods that predict a single class as 0.

(Feb 17 '12 at 13:02) Keith Stevens

2 Answers:

I think you answered your own question! If you measure performance only with accuracy, you miss the other dimensions. A simple technique I have used is just AUC. Based on your problem description, you should at the least use precision as well as accuracy as performance measures. F1 may work too, though I have not tried it for this purpose.
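To illustrate why accuracy alone misleads here, below is a minimal pure-Python sketch computing accuracy, precision, recall, and F1 for a majority-class predictor (the toy labels are made up for illustration, not from the question):

```python
# Toy imbalanced labels: 9 negatives, 1 positive (hypothetical data).
y_true = [0] * 9 + [1]
y_pred = [0] * 10  # a "dumb" classifier that always predicts the majority class

# Count true positives, false positives, and false negatives.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(accuracy)  # 0.9 -- looks impressive
print(f1)        # 0.0 -- reveals the classifier is useless on the minority class
```

The 90% accuracy comes entirely from the class imbalance, while precision/recall/F1 expose that no minority-class point is ever found.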

Another technique is to treat the data itself. I'm not sure what your data looks like, but you could do some pre-processing on it to offset the dominant class.

answered Feb 17 '12 at 09:39


Ryan Kirk

Sampling is often used to deal with this: you either discard many of the majority-class points (undersampling) or add minority-class points, by duplicating them (oversampling) or by synthesizing new ones (SMOTE). I've used ensemble sampling with some success: construct an ensemble of classifiers from a variety of undersampled datasets and fuse the results. Other approaches simulate this by weighting minority-class points more heavily than majority-class points.
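A minimal sketch of the undersampling step, assuming a binary problem; the `undersample` helper and the toy data are illustrative, not part of the answer:

```python
import random

def undersample(X, y, majority_label=0, seed=0):
    """Randomly keep only as many majority-class points as there are minority points."""
    rng = random.Random(seed)
    minority = [(x, t) for x, t in zip(X, y) if t != majority_label]
    majority = [(x, t) for x, t in zip(X, y) if t == majority_label]
    kept = rng.sample(majority, len(minority))  # drop the rest of the majority class
    balanced = minority + kept
    rng.shuffle(balanced)
    xs, ys = zip(*balanced)
    return list(xs), list(ys)

# Toy data: 8 majority points, 2 minority points.
X = list(range(10))
y = [0] * 8 + [1] * 2
Xb, yb = undersample(X, y)
print(sorted(yb))  # [0, 0, 1, 1] -- classes are now balanced
```

For the ensemble variant the answer describes, one would call `undersample` repeatedly with different seeds, train a classifier on each balanced sample, and average (or vote over) the predictions.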

Both of these approaches, though, suffer from a mismatch between the objective function used during training and the evaluation function (the F-measure, for example). Martin Jansche has a paper that optimizes the F-measure directly: "Maximum Expected F-Measure Training of Logistic Regression Models", Martin Jansche, Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), 2005, pp. 692-699.

I'm sure there are other efforts along these lines, but I don't have the citations at hand.

answered Feb 20 '12 at 08:26


Andrew Rosenberg

