Could someone point me to a good review paper or benchmark results showing how the current best methods perform on "small"-scale datasets (Reuters-21578, 20 Newsgroups, ...)?

So far I have only managed to find a dated 2001 comparison: http://arxiv.org/pdf/cs.ir/0110053

and a paper that is otherwise interesting (for me), but whose benchmark is just a side note:
http://www2009.eprints.org/21/1/p201.pdf

Thanks for your help!

asked Nov 22 '11 at 05:56

Máté Tóth

What really matters is which features you use and how well they correlate with the target labels. Beyond that: if you have very few training examples, use something generative such as naive Bayes; if you have far fewer examples than features, use something L1-regularized; otherwise, L2-regularized linear classifiers (logistic regression, SVMs) perform about as well as each other. The precise differences depend on the actual dataset, but these are the general trends.

(Nov 22 '11 at 08:51) Alexandre Passos ♦
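The rule of thumb in the comment above can be sketched as a small helper that looks only at the shape of the dataset. This is an illustration, not a benchmark: the function name `suggest_classifier` and the cutoffs `few_examples=100` and `wide_ratio=10` are hypothetical choices made here, since the comment gives no numbers for "very few" or "far fewer".

```python
def suggest_classifier(n_examples, n_features, few_examples=100, wide_ratio=10):
    """Return a rough model-family suggestion from dataset shape alone.

    The thresholds are assumptions for illustration only:
      few_examples -- below this many training examples counts as "very few"
      wide_ratio   -- features must exceed examples by this factor to count
                      as "far fewer examples than features"
    """
    if n_examples <= 0 or n_features <= 0:
        raise ValueError("n_examples and n_features must be positive")

    # Very little training data: prefer a generative model.
    if n_examples < few_examples:
        return "naive Bayes (generative)"

    # Many more features than examples: prefer a sparsity-inducing penalty.
    if n_examples * wide_ratio < n_features:
        return "L1-regularized linear classifier"

    # Otherwise L2-regularized linear models tend to perform similarly.
    return "L2-regularized linear classifier (logistic regression or linear SVM)"


if __name__ == "__main__":
    # Hypothetical dataset shapes, not measurements from any real corpus.
    for shape in [(50, 20000), (1000, 50000), (10000, 20000)]:
        print(shape, "->", suggest_classifier(*shape))
```

The suggestion is only a starting point. As the comment notes, the choice of features usually matters more than the choice of classifier, and the precise ranking depends on the dataset.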

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.