|
Could someone point me to a good review paper or benchmark results showing how the current best methods perform on "small"-scale datasets (Reuters-21578, 20 Newsgroups, ...)? All I have managed to find is an archaic 2001 comparison, http://arxiv.org/pdf/cs.ir/0110053, and an otherwise interesting (to me) paper in which the benchmark is just a side note: Thanks for your help! |
What really matters is which features you use and how well they correlate with the target labels. Beyond that: if you have very few training examples, use something generative such as Naive Bayes; if you have far fewer examples than features, use something L1-regularized; otherwise, L2-regularized linear classifiers (logistic regression, SVMs) all perform about as well as each other. The precise differences depend on the actual dataset, but these are the general trends.
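To make the rule of thumb concrete, here is a minimal sketch in plain Python that maps the shape of a dataset to the suggested classifier family. The function name `suggest_classifier` and both cutoffs (`few_examples=100` for "very few" and `feature_ratio=10` for "far fewer examples than features") are assumptions made up for illustration. The answer gives no numbers, so tune them to your own data.

```python
def suggest_classifier(n_examples: int, n_features: int,
                       few_examples: int = 100,
                       feature_ratio: float = 10.0) -> str:
    """Return the classifier family suggested by the rule of thumb.

    Both thresholds are hypothetical illustration values:
    - few_examples: below this many training examples, prefer a generative model.
    - feature_ratio: if features outnumber examples by at least this factor,
      prefer an L1-regularized (sparsity-inducing) linear model.
    """
    if n_examples < 0 or n_features < 0:
        raise ValueError("counts must be non-negative")

    # Very few training examples: generative models tend to need less data.
    if n_examples < few_examples:
        return "generative (e.g. Naive Bayes)"

    # Far fewer examples than features: L1 regularization prunes features.
    if n_features >= feature_ratio * n_examples:
        return "L1-regularized linear classifier"

    # Otherwise the L2-regularized linear models behave similarly.
    return "L2-regularized linear classifier (logistic regression or linear SVM)"


if __name__ == "__main__":
    # Example dataset shapes (made up for illustration).
    for shape in [(50, 5000), (1000, 50000), (10000, 20000)]:
        print(shape, "->", suggest_classifier(*shape))
```

This only encodes the dataset-size heuristic. It says nothing about feature quality, which the answer identifies as the more important factor.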