|
Hi, I have 1 million tweets that I would like to classify by category (e.g. music, movies, spam; there will be 5 categories in total). What's a good first-cut algorithm to use? I am looking for 80%+ accuracy, so I would prefer something easy to implement and tweak over a state-of-the-art black box. I work in Python, so any good examples, tutorials, or useful links are welcome. :) Thanks! |
|
Concerning features, this question may also be relevant: http://metaoptimize.com/qa/questions/8614/tweet-classifier-features-in-nltk |
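
To make the "easy to implement and tweak" requirement concrete, here is a minimal sketch of one common first-cut baseline: a bag-of-words multinomial Naive Bayes classifier with Laplace smoothing, written in plain Python. This is only an illustration, not a recommendation from the thread. The `tokenize` helper, the `NaiveBayes` class, the `alpha` parameter, and the toy training tweets are all hypothetical names and data made up for the example. Whether it reaches 80% on real tweets depends entirely on the labeled data and features.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical tokenizer: lowercase words, keeping a leading # or @ so that
# hashtags and mentions stay distinct features.
TOKEN_RE = re.compile(r"[#@]?\w+")


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes over token counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing strength (assumed default)
        self.class_counts = Counter()           # number of tweets per label
        self.word_counts = defaultdict(Counter) # token counts per label
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n / n_docs)
            for t in tokens:
                count = self.word_counts[label][t]
                score += math.log((count + self.alpha) / (total + self.alpha * vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy labeled data, invented purely for illustration.
train = [
    ("new album from my favorite band is out", "music"),
    ("listening to this song on repeat", "music"),
    ("that movie had a great plot twist", "movies"),
    ("watching a film at the cinema tonight", "movies"),
    ("click here to win a free prize now", "spam"),
    ("free followers click this link", "spam"),
]
texts, labels = zip(*train)
clf = NaiveBayes().fit(texts, labels)

for tweet in ["win a free prize", "my favorite song", "a great film tonight"]:
    print(tweet, "->", clf.predict(tweet))
```

With a real labeled sample, the natural things to tweak are the tokenizer (URLs, hashtags, emoticons) and `alpha`. The same model is also available off the shelf as `MultinomialNB` in scikit-learn and as `NaiveBayesClassifier` in NLTK, which are likely a better fit for 1 million tweets than this hand-rolled version.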