I once heard that TF/IDF was developed in the field of information retrieval and that it is not very appropriate for text classification. I am not sure how to interpret this statement.

asked Oct 18 '12 at 14:33

ouyang


2 Answers:

I don't agree with that statement. It is common to apply TF/IDF before doing text classification.

answered Oct 19 '12 at 14:42

Joseph Turian

I'm not sure I understand the question. TF/IDF is a way to quantify how significant a particular word is to a particular document within a given document set. Perhaps that is why someone told you it is not good for text classification. However, you can use TF/IDF as a feature in text classification. For example, in spam filtering, you could use the TF/IDF of each word in an email as a feature, or you could use some function and/or aggregation of the TF/IDF values of the words in an email.
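A minimal sketch of the spam-filtering idea, assuming whitespace tokenization and the plain tf × log(N/df) weighting (one common variant among several; others add smoothing). The function name `tfidf_features`, the toy emails, and the choice of "maximum weight" as the aggregate are all made up for illustration.

```python
import math
from collections import Counter


def tfidf_features(docs):
    """Return one {word: tf-idf weight} dict per document.

    tf  = occurrences of the word in the document / tokens in the document
    idf = log(number of documents / documents containing the word)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word occurs.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))

    features = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        features.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return features


# Hypothetical toy emails, not real data.
emails = [
    "win money now",
    "meeting schedule for monday",
    "win a free prize now",
]

features = tfidf_features(emails)

# Per-word features for the first email.
print(features[0])

# One possible aggregate feature: the largest TF/IDF weight in each email.
max_weight = [max(f.values()) for f in features]
print(max_weight)
```

Each dict in `features` could be fed to a classifier as a sparse feature vector, while `max_weight` is just one example of collapsing an email's weights into a single number.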

answered Oct 20 '12 at 01:02

akobre01


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.