I want to compare my semi-supervised text classification algorithm with the GE-FL algorithm in “Learning from labeled features using generalized expectation criteria” (http://people.cs.umass.edu/~mccallum/papers/druck08sigir.pdf).
I believe labeled features were discovered from the full dataset in some GE papers solely because getting humans to label data was too expensive for most experiments. Hence, if you want a fair comparison, I suggest you give both algorithms either human-labeled features or features extracted from the labeled data in similar ways. As for your example, it's hard to tell from a single point. What I'd do is plot a curve of F1 versus the number of labeled features and see which algorithm outperforms the other.

Thanks Alexandre! My next question: what is the actual labeling effort involved in the GE-FL algorithm when the features are labeled by the oracle? (number of features or number of training documents?)
(Sep 18 '13 at 06:04)
swapnilhingmire
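The comparison suggested in the answer above (F1 as a function of the number of labeled features) could be tabulated roughly as follows. This is only a minimal sketch: `macro_f1`, `f1_curve`, and the `predict` callable are hypothetical names introduced for illustration, macro-averaging is an assumption (the thread only says "F1"), and `predict(k)` stands in for whatever trains an algorithm with `k` labeled features and returns its predictions on a fixed test set.

```python
from typing import Callable, Dict, Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Macro-averaged F1 over the classes that appear in y_true (an assumed choice)."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def f1_curve(
    predict: Callable[[int], Sequence[str]],
    feature_counts: Sequence[int],
    y_true: Sequence[str],
) -> Dict[int, float]:
    """Map each number of labeled features k to the F1 obtained with k features.

    `predict` is a hypothetical hook: it should train with k labeled features
    and return predictions for the same test documents as y_true.
    """
    return {k: macro_f1(y_true, predict(k)) for k in feature_counts}
```

Calling `f1_curve` once per algorithm with the same `feature_counts` and the same test labels gives two curves that can be plotted or compared point by point, rather than judging from a single setting.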