I need to evaluate my implementations of Turney's web PMI algorithm and Hu & Liu's WordNet algorithm for analyzing the sentiment orientation of adjectives. Is there a test data set out there I can use, or some other way I can get a good handle on how well the algorithms are working? (I'm looking at them in the context of customer reviews of cameras.)
|
There's the General Inquirer dictionary, and the MPQA corpus has been annotated with word polarity information.
Thanks! Also, MPQA wins for least helpful annotation format ever.
(Jul 07 '10 at 15:54)
aditi
Yes, true enough. I wish I knew of a better sentiment dataset with word-level annotations, but I don't.
(Jul 07 '10 at 16:00)
Alexandre Passos ♦
General Inquirer is old and is best avoided. Peter Turney did large-scale polarity experiments. Try contacting him.
(Jan 03 '12 at 21:44)
Delip Rao
|
|
According to Rion Snow et al. ("Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks"), you can get this sort of NLP task annotated on Mechanical Turk for roughly $1 per 1000 labels, and 5 non-expert annotators together achieve the accuracy of one expert annotator (IIRC). So you can probably turk a bunch of labels, which hopefully you would then share with the community. I believe he also released the data that he turked when he wrote the paper, though I forget if he did this exact task.
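If you do collect several turk labels per adjective, you need some way to collapse them into one label per word. Here is a minimal sketch using a plain majority vote; this is just one simple option, not necessarily the aggregation used in the paper, and the words, labels, and function names are made up for illustration.

```python
from collections import Counter

def majority_label(labels):
    """Return the most frequent label, or None if there is no clear winner."""
    counts = Counter(labels).most_common()
    if not counts:
        return None
    # A tie between the two most frequent labels means no majority.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]

def aggregate(annotations):
    """Map each word to its majority label across annotators."""
    return {word: majority_label(labels) for word, labels in annotations.items()}

# Hypothetical turk output: adjective -> labels from different workers.
turk_labels = {
    "sharp": ["pos", "pos", "pos", "neg", "pos"],
    "blurry": ["neg", "neg", "pos", "neg", "neg"],
    "compact": ["pos", "neg", "pos", "neg"],  # tie, so no label is assigned
}

aggregated = aggregate(turk_labels)
```

Words that come back as None are probably worth sending out for more labels or dropping from the test set.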
|
There is SentiWordNet. However, both the MPQA and SentiWordNet resources are built on datasets very different from the product reviews you are looking at. I would recommend using manual judgments made specifically on your data. You could also compare against MPQA/SentiWordNet if you want.
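Whichever gold labels you end up with, the comparison itself is simple. A rough sketch, assuming both the algorithm output and the gold standard are dicts mapping adjective to "pos"/"neg" (the data and the `evaluate` helper below are hypothetical):

```python
def evaluate(predicted, gold):
    """Return (accuracy, coverage) of predicted polarities against gold labels.

    accuracy: fraction of correct labels among gold words the algorithm labeled
    coverage: fraction of gold words the algorithm labeled at all
    """
    shared = [word for word in gold if word in predicted]
    correct = sum(predicted[word] == gold[word] for word in shared)
    accuracy = correct / len(shared) if shared else 0.0
    coverage = len(shared) / len(gold) if gold else 0.0
    return accuracy, coverage

# Hypothetical manual judgments on adjectives from camera reviews.
gold = {"sharp": "pos", "blurry": "neg", "grainy": "neg", "crisp": "pos"}

# Hypothetical output of one of the algorithms; "crisp" got no prediction.
predicted = {"sharp": "pos", "blurry": "pos", "grainy": "neg"}

accuracy, coverage = evaluate(predicted, gold)
```

Reporting coverage alongside accuracy matters here, since a lexicon-based method may simply have no entry for many review-specific adjectives.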
|
There's also the Dictionary of Affect in Language (DAL), which has about 9k words manually annotated for 'Pleasantness', 'Activation', and 'Imagery'.