I need to evaluate my implementations of Turney's web PMI algorithm and Hu & Liu's WordNet algorithm for analyzing the sentiment orientation of adjectives. Is there a test data set out there I can use, or some other way I can get a good handle on how well the algorithms are working? (I'm looking at them in the context of customer reviews of cameras.)

asked Jul 07 at 13:33

aditi

edited Jul 07 at 14:05


4 Answers:

There's the General Inquirer dictionary, and the MPQA corpus has been annotated with word polarity information.

answered Jul 07 at 14:25

Alexandre Passos

Thanks! Also, MPQA wins for least helpful annotation format ever.

(Jul 07 at 15:54) aditi

Yes, true enough. I wish I knew of a better sentiment dataset with word-level annotations, but I don't.

(Jul 07 at 16:00) Alexandre Passos

According to Rion Snow et al. ("Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks"), you can get this sort of NLP task annotated on Mechanical Turk for roughly $1 per 1000 labels, and 5 human annotators achieve the accuracy of one expert annotator (IIRC).

So you can probably turk a bunch of labels, which hopefully you would then share with the community. I believe he also released the data that he turked when he wrote the paper. I forget if he did this exact task.
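If you do collect several labels per adjective, one simple way to turn them into a single gold label is a majority vote. The following is only a minimal sketch: the `majority_vote` helper, the label names, and the example adjectives are made up for illustration, and it is not the aggregation method from the paper.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label, or None if empty or tied."""
    counts = Counter(labels).most_common()
    if not counts:
        return None
    # A tie for first place means the annotators did not agree.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


# Hypothetical data: adjective -> labels from five annotators.
annotations = {
    "sharp": ["pos", "pos", "pos", "neg", "pos"],
    "blurry": ["neg", "neg", "neg", "neg", "pos"],
    "compact": ["pos", "neg", "pos", "neg", "neutral"],
}

gold = {word: majority_vote(labels) for word, labels in annotations.items()}
```

Words that come back as `None` (ties) are probably best sent out for more labels or dropped from the test set.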

answered Jul 07 at 20:27

Joseph Turian ♦♦

There's also the Dictionary of Affect in Language (DAL), which has about 9k words manually annotated for 'Pleasantness', 'Activation', and 'Imagery'.

answered Jul 13 at 09:34

Andrew Rosenberg

There is SentiWordNet. However, both the MPQA and SentiWordNet resources are built on datasets very different from the product reviews you are looking at. I would recommend using manual judgments based specifically on your data. You could compare to MPQA/SentiWordNet if you want.
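Once you have manual judgments, the comparison itself can be as simple as accuracy plus coverage. This is a rough sketch under some assumptions: the `evaluate` function, the `"pos"`/`"neg"` labels, and the sample words are hypothetical, and it assumes each algorithm's output can be reduced to a dict from adjective to polarity.

```python
def evaluate(predicted, gold):
    """Compare predicted polarities against gold labels.

    Returns (accuracy, coverage): accuracy is measured only over gold
    words the algorithm labelled; coverage is the fraction of gold
    words that received any prediction.
    """
    if not gold:
        return 0.0, 0.0
    covered = [w for w in gold if predicted.get(w) is not None]
    correct = sum(1 for w in covered if predicted[w] == gold[w])
    accuracy = correct / len(covered) if covered else 0.0
    coverage = len(covered) / len(gold)
    return accuracy, coverage


# Hypothetical manual judgments and algorithm output for camera reviews.
gold = {"sharp": "pos", "blurry": "neg", "sturdy": "pos", "noisy": "neg"}
predicted = {"sharp": "pos", "blurry": "neg", "sturdy": "neg"}

accuracy, coverage = evaluate(predicted, gold)
print(f"accuracy={accuracy:.2f} coverage={coverage:.2f}")
```

Reporting coverage separately matters here because a lexicon-based method may simply have no entry for some adjectives, which is a different failure from assigning the wrong polarity.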

answered Jul 13 at 09:25

Jeff Dalton


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.