
The paper "Determining the Sentiment of Opinions" requires seed lists of positive and negative verbs. I was able to get seed positive and negative adjectives by combining some resources with SentiWordNet. However, I need a list of some 20 positive and 20 negative verbs that can act as seeds for finding more verbs using WordNet. How can I get such a seed list?

asked Jul 17 '10 at 06:26


ArchieIndian

edited Jul 17 '10 at 13:06


Joseph Turian ♦♦


6 Answers:

I've actually been doing sentiment analysis for the last week, so here's my input: 20 positive and 20 negative verbs is a very small seed set, and WordNet gives terribly small results if you try to grow a bigger list from a seed list.

Instead, I used OpinionFinder's large (6000-word) subjectivity lexicon (the Subjectivity Lexicon link), and I found that a PMI-based approach à la Turney (a bunch of papers, all with the same general web-based PMI idea) worked very well. You can adapt this approach to context by simply including the topic words in the search queries.
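A rough sketch of a Turney-style score, for illustration only: the semantic orientation of a word is its PMI with a positive seed minus its PMI with a negative seed, estimated from co-occurrence counts. The function name `so_pmi`, the smoothing constant, and all counts below are assumptions made up for this example; in practice the counts would come from search queries (optionally with topic words added) or from your own corpus.

```python
import math


def so_pmi(near_pos, near_neg, hits_pos, hits_neg, smoothing=0.01):
    """Semantic orientation = PMI(word, pos seed) - PMI(word, neg seed).

    near_pos / near_neg: how often the word co-occurs with each seed.
    hits_pos / hits_neg: how often each seed occurs on its own.
    The smoothing term only guards against zero co-occurrence counts.
    """
    if hits_pos <= 0 or hits_neg <= 0:
        raise ValueError("seed counts must be positive")
    numerator = (near_pos + smoothing) * hits_neg
    denominator = (near_neg + smoothing) * hits_pos
    return math.log2(numerator / denominator)


# Hypothetical counts: (co-occurrences with pos seed, with neg seed).
example_counts = {"praise": (120, 10), "provoke": (8, 95)}
scores = {
    word: so_pmi(near_pos, near_neg, hits_pos=1000, hits_neg=1000)
    for word, (near_pos, near_neg) in example_counts.items()
}
```

A score above zero suggests a positive verb and a score below zero a negative one; words with scores near zero are probably best left out of a seed list.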

Also, depending on what computational resources you have, or how much data, you can do better than Kim and Hovy's paper by dependency parsing your sentences, so you know exactly which adjectives and verbs are referring to which "topics", and can detect negation extremely well.

After that, a simple beta-binomial model (see this question) works better than averaging the sentiment scores in any way (Models 1 and 2 in the paper), especially when there's not much data and your sentiment classifier doesn't always work correctly.
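One way to read the beta-binomial suggestion, as a minimal sketch rather than the exact model from the linked question: treat each classified sentence as a positive or negative draw and put a Beta prior on the positive rate. The function name and the default Beta(1, 1) prior are assumptions chosen for illustration.

```python
def posterior_positive_rate(n_pos, n_neg, prior_pos=1.0, prior_neg=1.0):
    """Posterior mean of the positive rate under a Beta(prior_pos, prior_neg) prior.

    n_pos / n_neg: number of sentences the classifier labeled positive / negative.
    """
    if n_pos < 0 or n_neg < 0:
        raise ValueError("counts must be non-negative")
    return (n_pos + prior_pos) / (n_pos + n_neg + prior_pos + prior_neg)


# With a single positive observation, a raw average would say 1.0;
# the prior pulls the estimate back toward 0.5.
one_observation = posterior_positive_rate(1, 0)
many_observations = posterior_positive_rate(100, 0)
```

With little data the estimate stays close to the prior, and it moves toward the observed proportion as evidence accumulates, which is why this tends to be more robust than plain averaging when the classifier is noisy.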

answered Jul 18 '10 at 15:35


aditi

Thanks for the information. Can you suggest a few must-read papers?

(Jul 19 '10 at 21:42) ArchieIndian

Okay, so the best survey I've found is Pang and Lee's "Opinion Mining and Sentiment Analysis" [google scholar]. In addition to being informative, it contains tons of references to the various different approaches. From there, you can decide what to read based on what's appropriate for your situation.

For more information on extracting features, look at Popescu & Etzioni's "Extracting Product Features and Opinions From Reviews" [google scholar].

(Jul 20 '10 at 14:45) aditi

Thanks a lot :)

(Jul 20 '10 at 22:32) ArchieIndian

If it's just about 20+20 verbs, do it manually; it will take 10 minutes. Start with

{love, like, praise, enjoy, share, hug, visit, remember, dazzle, help, ...} and

{hate, dislike, die, cease, ignore, warn, provoke, kill, pay, require, ...}.

You can also look at amazon.com or imdb.com positive and negative reviews and get some verbs from there.

The problem with these tasks is that most words can be positive or negative, depending on the context. That's even more true for verbs than for adjectives because, to express a negative meaning, we often use a positive verb together with negation ("didn't help"), where the corresponding adjective could be directly negative ("unhelpful"). So you should definitely look at negation as a feature in your model.
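A minimal sketch of negation as a feature, assuming the text is already tokenized: every token after a negation word gets a `_NEG` suffix until the next punctuation mark, so "help" and "help_NEG" become separate features. The word list, the suffix, and the clause-boundary rule are simplifications chosen for this example.

```python
import re

# Illustrative negation cues; contractions ending in "n't" are caught separately.
NEGATIONS = {"not", "no", "never", "nothing", "neither", "nor"}
CLAUSE_END = re.compile(r"^[.,;:!?]$")


def mark_negation(tokens):
    """Append _NEG to tokens between a negation cue and the next punctuation."""
    marked = []
    negated = False
    for token in tokens:
        lowered = token.lower()
        if CLAUSE_END.match(token):
            negated = False          # punctuation closes the negated scope
            marked.append(token)
        elif lowered in NEGATIONS or lowered.endswith("n't"):
            negated = True           # the cue itself is left unmarked
            marked.append(token)
        elif negated:
            marked.append(token + "_NEG")
        else:
            marked.append(token)
    return marked


example = mark_negation(["it", "didn't", "help", ".", "great"])
# example == ["it", "didn't", "help_NEG", ".", "great"]
```

This window-based rule is crude; a dependency parse gives a more precise negation scope if you can afford one.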

answered Jul 17 '10 at 09:04


Frank

edited Jul 17 '10 at 09:05

My suggestion would be to use the movie reviews corpus that is included with NLTK. It contains 2000 documents, each labeled as positive or negative. You could use raw counts to see which words that WordNet identifies as verbs appear most frequently in one set and not in the other.
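The counting idea could be sketched roughly as follows. The helper names, `top_n`, and `min_count` are assumptions for illustration; `load_movie_review_inputs` relies on NLTK's `movie_reviews` and `wordnet` corpora having been downloaded and is kept separate so the counting logic can be tried on any token lists.

```python
from collections import Counter


def distinctive_verbs(pos_tokens, neg_tokens, is_verb, top_n=20, min_count=3):
    """Rank verbs by (count in positive docs - count in negative docs)."""
    pos_counts = Counter(token.lower() for token in pos_tokens)
    neg_counts = Counter(token.lower() for token in neg_tokens)
    diff = {}
    for word in set(pos_counts) | set(neg_counts):
        total = pos_counts[word] + neg_counts[word]
        if total >= min_count and is_verb(word):
            diff[word] = pos_counts[word] - neg_counts[word]
    ranked = sorted(diff, key=lambda w: (-diff[w], w))
    positive = [w for w in ranked if diff[w] > 0][:top_n]
    negative = [w for w in reversed(ranked) if diff[w] < 0][:top_n]
    return positive, negative


def load_movie_review_inputs():
    """Load tokens and a WordNet-based verb test from NLTK (corpora required)."""
    from nltk.corpus import movie_reviews, wordnet

    pos_tokens = movie_reviews.words(categories="pos")
    neg_tokens = movie_reviews.words(categories="neg")

    def is_verb(word):
        # True if WordNet lists at least one verb sense for the word.
        return bool(wordnet.synsets(word, pos=wordnet.VERB))

    return pos_tokens, neg_tokens, is_verb


# Toy data standing in for the two halves of the corpus.
toy_pos = ["love"] * 4 + ["hate"] + ["table"] * 5
toy_neg = ["hate"] * 5 + ["love"]
toy_result = distinctive_verbs(toy_pos, toy_neg, lambda w: w in {"love", "hate"})
```

Note that WordNet lists a verb sense for many words that are mostly used as nouns (for example "film"), so the output still needs a manual pass; normalizing by document frequency instead of using raw differences may also help.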

answered Jul 18 '10 at 06:47


Joel H

You can look at a larger polarity dictionary, like the General Inquirer. It seems to have verbs. Or, if you just need 20 words, grab a random blog post and tag some verbs manually.

answered Jul 17 '10 at 08:10


Alexandre Passos ♦

We wrote a paper on automatically extracting the polarities of subjective words from a common-sense knowledge base (in our case, ConceptNet). This is a link to our paper. We validate our approach by comparing it to the MPQA Subjectivity Lexicon mentioned by Aditi.

answered Oct 10 '10 at 01:00


Dexter

The one I used for sentiment analysis on Twitter was the list at http://twitrratr.com. I think it's pretty neat. Also, something you did not ask for, but I'll say it anyway: a huge percentage of words on Twitter are misspelt, so do keep in mind to use a good spellchecker, or no number of word lists will be able to make up for that. Hope that helps.

answered May 20 '11 at 06:07


crazyaboutliv

edited May 20 '11 at 06:09


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.