Revision history
Revision n. 1

Jul 14 '10 at 18:17

aditi

Using shrinkage estimators to mitigate small sample size in sentiment analysis

I'm trying to assign a sentiment score (between -1 and +1) to each feature of a product, using the sentiment scores of the adjectives that describe that feature. Right now, I do this by averaging the sentiment scores (+1, -1, or 0) of all the adjectives that apply to a feature. The problem is that some features have many adjectives (more than 20), whereas others have just 1 or 2. I want to shrink my estimated sentiment towards some average value when there are only a few observations. What are some principled ways to do this, and where can I learn more?
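To make the setup concrete, here is a minimal sketch of the plain average next to one simple form of shrinkage, a pseudo-count (sometimes called a Bayesian average) estimate that pulls each feature's mean towards the global mean. It is only one possible approach, not necessarily the most principled one. The function names, the example data, and the `prior_strength` value are hypothetical and would need tuning or estimating from the data.

```python
from statistics import mean


def raw_score(scores):
    """Plain average of the adjective scores (+1, -1, or 0) for one feature."""
    return mean(scores)


def shrunk_score(scores, prior_mean, prior_strength=5.0):
    """Pseudo-count shrinkage of a feature's mean towards prior_mean.

    prior_strength (an assumed tuning parameter) acts like that many extra
    observations, each equal to prior_mean. Features with few adjectives are
    pulled strongly towards the prior; features with many are barely moved.
    """
    n = len(scores)
    return (sum(scores) + prior_strength * prior_mean) / (n + prior_strength)


# Hypothetical data: one feature with many adjectives, one with a single adjective.
features = {
    "battery": [1, 1, -1, 1, 0, 1, 1, 1, -1, 1, 1, 0, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1],
    "strap": [-1],
}

# Use the mean over all adjective scores as the value to shrink towards.
global_mean = mean(s for scores in features.values() for s in scores)

for name, scores in features.items():
    print(name, round(raw_score(scores), 3), round(shrunk_score(scores, global_mean), 3))
```

With `prior_strength` set to 0 the estimate reduces to the plain average. Larger values shrink harder, so the single-adjective feature ends up much closer to the global mean than the well-described one.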

Revision n. 2
elaborated on question

Jul 14 '10 at 18:22

aditi

A secondary problem is that the sentiment scores attached to adjectives have an F1 of only about 75%, so an adjective labeled -1 (negative) or +1 (positive) may actually be neutral. While this may not be a problem when there are many adjectives describing a feature, how do I adjust for it when there are only a few? Can I shrink these scores as well?
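One rough way to picture shrinking the labels themselves is to replace each label with its expected value under an assumed error model. The sketch below assumes that a polar label (+1 or -1) is correct with probability `reliability` and is otherwise truly neutral. Treating the 75% F1 figure as that probability is a simplification, since F1 is not a probability, and the function names are hypothetical.

```python
def expected_score(label, reliability=0.75):
    """Expected true score of one labeled adjective.

    Assumption: a +1 or -1 label is right with probability `reliability`
    and is really neutral (0) otherwise, so its expected value is
    reliability * label. A 0 label stays 0.
    """
    return reliability * label


def discounted_scores(labels, reliability=0.75):
    """Apply the same discount to every adjective label of a feature."""
    return [expected_score(label, reliability) for label in labels]


# Hypothetical feature described by only two adjectives.
labels = [1, -1]
print(discounted_scores(labels))
```

Note that a single global discount rescales every feature's average by the same factor, so on its own it does not treat sparse features differently. It starts to matter when combined with shrinkage towards a prior, or when different adjectives are given different reliabilities.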

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.