Hi all,

Is there a method for determining the cut-off point in a power-law distribution? I am doing a tagging study and only want to keep the top "x" tags, so how do I determine "x"? Can I just pick an arbitrary fraction (e.g. the top 20%), or should I set a threshold such as count(tag) > 10?

asked Jan 07 at 22:27

cherhan


One Answer:

Power laws are heavy-tailed (that is, a lot of their probability mass lies in infrequent items), so there is no natural cut-off point (unlike, say, normal distributions, where the tail is very light and can be safely ignored). You're better off choosing a downstream performance metric and picking the cutoff that optimizes it.
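A minimal sketch of that idea in Python, under some loud assumptions: the tag counts are toy Zipf-like data (tag i occurs about 1000 / i times), and `toy_score` is a purely hypothetical stand-in for whatever downstream metric your study actually cares about (here, coverage minus a small penalty per kept tag). `top_x_coverage` and `best_cutoff` are illustrative helper names, not part of any library.

```python
from collections import Counter


def top_x_coverage(tag_counts, x):
    """Fraction of all tag assignments covered by the x most frequent tags."""
    total = sum(tag_counts.values())
    top = sum(count for _, count in tag_counts.most_common(x))
    return top / total


def best_cutoff(tag_counts, candidates, score):
    """Return the candidate x whose top-x tag set gets the highest score."""
    def evaluate(x):
        kept = {tag for tag, _ in tag_counts.most_common(x)}
        return score(kept)
    return max(candidates, key=evaluate)


# Hypothetical toy data: 200 tags with Zipf-like frequencies.
tag_counts = Counter({f"tag{i}": 1000 // i for i in range(1, 201)})


def toy_score(kept):
    """Stand-in for a real downstream metric (an assumption, not a recommendation)."""
    covered = sum(tag_counts[t] for t in kept) / sum(tag_counts.values())
    return covered - 0.002 * len(kept)


# Coverage keeps growing slowly as x increases: the tail never becomes negligible.
for x in (10, 20, 50, 100, 200):
    print(x, round(top_x_coverage(tag_counts, x), 3))

# Pick the cutoff by the (toy) downstream score instead of an arbitrary threshold.
print("chosen x:", best_cutoff(tag_counts, [10, 20, 50, 100], toy_score))
```

With real data you would replace `toy_score` with the actual evaluation, for example the accuracy of a tag recommender trained only on the kept tags, and sweep the candidate values of x the same way.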

answered Jan 10 at 22:14

Alexandre Passos ♦


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.