Chapter 5 of Mining Text Data [http://rd.springer.com/chapter/10.1007%2F978-1-4614-3223-4_5#] states that the Dirichlet distribution has the following property:
The Dirichlet distribution favors imbalanced multinomial distributions, where most of the probability mass is concentrated on a small number of values. As a result, it is well suited for models that reflect commonly observed power law distributions in human language.

I do not understand how the Dirichlet distribution reflects power-law distributions in human language.
Could you elaborate on the connection between the Dirichlet distribution and human language?

I would also appreciate some intuition behind the use of the Dirichlet distribution in LDA.
Pointers to appropriate resources would be welcome too.

asked May 27 '13 at 05:25

swapnilhingmire


One Answer:

I believe this sentence is false. Dirichlet distributions do not obey power laws. There is, however, a variant of the Dirichlet process that can be made to obey a power law: the Pitman-Yor process.
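
To make the distinction concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not code from the book chapter or from any LDA implementation, and the function names `sample_symmetric_dirichlet` and `seat_customers` are hypothetical. The first function shows what the quoted sentence is probably getting at: a symmetric Dirichlet with concentration below 1 tends to put most of its mass on a few outcomes, which is sparsity rather than a power law. The second simulates the two-parameter Chinese restaurant process, where `discount = 0` corresponds to the Dirichlet process and `0 < discount < 1` to the Pitman-Yor process. With a positive discount the number of occupied tables grows roughly like a power of the number of customers, instead of logarithmically.

```python
import random


def sample_symmetric_dirichlet(k, alpha, rng):
    """Draw one point from a symmetric Dirichlet(alpha, ..., alpha) over k outcomes
    by normalising k independent Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [g / total for g in draws]


def top_mass(probs, m):
    """Total probability carried by the m largest components."""
    return sum(sorted(probs, reverse=True)[:m])


def seat_customers(n, theta, discount, rng):
    """Simulate a two-parameter Chinese restaurant process and return table sizes.

    discount == 0     -> Dirichlet process with concentration theta
    0 < discount < 1  -> Pitman-Yor process
    """
    tables = []
    for i in range(n):
        # Unnormalised weights sum to theta + i:
        #   new table:      theta + discount * (number of tables)
        #   existing table: (table size) - discount
        r = rng.random() * (theta + i)
        new_mass = theta + discount * len(tables)
        if r < new_mass:
            tables.append(1)
            continue
        r -= new_mass
        for t, size in enumerate(tables):
            r -= size - discount
            if r < 0:
                tables[t] += 1
                break
        else:
            # Floating-point rounding guard: assign to the last table.
            tables[-1] += 1
    return tables


if __name__ == "__main__":
    rng = random.Random(0)

    # Sparsity of Dirichlet draws: small alpha vs. large alpha.
    for alpha in (0.1, 10.0):
        p = sample_symmetric_dirichlet(50, alpha, rng)
        print(f"alpha={alpha}: mass on top 5 of 50 outcomes = {top_mass(p, 5):.2f}")

    # Growth in the number of tables: Dirichlet process vs. Pitman-Yor.
    for discount in (0.0, 0.5):
        sizes = seat_customers(3000, 1.0, discount, rng)
        print(f"discount={discount}: {len(sizes)} tables, largest = {max(sizes)}")
```

Plotting the sorted table sizes from `seat_customers` on log-log axes for the two discount values is one way to see the heavy tail that appears only in the Pitman-Yor case.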

answered May 27 '13 at 08:14

Alexandre Passos ♦

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.