
The question is related to this question. However, I'm seeking a practical introduction to Dirichlet Processes, or to non-parametric methods in general. I have reviewed some introductory papers and surveys (which usually lack detail). What is the next step? How can I get some hands-on experience? For example, what algorithm would be reasonable for a beginner in this area to implement and play with?

Do we have some public code for Dirichlet Processes already?

Thanks and any suggestions are welcome.

asked Aug 13 '10 at 01:20


Liangjie Hong

edited Aug 13 '10 at 01:23


3 Answers:

As for Dirichlet Processes, inference is the key issue. Radford Neal's paper describes a number of MCMC algorithms for inference in DP mixture models. I'd suggest reading it and trying to implement some of them on your own.
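To make this concrete, here is a minimal sketch of one such sampler: collapsed Gibbs sampling for a DP mixture of 1-D Gaussians with fixed, known variances, in the spirit of Neal's Algorithm 3 for conjugate models. The function name, default hyperparameters, and the normal-normal setup are illustrative choices of mine, not taken from the paper.

```python
import math
import random

def dp_mixture_gibbs(data, alpha=1.0, sigma=1.0, tau=3.0, iters=200, seed=0):
    """Collapsed Gibbs sampling for a DP mixture of 1-D Gaussians.
    Likelihood N(mu_k, sigma^2); conjugate prior mu_k ~ N(0, tau^2).
    Returns a list of cluster assignments, one per data point."""
    rng = random.Random(seed)
    n = len(data)
    z = [0] * n                          # start with everything in one cluster
    for _ in range(iters):
        for i in range(n):
            z[i] = -1                    # remove point i from its cluster
            labels = sorted(set(z) - {-1})
            weights, choices = [], []
            # existing clusters: CRP prior (m_k) times posterior predictive
            for k in labels:
                members = [data[j] for j in range(n) if z[j] == k]
                m = len(members)
                prec = 1.0 / tau**2 + m / sigma**2      # posterior precision of mu_k
                mean = (sum(members) / sigma**2) / prec  # posterior mean of mu_k
                var = 1.0 / prec + sigma**2              # predictive variance
                w = m * math.exp(-(data[i] - mean)**2 / (2 * var)) \
                      / math.sqrt(2 * math.pi * var)
                weights.append(w); choices.append(k)
            # brand-new cluster: alpha times prior predictive N(0, tau^2 + sigma^2)
            var0 = tau**2 + sigma**2
            w_new = alpha * math.exp(-data[i]**2 / (2 * var0)) \
                          / math.sqrt(2 * math.pi * var0)
            weights.append(w_new); choices.append(max(labels, default=-1) + 1)
            # sample the new assignment proportionally to the weights
            r = rng.random() * sum(weights)
            acc = 0.0
            for w, k in zip(weights, choices):
                acc += w
                if r <= acc:
                    z[i] = k
                    break
            else:
                z[i] = choices[-1]
    return z
```

Running this on two well-separated groups of points should recover two clusters after a modest number of sweeps; from there it is a short step to resampling cluster parameters explicitly (Neal's Algorithm 2) or handling non-conjugate likelihoods (Algorithm 8).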

Besides, Yee Whye Teh has some nice tutorials (and code, mostly consisting of example DP-style models) that you can look at. Some more DP and hierarchical DP code is available here.

As for other nonparametric Bayesian models, Hannes Nickisch and Carl Rasmussen recently released public code for Gaussian Processes, which also covers lots of practical implementation details.
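As a warm-up before digging into that toolbox, plain GP regression with an RBF kernel fits in a few lines. This sketch follows the standard Cholesky-based formulation (as in Rasmussen and Williams' textbook); the function name and default hyperparameters are illustrative, not from their code.

```python
import numpy as np

def gp_regression(X_train, y_train, X_test, length_scale=1.0, noise=1e-2):
    """GP regression with an RBF kernel on 1-D inputs.
    Returns the posterior mean and variance at each test point."""
    def rbf(A, B):
        d2 = (A[:, None] - B[None, :]) ** 2
        return np.exp(-0.5 * d2 / length_scale**2)

    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf(X_train, X_test)
    K_ss = rbf(X_test, X_test)
    L = np.linalg.cholesky(K)                  # K = L L^T, stable inversion
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha                       # posterior mean
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0) # posterior variance
    return mean, var
```

With a small noise term the posterior mean nearly interpolates the training data, and the variance shrinks toward zero at observed inputs, which is a good sanity check for any GP implementation.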

For Indian Buffet Process (IBP) models, there is a technical report explaining the basics (plus Gibbs sampling details) in reasonable depth, which you should find useful to begin with. Some inference code (both variational and MCMC) is also available here. As for models based on the IBP, you can try implementing models such as inferring hidden causes or infinite sparse factor analysis.
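Before tackling inference, it helps to get a feel for the IBP by sampling binary feature matrices directly from the prior: customer i takes each existing dish k with probability m_k/i, then samples Poisson(alpha/i) new dishes. The following is a small sketch of that process (the function name and defaults are mine, not from the report):

```python
import math
import random

def sample_ibp(n_customers, alpha=2.0, seed=0):
    """Draw a binary feature (customer x dish) matrix from the IBP prior."""
    rng = random.Random(seed)
    dish_counts = []                 # m_k: how many customers took dish k
    rows = []
    for i in range(1, n_customers + 1):
        row = []
        # take each existing dish with probability m_k / i
        for k, m_k in enumerate(dish_counts):
            take = 1 if rng.random() < m_k / i else 0
            row.append(take)
            dish_counts[k] += take
        # sample Poisson(alpha / i) new dishes by inverse-CDF
        lam = alpha / i
        u = rng.random()
        p = math.exp(-lam)           # P(X = 0)
        cum, x = p, 0
        while u > cum:
            x += 1
            p *= lam / x
            cum += p
        for _ in range(x):
            row.append(1)
            dish_counts.append(1)
        rows.append(row)
    # pad earlier rows with zeros for dishes introduced later
    K = len(dish_counts)
    return [r + [0] * (K - len(r)) for r in rows]
```

Plotting a few draws for different alpha values makes the IBP's "rich get richer" behavior and the logarithmic growth in the number of active features quite tangible.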

answered Aug 13 '10 at 01:54


spinxl39

edited Aug 13 '10 at 01:58

@spinxl39, wow! This is indeed very informative! Thanks a lot!

(Aug 13 '10 at 02:00) Liangjie Hong

Dan Klein's tutorial on nonparametric Bayes with variational inference is as practical as you can get: it shows how to understand variational algorithms, how to use complex DP and HDP models, and how to switch from EM to variational inference on a Bayesian model.

answered Aug 13 '10 at 12:44


Alexandre Passos ♦

For hands-on experience you can try GibbsLDA++ on some sample corpus.

answered Aug 13 '10 at 01:43


ashish

@ashish: Thanks. I understand LDA and its extensions. What I'm seeking here is non-parametric methods and Dirichlet Processes.

(Aug 13 '10 at 01:45) Liangjie Hong

@Liangjie: LDA is non-parametric; it automatically infers the model size/complexity (number of tags) from the data. Or is it not, in some stricter sense?

(Aug 13 '10 at 10:04) Frank

@Frank: LDA as it was originally presented is parametric, since you need to pre-specify the number of topics. However, there is a trivial extension to LDA using hierarchical Dirichlet processes, called HDP-LDA, that is nonparametric and infers the number of topics automatically. Its specification is slightly odd, but here it is:

all_topics ~ DP(alpha_1, Dirichlet(all_words, beta))
topics_d   ~ DP(alpha_2, all_topics)
z_di       ~ topics_d
w_di       ~ Discrete(z_di)

This means that there is a global set of topics, from which the topics for each document are sampled. Each word then chooses a topic from that document's topic distribution (which is a DP with the global DP as its base measure) and samples a word from that topic (which is itself drawn from a Dirichlet distribution over words).
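The spec above can be sampled from approximately with a truncated stick-breaking (GEM) draw for the global topic weights and a Dirichlet(alpha_2 * weights) approximation for each document-level DP. This is only a rough generative sketch to build intuition; all names and defaults are illustrative:

```python
import random

def sample_hdp_lda(n_docs, doc_len, vocab, alpha1=1.0, alpha2=1.0,
                   beta=0.5, truncation=20, seed=0):
    """Truncated generative sketch of HDP-LDA. Returns a list of documents,
    each a list of words drawn from per-document mixtures of shared topics."""
    rng = random.Random(seed)
    V = len(vocab)

    def dirichlet(params):
        g = [rng.gammavariate(a, 1.0) for a in params]
        s = sum(g)
        return [x / s for x in g]

    def categorical(probs):
        r, acc = rng.random(), 0.0
        for idx, p in enumerate(probs):
            acc += p
            if r <= acc:
                return idx
        return len(probs) - 1

    # global topic weights: truncated stick-breaking GEM(alpha1)
    remaining, weights = 1.0, []
    for _ in range(truncation - 1):
        b = rng.betavariate(1.0, alpha1)
        weights.append(remaining * b)
        remaining *= 1.0 - b
    weights.append(remaining)

    # each topic is a Dirichlet(beta) distribution over the vocabulary
    topics = [dirichlet([beta] * V) for _ in range(truncation)]

    docs = []
    for _ in range(n_docs):
        # document-level DP approximated by Dirichlet(alpha2 * global weights)
        theta = dirichlet([alpha2 * w + 1e-12 for w in weights])
        doc = []
        for _ in range(doc_len):
            z = categorical(theta)          # pick a topic for this word
            doc.append(vocab[categorical(topics[z])])
        docs.append(doc)
    return docs
```

Because the document-level weights are concentrated around the shared global weights, documents reuse the same small set of topics, which is exactly the behavior the hierarchical construction is designed to produce.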

(Aug 13 '10 at 10:18) Alexandre Passos ♦


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.