
I am reading the paper on the merge-split sampler for conjugate Dirichlet processes by David B. Dahl. It says that the Gibbs sampler makes local moves one variable at a time and hence gets stuck in local modes and mixes poorly. What exactly is meant by this statement?

asked Oct 21 '11 at 07:36


Lancelot


2 Answers:

Let's break the statement down and look at what is meant by local moves, getting stuck in local modes, and mixing poorly.

Gibbs sampling samples variables one at a time from their conditional distributions. That is, you choose a variable x_i and give it a value sampled from

P(x_i | x_1, ..., x_{i-1}, x_{i+1}, ..., x_n)

This is what is meant by local moves -- we use the current values of all the other variables to choose a new value for one variable, so each move changes the state only slightly.

A local mode is a point in our variable space -- that is, an assignment of values to the variables -- that has higher probability than other nearby points. Gibbs sampling can get stuck in local modes. Imagine all the variables are currently at a local mode. When we come to resample a variable, it is likely we'll set it back to the same value it had before, as this is the highest-probability value given everything else. The Wikipedia page on Gibbs sampling discusses this.

Mixing poorly means the chain is slow to explore the posterior distribution. If Gibbs sampling gets stuck in a local mode it will not sample from the entire variable space, and thus the samples will not be representative of the posterior. In this case the chain is said to mix poorly.
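Here is a tiny runnable illustration of getting stuck (my own toy example, not from Dahl's paper): a joint distribution over two binary variables with two strong modes, (0,0) and (1,1). Because Gibbs updates one variable at a time, moving between the modes requires passing through a very unlikely mixed state, so mode switches are rare and the chain mixes slowly.

```python
import random

# Toy joint distribution over two binary variables (assumed example):
# the modes (0,0) and (1,1) each have probability 0.49,
# the mixed states (0,1) and (1,0) only 0.01.
P = {(0, 0): 0.49, (1, 1): 0.49, (0, 1): 0.01, (1, 0): 0.01}

def gibbs(n_steps, seed=0):
    rng = random.Random(seed)
    x, y = 0, 0  # start in the (0,0) mode
    samples = []
    for _ in range(n_steps):
        # Sample x from P(x | y)
        p1 = P[(1, y)] / (P[(0, y)] + P[(1, y)])
        x = 1 if rng.random() < p1 else 0
        # Sample y from P(y | x)
        p1 = P[(x, 1)] / (P[(x, 0)] + P[(x, 1)])
        y = 1 if rng.random() < p1 else 0
        samples.append((x, y))
    return samples

samples = gibbs(10000)

# Count how often consecutive samples jump from one mode to the other.
modes = ((0, 0), (1, 1))
mode_switches = sum(
    1 for a, b in zip(samples, samples[1:])
    if a in modes and b in modes and a != b
)
print(mode_switches)  # a small fraction of the 10000 sweeps
```

From the (0,0) mode, flipping x first requires drawing the 0.02-probability value, so the chain switches modes only a couple of hundred times in 10,000 sweeps, even though the two modes are equally probable. An independent sampler would alternate between them about half the time.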

answered Oct 24 '11 at 02:08


Noel Welsh

Thanks for the explanation, now I understand it. Einstein once said, "If you can't explain it simply, you don't understand it well enough." You surely understand the concepts well.

(Oct 24 '11 at 03:40) Lancelot

Thanks! I'm glad my answer helped you.

(Oct 25 '11 at 16:25) Noel Welsh

In Gibbs sampling you draw samples from a joint distribution P(a,b,c,d) via the following algorithm:

    set a, b, c, d = values chosen uniformly at random
    for i = 1 to K, where K is large:
        set a = random sample from P(a | b, c, d)
        set b = random sample from P(b | a, c, d)
        set c = random sample from P(c | a, b, d)
        set d = random sample from P(d | a, b, c)
    endfor
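The sketch above can be made concrete with a runnable example. Here is a Gibbs sampler for a standard bivariate Gaussian with correlation rho (an assumed example with two variables instead of four); its full conditionals are known in closed form: a | b ~ N(rho*b, 1 - rho^2), and symmetrically for b.

```python
import random

# Gibbs sampling from a standard bivariate Gaussian with correlation rho.
# Full conditionals: a | b ~ N(rho * b, 1 - rho^2), and symmetrically for b.
rho = 0.9
rng = random.Random(42)

a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)  # arbitrary initialisation
cond_sd = (1 - rho ** 2) ** 0.5
K = 20000
samples = []
for _ in range(K):
    a = rng.gauss(rho * b, cond_sd)  # a ~ P(a | b)
    b = rng.gauss(rho * a, cond_sd)  # b ~ P(b | a)
    samples.append((a, b))

# After burn-in, the empirical correlation should approach rho.
xs = [s[0] for s in samples[1000:]]
ys = [s[1] for s in samples[1000:]]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
vx = sum((x - mx) ** 2 for x in xs) / n
vy = sum((y - my) ** 2 for y in ys) / n
corr = cov / (vx * vy) ** 0.5
print(round(corr, 2))  # close to 0.9
```

Note that the higher rho is, the smaller the conditional variance 1 - rho^2, so each move is shorter and the chain crawls along the diagonal -- the one-variable-at-a-time "local moves" the question asks about.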

This is a special case of the Metropolis-Hastings algorithm (one whose proposals are always accepted). Note how it updates only one variable at a time. Other Metropolis-Hastings methods sample several variables at once, e.g. sampling a, b jointly from P(a,b | c,d).

This is pretty basic stuff, and it is worth knowing the basics before reading advanced papers like the one by David B. Dahl you mention in your question. A good place to start would be chapter 11 of Pattern Recognition and Machine Learning by Christopher Bishop. Daphne Koller's book on probabilistic graphical models should also cover this. Or at least start with the relevant Wikipedia articles.

answered Oct 21 '11 at 21:24


Ian Goodfellow



User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.