# Can a Metropolis sampler be tricked into walking in the wrong direction for extended periods of time?

I'm using a Metropolis sampler to approximate the joint probability of the data likelihood and a Gaussian prior over the parameters, p(X|theta)p(theta). When I use a fake prior that is always equal to 1, the algorithm behaves as I would expect: it initially finds some mode of high probability and wanders around in that area. However, when I add the prior, the objective goes down systematically after a while. Every time the sampler makes a step, all the new proposals have lower probability, and it rejects most of them until one is just a bit lower. This goes on until the algorithm starts behaving more like a random walk again, in a region of quite low probability.

This seems very odd to me, as the sampler should have an equal probability of going back in the direction of higher probability it just came from (I'm using a spherical Gaussian to generate proposals). I'm quite sure the sampling algorithm is correct, as it worked well for a variety of tasks where I didn't include the prior. Even if the prior were flattening the joint distribution a lot, I would still expect random-walk behavior and not the systematic decrease I find now. Could this be due to numerical issues, high dimensionality, or is this possible for certain peculiar types of distributions?

asked Dec 30 '10 at 13:33 by Philemon Brakel

Comments:

- Are you including the prior in the computation of the likelihood that you use for the Metropolis step? I've had a similar bug in the past, and it was due to inconsistent use of the priors. Does this still happen with less data? It sounds like something that shouldn't happen unless you're using a very bad proposal distribution that never proposes something in a good direction. Can you try stopping the simulation, proposing a better step, and seeing how often it's accepted / what its probability is? (Dec 30 '10 at 20:33) Alexandre Passos ♦

- Thanks for the help. I'm indeed including the prior in the computation of the likelihood that I use for the Metropolis step, by simply adding the log of it. Is this wrong somehow? The problem seems to be independent of the number of data points, which I varied between 50 and 4000. I'm not really sure how to try your suggestion of proposing a better step manually. When I print the probabilities of accepting new steps, the probability is 1 when the new likelihood is higher and less than 1 when it is lower, as it should be, but somehow the proposals are almost always lower than the current value. They are also not that much lower, just a bit, so they are easily accepted. After accepting one of these lower-probability steps, all the new proposals are immediately slightly lower than the value that has just been accepted. The acceptance rate seems to be lower in the beginning than later on. This seems odd as well, because I'm used to the opposite pattern (needing big steps for burn-in and smaller ones when sampling after convergence). (Dec 31 '10 at 06:30) Philemon Brakel

- That's interesting. So if the acceptance is right (1 with higher likelihood, smaller as the likelihood goes down), the problem can only be in your proposal distribution. Is it symmetrical? Is there a bug in computing it? (Dec 31 '10 at 06:33) Alexandre Passos ♦

- The proposal distribution is a spherical Gaussian centered on the last accepted sample. I'm using a standard implementation from Python's numpy for this. I printed separate scores for the data likelihood and the prior, and noticed that the data likelihood always goes up until some sort of mode is reached, while the prior likelihood always goes down very slowly. This causes the total likelihood to go up first but then go down systematically forever after the data likelihood maxes out. I'm starting to suspect the function that computes the prior likelihood. It seems to work well in two dimensions though, and even if it were unstable I wouldn't expect it to be so systematic that Metropolis can't cope with it. I'll try some simpler (spherical) priors to see if that is where the problem comes from. You were talking about having a similar problem due to inconsistent priors, and I'm starting to think this might indeed be what causes the strange results I'm finding now. (Dec 31 '10 at 09:05) Philemon Brakel
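For reference, here is a minimal sketch (not the poster's actual code) of the setup being discussed: a Metropolis step with a symmetric spherical-Gaussian proposal, where the Gaussian prior is folded in by adding its log-density to the log-likelihood. The `log_likelihood` function and the `sigma_prior` and `step_size` values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_prior(theta, sigma_prior=1.0):
    # Log-density of a spherical Gaussian prior, up to an additive constant.
    return -0.5 * np.sum(theta**2) / sigma_prior**2

def metropolis_step(theta, log_likelihood, step_size=0.1):
    # Symmetric spherical-Gaussian proposal centered on the current sample,
    # as described in the thread.
    proposal = theta + step_size * rng.standard_normal(theta.shape)
    # Work entirely in log space: log posterior = log likelihood + log prior.
    log_p_current = log_likelihood(theta) + log_prior(theta)
    log_p_proposal = log_likelihood(proposal) + log_prior(proposal)
    # Accept with probability min(1, p(proposal) / p(current)):
    # always when the proposal is better, sometimes when it is worse.
    if np.log(rng.uniform()) < log_p_proposal - log_p_current:
        return proposal
    return theta
```

Comparing one's own implementation against a skeleton like this can help isolate whether the prior term or the proposal is at fault, e.g. by temporarily replacing `log_prior` with a function returning 0 (the "fake prior" from the question) and checking that the drift disappears.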



Asked: Dec 30 '10 at 13:33

Seen: 1,298 times

Last updated: Jan 02 '11 at 08:16

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.