Revision history[back]

I think the only reasonable approach is 4. 1, as you mentioned, would be too slow. For 2 and 3 you have no guarantee of convergence, or even that the approach is sensible. There are many sampling methods, and a good start to know some of their properties and when can you use them is the wikipedia page for Monte Carlo integration. Regardless of the specific method you should be able to bound the variance of the monte carlo estimate of the expected value you're after, and this is your main way of making sure that you have the right value.

Which approach to use depends a lot on what you know about the probability distribution over which you want to compute expectations. From strongest to weakest assumptions:

1. If you can sample from it in a reasonable amount of time, you can just use the empirical mean of the function applied to the samples as your estimate, and use the known formulas for the variance of this estimator to bound things until your confidence interval is narrow enough.

2. If you can't sample from the distribution but you can evaluate the (possibly unnormalized) likelihood of any given point and sample from a reasonably similar helper distribution of it you can use importance sampling, which is essentially the method I specified above but with the samples from the helper distribution weighted by their likelihood according to the actual distribution. This has some failure modes, but precise variance bounds can be stablished that will help you know when to stop sampling.

3. If you can only evaluate the (possibly unnormalized) likelihood of any given point but wouldn't dare claiming that any known distribution is a good approximation of the one you're after, you can use Markov chain Monte Carlo (MCMC) techniques to create a markov chain such that its stationary distribution is the one you're after; hence you can get samples from the distribution you want to sample from and use them to compute expectations. There are many methods to do this.

A good introductory reference on this subject is chapters 29 and 30 of David MacKay's Information Theory, Inference, and Learning Algorithms book, which you can find for free on the internet. You can download individual chapters here.

I think the only reasonable approach is 4. 1, as you mentioned, would be too slow. For 2 and 3 you have no guarantee of convergence, or even that the approach is sensible. There are many sampling methods, and a good start to know some of their properties and when can you use them is the wikipedia page for Monte Carlo integration. Regardless of the specific method you should be able to bound the variance of the monte carlo estimate of the expected value you're after, and this is your main way of making sure that you have the right value.

Which approach to use depends a lot on what you know about the probability distribution over which you want to compute expectations. From strongest to weakest assumptions:

1. If you can sample from it in a reasonable amount of time, you can just use the empirical mean of the function applied to the samples as your estimate, and use the known formulas for the variance of this estimator to bound things until your confidence interval is narrow enough.

2. If you can't sample from the distribution but you can evaluate the (possibly unnormalized) likelihood of any given point and sample from a reasonably similar helper distribution of it you can use importance sampling, which is essentially the method I specified above but with the samples from the helper distribution weighted by the ratio of their likelihood according to the actual distribution and their likelihood in the sampling distribution. This has some failure modes, but precise variance bounds can be stablished that will help you know when to stop sampling.

3. If you can only evaluate the (possibly unnormalized) likelihood of any given point but wouldn't dare claiming that any known distribution is a good approximation of the one you're after, you can use Markov chain Monte Carlo (MCMC) techniques to create a markov chain such that its stationary distribution is the one you're after; hence you can get samples from the distribution you want to sample from and use them to compute expectations. There are many methods to do this.

A good introductory reference on this subject is chapters 29 and 30 of David MacKay's Information Theory, Inference, and Learning Algorithms book, which you can find for free on the internet. You can download individual chapters here.

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.