I was wondering if anyone knew of any research on "fitting q to p" when p is known only up to a constant. More concretely, I would like to find min_{theta} KL(p(x)||q(x|theta)), with the ultimate goal of generating samples from q(x|theta) for importance sampling against p(x). Ideally, the method would be online instead of batch. After a little Googling, I've found that this is the problem of density estimation, though most of the related literature I've found deals with histograms and kernel densities, and my goal is to do this many times rather than optimizing once and for all. Does anyone know of something that sounds like this?
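For what it's worth, here is a minimal sketch (my own illustration, not a reference to any particular paper) of one way to attack exactly this: since grad_theta KL(p||q) = -E_p[grad_theta log q(x|theta)], you can estimate that expectation by self-normalized importance sampling from the current q, where the unknown normalizing constant of p cancels. For a Gaussian family the weighted update even has a closed form; this is essentially the cross-entropy method / adaptive importance sampling. The target p_tilde below is made up:

```python
import numpy as np

rng = np.random.default_rng(0)

def p_tilde(x):
    # Unnormalized target: a bimodal density known only up to a constant.
    return np.exp(-0.5 * (x - 3.0) ** 2) + 0.5 * np.exp(-0.5 * (x + 2.0) ** 2)

mu, sigma = 0.0, 5.0                    # start with a deliberately broad q
for _ in range(50):
    x = rng.normal(mu, sigma, size=2000)             # sample from current q
    log_q = -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma)
    log_w = np.log(p_tilde(x)) - log_q               # unnormalized log-weights
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                                     # self-normalize: p's constant cancels
    # Weighted Gaussian MLE = exact minimizer of the IS estimate of KL(p||q).
    mu = np.sum(w * x)
    sigma = np.sqrt(np.sum(w * (x - mu) ** 2)) + 1e-8

print(mu, sigma)   # q spreads to cover both modes (the mass-covering direction)
```

Each round resamples from the improved q, so the scheme is naturally iterative: the current fit is always the next proposal.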
Are you talking about KL(q||p) or KL(p||q)? The title says KL(p||q), but the body says KL(q||p), and your stated use suggests you really want KL(p||q): to importance-sample without degenerate weights, you want all the mass of p to be reasonably covered by q, which a low KL(p||q) encourages. A low KL(q||p) only ensures that samples from q have reasonably high probability under p; q can then miss whole regions where p has mass, and the rare samples landing there receive enormous importance weights, so most of the estimator's mass ends up on a few outliers and the variance is very high.
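To make that concrete, here is a small numerical check (my own illustration, with made-up densities): a narrow q sitting on one mode of a bimodal p scores well under KL(q||p) but has astronomically variable importance weights, while a broad q scores well under KL(p||q) and keeps the weights tame.

```python
import numpy as np

x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]

def normal(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

p = 0.5 * normal(x, -3, 1) + 0.5 * normal(x, 3, 1)   # bimodal target
q_narrow = normal(x, 3, 1)                           # covers only one mode
q_broad = normal(x, 0, 4)                            # covers both modes

def kl(a, b):
    mask = a > 0
    return np.sum(a[mask] * np.log(a[mask] / b[mask])) * dx

for name, q in [("narrow", q_narrow), ("broad", q_broad)]:
    w = p / q                                        # importance weights p/q
    var_w = np.sum(q * (w - 1.0) ** 2) * dx          # Var_q[w] (E_q[w] = 1)
    print(name, "KL(q||p)=%.2f" % kl(q, p), "KL(p||q)=%.2f" % kl(p, q),
          "weight variance=%.2e" % var_w)
```

The narrow proposal wins on KL(q||p) yet its weight variance explodes; the broad one wins on KL(p||q) and keeps the weights bounded, which is exactly the property importance sampling needs.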
Also, what do you mean by "online" in this setting? You mention knowing p up to a constant, so there are no data points to stream over. Do you mean updating theta incrementally, rather than re-optimizing q from scratch each time?
Thanks for noticing my mistake; KL(p||q) is the goal.
I mention "online" only because the actual goal is to run a particle filter, and I would like to "improve" q(x|theta) as time goes on. Take, for example, a child learning to play a video game: at first he may be unable to predict how the game AI will react, but over time his guesses improve. In the meantime, he must still play the game (and thus incur a penalty for bad choices). Being able to "learn" a distribution in that fashion would be ideal.
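In case it helps, a hedged sketch of that "learn while you filter" loop (everything here is hypothetical, the drifting target especially): at each step you must act with the current q(x|theta), but the same weighted particles give you a free stochastic-gradient step on KL(p_t||q), so the proposal improves as time goes on.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, log_sigma = 0.0, np.log(5.0)      # theta = (mu, log sigma); start broad
lr = 0.5

def log_p_tilde(x, t):
    # Hypothetical unnormalized per-step target (stand-in for a filtering
    # posterior); it drifts so the proposal has something to track.
    return -0.5 * (x - 3.0 * np.sin(0.1 * t)) ** 2

for t in range(200):
    sigma = np.exp(log_sigma)
    x = rng.normal(mu, sigma, size=500)              # particles from current q
    log_q = -0.5 * ((x - mu) / sigma) ** 2 - log_sigma
    log_w = log_p_tilde(x, t) - log_q
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                                     # normalized importance weights

    # ... the filter would use (x, w) for its estimate/decision at time t ...

    # One SGD step on the self-normalized estimate of KL(p_t||q), using
    # grad_theta KL = -E_p[grad_theta log q(x|theta)]:
    g_mu = np.sum(w * (x - mu))                      # sigma^2-preconditioned mean step
    g_ls = np.sum(w * (((x - mu) / sigma) ** 2 - 1.0))
    mu += lr * g_mu
    log_sigma += lr * g_ls

print(mu, np.exp(log_sigma))   # mu hovers near the target's current center
```

The sigma^2 preconditioning on the mean step is just a stability choice; with a richer family for q you would take plain gradient steps on theta instead of these closed-form-ish updates.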