|
Has anybody compared these two methods for inducing sparsity? I have designed two models and want to compare L1 regularization with variational Bayes using a Dirichlet distribution, but I am not clear on the difference between the two methods. Is there a paper that explains this? |
|
Strictly speaking, Bayesian methods will not give you sparsity unless you put nonzero prior mass on the event that your parameter equals zero (which the Dirichlet distribution does not do), or unless you use MAP estimation (which isn't really Bayesian). The two methods you describe will, in general, give different results, but it's impossible to say which will perform better without knowing anything about your model or your data. However, I don't really understand why you would expect any sparsity from a variational Bayes method using Dirichlet distributions.
I read the paper "Bayesian Learning of Non-compositional Phrases with Synchronous Parsing"; it uses variational Bayes with a Dirichlet prior distribution.
(Feb 26 '12 at 08:24)
lizhonghua
|
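To make the answer's point about sparsity concrete, here is a minimal Python sketch. It is not taken from the thread or from the cited paper. It assumes a toy setting in which the L1 estimate reduces to soft-thresholding (one weight per observation) and in which the Dirichlet is a symmetric prior over multinomial probabilities. The function names, `lam`, `alpha`, and all input values are hypothetical and chosen only for illustration.

```python
def soft_threshold(y, lam):
    """Minimizer of 0.5*(w - y)**2 + lam*abs(w), i.e. the L1-penalized estimate."""
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0  # any |y| <= lam is mapped to exactly zero


def dirichlet_posterior_mean(counts, alpha):
    """Posterior mean of multinomial probabilities under a symmetric Dirichlet(alpha) prior."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]


# Hypothetical toy inputs (assumptions, not data from the thread).
noisy_weights = [2.5, -0.3, 0.1, -1.8]
l1_estimate = [soft_threshold(y, lam=0.5) for y in noisy_weights]

counts = [40, 0, 0, 10]
posterior_mean = dirichlet_posterior_mean(counts, alpha=0.01)

print("L1 estimate:", l1_estimate)
print("Dirichlet posterior mean:", posterior_mean)
```

In this sketch the L1 estimate contains exact zeros for the small inputs, while the Dirichlet posterior mean stays strictly positive even for categories with zero counts. A small `alpha` pushes those entries close to zero but never onto it, which is the sense in which the Dirichlet puts no prior mass on a parameter being exactly zero.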