We are working with some sparse matrices, and with the help of some very smart people we have already figured out that a mixture of Gaussians is apparently not the best model for fitting a dataset with sparse features.

Does anyone have good insight into the reasons for this? I found a paper where they put some constraints on the covariances so that they can model sparse features.

Regards

asked Nov 20 '12 at 04:37

Leon Palafox ♦


3 Answers:

This is speculation, but I assume that sparse data is likely to have a more peaked distribution than a Gaussian because there are so many zeros. It might look more like a Laplace distribution. To approximate a distribution with excess kurtosis using Gaussians, you would need to stack several Gaussians with the same mean but different variances, ranging from very narrow to wide. This is not very efficient because the number of Gaussians you actually need might be very large. You could try to fit a Student-t or Laplace distribution to see if it explains the data better than a Gaussian. Of course, these models are less straightforward to estimate.
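To make the kurtosis point concrete, here is a minimal sketch (not taken from any of the papers mentioned in this thread) that compares the sample excess kurtosis of a Gaussian, a Laplace, and a zero-mean mixture of Gaussians with different variances. The standard deviations and weights of the mixture are arbitrary values chosen for illustration, and `excess_kurtosis` and `sample_scale_mixture` are hypothetical helper names.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis: about 0 for a Gaussian, about 3 for a Laplace."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return np.mean(centered**4) / np.mean(centered**2) ** 2 - 3.0

def sample_scale_mixture(rng, n, sigmas, weights):
    """Draw n samples from a mixture of zero-mean Gaussians with the given std devs."""
    sigmas = np.asarray(sigmas, dtype=float)
    # Pick a component for every sample, then draw from that component.
    component = rng.choice(len(sigmas), size=n, p=weights)
    return rng.normal(0.0, sigmas[component])

rng = np.random.default_rng(0)
n = 200_000

gaussian = rng.normal(0.0, 1.0, size=n)
laplace = rng.laplace(0.0, 1.0, size=n)
# Assumed, illustrative mixture: most of the mass sits in a very narrow component.
mixture = sample_scale_mixture(rng, n, sigmas=[0.1, 1.0, 3.0], weights=[0.6, 0.3, 0.1])

for name, samples in [("gaussian", gaussian), ("laplace", laplace), ("mixture", mixture)]:
    print(f"{name:>8}: excess kurtosis = {excess_kurtosis(samples):.2f}")
```

A single Gaussian stays near zero excess kurtosis, while the same-mean mixture becomes strongly peaked. The catch is that every extra component adds parameters to estimate, which is the inefficiency described above.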

I don't know about the constrained covariances and that paper sounds interesting.

answered Nov 20 '12 at 07:10

Philemon Brakel

It is in the related work section of this paper: http://www.cs.cmu.edu/~akshaykr/files/sgmm_paper.pdf

In that thesis, they focus on restricting the covariances (because they will be non-singular): http://www.cs.colostate.edu/~dane/papers/thesis.pdf

(Nov 20 '12 at 22:58) Leon Palafox ♦

The issue is not the covariance between the features. The issue is that the individual features are not Gaussian distributed, because their PDF has a big spike at 0. You should read about spike-and-slab distributions, which can be constructed by taking the product of a Bernoulli variable and a Gaussian variable. When the Bernoulli is 0, the product of the two is 0, and this captures the sparsity of the data. It sounds like what you want is a mixture of spike-and-slab models instead of a mixture of Gaussians.
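As a rough illustration of that construction, here is a small sketch that samples from a mixture of spike-and-slab components, where each feature is a Bernoulli gate times a Gaussian. This only covers sampling, not fitting. The function name `sample_spike_and_slab_mixture` and all parameter values (mixture weights, activation probabilities, slab means and standard deviations) are made up for the example.

```python
import numpy as np

def sample_spike_and_slab_mixture(rng, n, weights, p_active, mu, sigma):
    """Sample n rows from a mixture of spike-and-slab components.

    weights: shape (K,), mixture weights summing to 1.
    p_active, mu, sigma: shape (K, D), per-component, per-feature parameters.
    Returns the component index of each row and the sampled data of shape (n, D).
    """
    k = rng.choice(len(weights), size=n, p=weights)
    # Bernoulli gate: True with probability p_active for that component and feature.
    gate = rng.random((n, p_active.shape[1])) < p_active[k]
    # Gaussian slab, drawn for every entry and kept only where the gate is on.
    slab = rng.normal(mu[k], sigma[k])
    return k, np.where(gate, slab, 0.0)

rng = np.random.default_rng(0)

# Assumed, illustrative parameters: K = 2 components, D = 3 features.
weights = np.array([0.5, 0.5])
p_active = np.array([[0.1, 0.1, 0.8],
                     [0.8, 0.1, 0.1]])
mu = np.array([[1.0, 2.0, 3.0],
               [-1.0, 0.5, 2.0]])
sigma = np.full((2, 3), 0.5)

labels, x = sample_spike_and_slab_mixture(rng, 50_000, weights, p_active, mu, sigma)

# With these parameters the expected share of exact zeros is 1 - mean(p_active).
expected_zero_fraction = 1.0 - np.sum(weights[:, None] * p_active) / p_active.shape[1]
print("zero fraction:", np.mean(x == 0.0), "expected:", expected_zero_fraction)
```

The zeros here are exact, which a Gaussian component with nonzero variance cannot produce. That is why the spike has to be modeled separately from the slab.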

answered Nov 21 '12 at 11:37

Ian Goodfellow

If your data looks like count data, I recommend checking out this paper:

On Tensors, Sparsity, and Nonnegative Factorizations

answered Nov 26 '12 at 13:44

Art Munson


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.