What I want to do is discover latent clusters in my data. I don't think it matters whether the factors are positive or not for this. In this case is there any advantage in doing NMF over plain old PCA? They're both linear, as far as I understand.

asked Feb 03 '14 at 21:41

digdug


One Answer:

It sounds like your data contains both positive and negative values, given that you "don't think it matters whether the factors are positive or not". If so, you should not use NMF: it decomposes an element-wise non-negative matrix, and it requires not only the factors but also the coefficients to be non-negative.

In general, "plain" PCA and NMF can both be interpreted as maximum likelihood estimates under a certain likelihood model. PCA has an underlying Gaussian assumption, while NMF can have a Gaussian, Poisson, or Gamma assumption, depending on how the loss function is defined (squared error, generalized KL divergence, or Itakura-Saito divergence, respectively), with the extra constraints that both factors and coefficients are non-negative. Keeping this in mind may give you more insight into which one to choose. Nevertheless, it's computationally cheap to try both on a reasonably big data set. So if you don't care about the details, just run both methods and see which one suits your particular application better.
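To make "just run both" concrete, here is a minimal NumPy sketch. Everything in it is an assumption for illustration: the toy data, the helper names `pca` and `nmf`, and the choice of Lee-Seung multiplicative updates for the squared-error (Gaussian) version of NMF. It is not the only way to fit either model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 samples, each a non-negative weighted sum
# of 2 non-negative factors, plus a little non-negative noise.
true_factors = rng.random((2, 10))
true_weights = rng.random((200, 2))
X = true_weights @ true_factors + 0.01 * rng.random((200, 10))


def pca(X, k):
    """Rank-k PCA via SVD of the mean-centered data."""
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = U[:, :k] * s[:k]   # coefficients, can be negative
    components = Vt[:k]         # orthonormal factors, can be negative
    return scores, components, mean


def nmf(X, k, n_iter=500, eps=1e-10, seed=0):
    """Rank-k NMF for squared error using multiplicative updates.

    X must be element-wise non-negative. W (coefficients) and H (factors)
    start non-negative and stay non-negative because each update only
    multiplies by a non-negative ratio.
    """
    init = np.random.default_rng(seed)
    W = init.random((X.shape[0], k))
    H = init.random((k, X.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


scores, components, mean = pca(X, 2)
W, H = nmf(X, 2)

pca_err = np.linalg.norm(X - (scores @ components + mean))
nmf_err = np.linalg.norm(X - W @ H)

print("PCA reconstruction error:", pca_err)
print("NMF reconstruction error:", nmf_err)
print("smallest PCA coefficient:", scores.min())
print("smallest NMF coefficient:", W.min())
```

Reconstruction error alone will favor PCA, since it has no sign constraints; the more useful comparison for clustering is usually whether the factors and coefficients are interpretable for your data.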

answered Feb 04 '14 at 02:15

Dawen Liang

So it sounds like NMF is more suited for counts (which can't be negative), whereas PCA is more general?

(Feb 04 '14 at 04:32) digdug

Yes: counts, energy, anything non-negative by nature. NMF is good at decomposing data with an additive structure, e.g. data that comes from (non-negative) weighted sums of factor 1 and factor 2.

PCA, on the other hand, only changes the coordinate system to a new one formed by the eigenvectors of the covariance matrix.
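A short NumPy sketch of that change of coordinates, assuming made-up 2-D data (the mixing matrix and variable names are hypothetical). The data is centered, the covariance matrix is eigendecomposed, and the points are re-expressed in the eigenvector basis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical correlated 2-D data with both positive and negative values.
X = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.0], [1.2, 0.5]])

# Center the data and form the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# eigh handles symmetric matrices and returns eigenvalues in ascending
# order, so reorder to put the largest-variance direction first.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# The same points, expressed in the eigenvector coordinate system.
Z = Xc @ eigvecs

# The new axes are orthonormal, so distances from the mean are unchanged,
# and the covariance of the new coordinates is diagonal.
print("covariance in new coordinates:")
print(np.cov(Z, rowvar=False))
```

Nothing here constrains signs: both the eigenvectors and the new coordinates `Z` can be negative, which is the contrast with NMF's additive, non-negative parts.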

(Feb 04 '14 at 12:57) Dawen Liang

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.