Revision history

Revision 1
Feb 27 '11 at 14:31
Oleg Trott

Why does K-means generate such good features, especially compared to GMM?

A paper by Coates, Lee & Ng presented at NIPS 2010 compared different machine learning approaches within the same convolution-based image classification framework.

I think, to many people's surprise, they found that if whitening is applied to the data, K-means(tri), a "soft" K-means variant, generates better features for classification than such sophisticated approaches as deep autoencoders and stacked RBMs.

[Figure 3 from the paper]
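
If I understand the paper correctly, kmeans(tri) maps each (whitened) patch x to K features f_k(x) = max(0, mu(z) - z_k), where z_k = ||x - c_k|| is the distance from x to centroid c_k and mu(z) is the mean of those distances. A minimal NumPy sketch of that mapping (the function and variable names are mine, not the authors'):

    import numpy as np

    def kmeans_tri_features(X, centroids):
        """'Triangle' encoding: f_k(x) = max(0, mean(z) - z_k), z_k = ||x - c_k||_2.

        X is (n_patches, d) whitened patches; centroids is (K, d) from K-means.
        A sketch of my reading of the paper, not the authors' code.
        """
        # Euclidean distance from every patch to every centroid -> (n_patches, K)
        z = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        mu = z.mean(axis=1, keepdims=True)   # mean distance per patch
        # Centroids farther than average contribute 0; closer ones grow linearly
        return np.maximum(0.0, mu - z)

Since about half of the distances z_k lie above their mean, roughly half of the K activations come out as zero, so the encoding is sparse but still graded rather than one-of-K.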

I couldn't help noticing that K-means (both "hard" and "soft") did better than Gaussian Mixture Models (GMMs), even though, computational efficiency aside, GMMs are often thought of as a "better", more robust K-means.

GMMs are actually quite similar to "soft" K-means, because cluster membership is "fuzzy", i.e. a matter of degree. So it is especially strange that, with whitening, GMMs produced the worst results while "soft" K-means produced the best.
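
For comparison, the natural GMM counterpart of this soft encoding would be to use the posterior responsibilities p(k | x) as the K features. A sketch using scikit-learn's GaussianMixture (my choice of tooling and of diagonal covariances; I don't know exactly how the paper's GMM was set up):

    from sklearn.mixture import GaussianMixture

    def gmm_posterior_features(X, n_components, seed=0):
        """Encode each row of X by its posterior responsibilities p(k | x) under
        a GMM with diagonal covariances. One plausible reading of the GMM
        baseline, not necessarily the authors' exact model."""
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type='diag',
                              random_state=seed).fit(X)
        return gmm.predict_proba(X)   # (n_samples, n_components); rows sum to 1

One visible difference from the triangle encoding: these rows are normalized to sum to 1, so well-separated components push the code toward one-hot, whereas kmeans(tri) leaves many centroids with graded, unnormalized activations.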

Questions:

  1. Why would GMMs do so poorly compared to K-means?

  2. Has anyone tried reproducing these results? (I hope I'm not being rude to the authors, but bugs happen, and the more surprising the results, the higher the posterior probability of a mistake. A poor GMM implementation, say, could explain some of my surprise.)

  3. Has anyone tried applying K-means to other problems where deep learning has ruled so far, such as speech recognition?

  4. I wonder what happens if you stack several layers of whitening + kmeans(tri). It seems like a natural thing for Ng's group to try, but this paper doesn't mention it (see the sketch just below).
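
To make question 4 concrete, here is roughly what I have in mind: treat the first layer's kmeans(tri) features as the input to a second round of whitening + K-means. This is only a sketch of the idea, not the paper's pipeline: it skips the convolutional patch extraction and pooling a real second layer would need, and the eps regularizer, layer sizes and toy data are my own assumptions.

    import numpy as np
    from sklearn.cluster import KMeans

    def whiten_kmeans_tri_layer(X, n_centroids, eps=0.01, seed=0):
        """One 'layer': ZCA-whiten the rows of X, learn K-means centroids,
        and return the triangle-encoded features (as in the sketch above)."""
        # ZCA whitening: rotate, rescale by 1/sqrt(eigenvalue + eps), rotate back
        X = X - X.mean(axis=0)
        vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
        X = X @ (vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T)
        # Learn the dictionary with plain K-means
        km = KMeans(n_clusters=n_centroids, n_init=10, random_state=seed).fit(X)
        # Triangle encoding of the whitened data against the learned centroids
        z = np.linalg.norm(X[:, None, :] - km.cluster_centers_[None, :, :], axis=2)
        return np.maximum(0.0, z.mean(axis=1, keepdims=True) - z)

    # Toy two-layer stack on random stand-ins for 8x8 grayscale patches
    rng = np.random.default_rng(0)
    patches = rng.normal(size=(2000, 64))
    layer1 = whiten_kmeans_tri_layer(patches, n_centroids=100)
    layer2 = whiten_kmeans_tri_layer(layer1, n_centroids=50)
    print(layer2.shape)   # (2000, 50)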

Revision 2
Mar 01 '11 at 21:47
Oleg Trott

Edit: to clarify the nomenclature:

  • "soft" K-means == kmeans (tri)
  • "hard" K-means == kmeans (hard)
