Revision history
Revision n. 1

Feb 28 '11 at 10:07

Oscar Täckström

My guess at point 1: GMMs have many more parameters than soft k-means, since you also need to estimate covariance matrices. In fact, soft k-means is a special case of a GMM in which you assume a fixed, tied covariance matrix. Because the likelihood function is more complex, a naively initialized GMM is more prone to getting stuck in bad local minima than soft k-means.

Further, looking at the curves, the more clusters you use, the better the classification. I would guess that with that many clusters, modeling the shape of the clusters becomes less important, since you can always capture high-complexity parts of the distribution by using lots of simple clusters.

I don't have any intuition as to why hard k-means would perform so much worse than soft k-means.

Revision n. 2

Feb 28 '11 at 10:10

Oscar Täckström

My guess at point 1: GMMs have many more parameters than soft k-means, since you also need to estimate covariance matrices. In fact, soft k-means is a special case of a GMM in which you assume a fixed, tied covariance matrix. Because the likelihood function is more complex, a naively initialized GMM is more prone to getting stuck in bad local minima than soft k-means.

Further, looking at the curves, the more clusters you use, the better the classification. I would guess that with that many clusters, modeling the shape of the clusters becomes less important, since you can always capture high-complexity parts of the distribution by using lots of simple clusters. Since you don't need to be able to interpret the results, there is little loss in adding a couple of extra clusters rather than adequately modeling the shape of one cluster.

I don't have any intuition as to why hard k-means would perform so much worse than soft k-means.

Revision n. 3

Feb 28 '11 at 10:16

Oscar Täckström

My guess at point 1: GMMs have many more parameters than soft k-means, since you also need to estimate covariance matrices. In fact, soft k-means is a special case of a GMM in which you assume a fixed, tied covariance matrix. Because the likelihood function is more complex, a naively initialized GMM is more prone to getting stuck in bad local maxima than soft k-means.
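The special-case claim above can be illustrated with a small NumPy sketch (toy data invented for illustration): with equal mixing weights and a shared spherical covariance σ²I, the Gaussian normalizing constants are identical across components and cancel in the posterior, so the GMM E-step responsibilities coincide with soft k-means responsibilities at stiffness β = 1/(2σ²).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))    # a few 2-D data points (toy data)
mu = rng.normal(size=(3, 2))   # 3 cluster centers
sigma2 = 0.5                   # shared (tied) spherical variance

# Squared distance from each point to each center: shape (5, 3).
d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)

# GMM E-step posteriors, equal mixing weights, covariance sigma2 * I.
# The per-component constant (2*pi*sigma2)^(-d/2) is the same for every
# component, so it cancels after normalization and can be dropped.
post = np.exp(-d2 / (2 * sigma2))
post /= post.sum(axis=1, keepdims=True)

# Soft k-means responsibilities with stiffness beta = 1 / (2 * sigma2).
beta = 1.0 / (2 * sigma2)
resp = np.exp(-beta * d2)
resp /= resp.sum(axis=1, keepdims=True)

assert np.allclose(post, resp)  # identical assignments
```

Once the covariances are free per-component parameters, the cancellation no longer happens, which is where the extra GMM parameters (and the more complex likelihood surface) come from.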

Further, looking at the curves, the more clusters you use, the better the classification. I would guess that with that many clusters, modeling the shape of the clusters becomes less important, since you can always capture high-complexity parts of the distribution by using lots of simple clusters. Since you don't need to be able to interpret the results, there is little loss in adding a couple of extra clusters rather than adequately modeling the shape of one cluster.

I don't have any intuition as to why hard k-means would perform so much worse than soft k-means.

powered by OSQA

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.