Is it fair to compare two clustering algorithms when the number of representatives is not the same in the result? I know there are evaluation measures that can be used regardless of the number of representatives (clusters), such as the v-measure, but I'm asking here about comparing: the error rate (i.e. the number of data points from the test dataset that were not correctly assigned), and the recall and precision computed from a CxC confusion matrix.
No. With these basic measures, as the number of clusters grows your confusion matrices look artificially cleaner.

So in this case I should compare the algorithms using the v-measure (completeness and homogeneity), right?
(Mar 07 '12 at 11:05)
shn
That's only a partial solution to the problem. You should either avoid comparing across numbers of clusters or use a more downstream metric that is robust to the number of clusters (for k-means, for example, you can look at the distance between held-out points, which were not in your training set, and the closest clusters).
(Mar 07 '12 at 11:07)
Alexandre Passos ♦
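A minimal sketch of that held-out-distance idea (the function name and Euclidean distance are my assumptions, not from the thread; it presumes each algorithm has already produced a list of centroids):

```python
import math

def mean_heldout_distance(test_points, centroids):
    """Average distance from each held-out point to its closest centroid.

    Lower is better. Note it still tends to shrink as the number of
    centroids grows, so compare across very different K with care.
    """
    def dist(p, q):
        # Plain Euclidean distance between two points of equal dimension.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    return sum(min(dist(p, c) for c in centroids)
               for p in test_points) / len(test_points)
```

Because the score is computed per held-out point against the nearest centroid only, it does not require the two algorithms to produce the same number of representatives.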
Looking at "the distance between points from the test set and the closest clusters" is not as effective as a supervised measure, is it? But I think there is no supervised evaluation measure that is robust to the number of clusters...
(Mar 07 '12 at 11:47)
shn
You can use an adjusted-for-chance metric such as the Adjusted Rand Index or Adjusted Mutual Information: those metrics give you an "agreement" score between two clusterings without falling into the "more clusters is always better" fallacy. The adjustment for chance means that comparing a clustering against a random cluster assignment gives an expected value of 0.0, whatever the number of clusters used in either clustering.
(Mar 07 '12 at 12:19)
ogrisel
@ogrisel I looked at the formulas for computing the ARI and AMI measures (on Wikipedia), but they are a little confusing. Say I want to compare 4 clustering algorithms using these measures: can I compute them for each clustering algorithm separately, or do I need to take the representatives produced by all 4 algorithms and compare them two by two? Given a test dataset and a clustering algorithm that produced a set of K representatives, is it possible to compute the ARI and AMI measures for that algorithm alone? Could you provide simple pseudo-code for computing ARI and AMI, since the formulas are a little confusing? Thanks.
(Mar 09 '12 at 03:54)
shn
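To illustrate ogrisel's point, here is a self-contained Python sketch of the Adjusted Rand Index computed from the contingency table between two label assignments (the function name is mine, not from any library; AMI is built on the same contingency table but uses entropy terms instead of pair counts):

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """ARI between two flat cluster assignments over the same points.

    Works for any numbers of clusters in each assignment; a random
    assignment scores ~0.0 in expectation, identical partitions score 1.0.
    """
    n = len(labels_a)
    # Contingency counts: how many points land in cluster i of A and j of B.
    pair_counts = Counter(zip(labels_a, labels_b))
    row_counts = Counter(labels_a)   # marginal sizes of A's clusters
    col_counts = Counter(labels_b)   # marginal sizes of B's clusters

    index = sum(comb(c, 2) for c in pair_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_counts.values())
    sum_cols = sum(comb(c, 2) for c in col_counts.values())

    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:        # degenerate case, e.g. one cluster each
        return 1.0
    return (index - expected) / (max_index - expected)
```

For example, `adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])` is 1.0: the ARI compares the partitions themselves, so it is invariant to how the cluster labels are named.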
@Shna I wrote up an AMI scorer in Scala that you can take a look at. It's up as a Gist on GitHub here: https://gist.github.com/2007548. The input format for the clustering assignments is probably different from what you want, but the key method to use is the ami method. To use it, you just have to compute the contingency matrix between the clusters from one clustering algorithm and those from another: element A(i)(j) records the number of data points assigned to cluster C_i by algorithm X and to cluster C_j by algorithm Y. A does not need to be square, so it is designed to work when the two algorithms create different numbers of clusters. One thing to note: if either algorithm X or Y creates only one cluster, the AMI will be 0.
(Mar 09 '12 at 12:08)
Keith Stevens
@KeithStevens @ogrisel I'll take a look at AMI, but another question: would it be fair to re-cluster the final representatives produced by each algorithm into the same number of centers (K) using k-means before comparing them with the F-measure, etc.?
(Mar 14 '12 at 09:21)
shn