So my data looks like this:
1a, 1a, 1a, 1b, 1b, 1c
2a, 2a, 2a, 2b, 2c
where I know the 'ground truth' class number for each item, but not the class letter (the subclass),
nor how many letters there are within each class.

I would like to incorporate the ground truth information I have into my clustering performance analysis.

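The only metric I know I can use at the moment is cluster homogeneity, since it only checks that each cluster contains members of a single class and does not penalize a class for being split across several clusters (which is what I expect if the subclasses cluster separately).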
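To illustrate what I mean, here is a minimal sketch in plain Python. The cluster assignments are hypothetical (an imagined clustering that happens to recover every subclass), and the `homogeneity` and `completeness` helpers are hand-written from the usual entropy-based definitions rather than taken from any library, so treat them as an assumption about how the metrics behave, not a reference implementation.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels), in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(targets, given):
    """Conditional entropy H(targets | given), in nats."""
    n = len(targets)
    joint = Counter(zip(given, targets))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def homogeneity(classes, clusters):
    """1 - H(class | cluster) / H(class); 1.0 when every cluster holds one class."""
    h_c = entropy(classes)
    return 1.0 if h_c == 0 else 1.0 - conditional_entropy(classes, clusters) / h_c


def completeness(classes, clusters):
    """1 - H(cluster | class) / H(cluster); 1.0 when every class sits in one cluster."""
    h_k = entropy(clusters)
    return 1.0 if h_k == 0 else 1.0 - conditional_entropy(clusters, classes) / h_k


# Known class numbers for the 11 items in my example (letters are unknown).
classes = [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
# Hypothetical clustering that separates every subclass: 1a, 1b, 1c, 2a, 2b, 2c.
clusters = [0, 0, 0, 1, 1, 2, 3, 3, 3, 4, 5]

print("homogeneity: ", homogeneity(classes, clusters))
print("completeness:", completeness(classes, clusters))
```

With this clustering, homogeneity comes out at 1.0 because no cluster mixes class numbers, while completeness drops below 1 purely because each class number is spread over several clusters. That penalty is the problem I run into with completeness (and, as far as I can tell, with ARI and NMI as well).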

Could anyone suggest either a workaround that would let me use ARI, NMI, completeness, etc., or another metric?

asked Oct 14 at 13:19

Josh Box

