Are there any efficient internal evaluation measures for clustering (i.e. without ground truth) that can be computed incrementally each time new data are clustered, that is, without redoing the entire evaluation to account for the newly clustered data? Can such a measure be derived from existing internal evaluation measures?

asked Feb 04 at 17:03


Shna


One Answer:

I think it depends on the clustering algorithm you're using. Some natural ideas, each computed over the K most recent points before they are added to the clustering: (1) the sum of their distances to the closest cluster; (2) the maximum of their distances to the closest cluster; and so on. If you are more specific about which clustering model you're using, I'm sure there will be some natural quality measure.
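As a rough illustration of ideas (1) and (2), here is a minimal Python sketch. It assumes clusters are represented by centroids and that "distance to the closest cluster" means Euclidean distance to the nearest centroid; neither is stated above. The names `RecentWindowQuality`, `observe`, `sum_distance` and `max_distance` are made up for this example.

```python
import math
from collections import deque


def nearest_center_distance(point, centers):
    """Euclidean distance from `point` to the closest center (assumed metric)."""
    return min(math.dist(point, c) for c in centers)


class RecentWindowQuality:
    """Keeps the distances of the K most recent points to their nearest
    center, each measured before the point was added to the clustering."""

    def __init__(self, k):
        # A bounded deque drops the oldest distance automatically.
        self.window = deque(maxlen=k)

    def observe(self, point, centers):
        # Call this BEFORE updating the clustering with `point`.
        # `centers` is whatever the current set of centroids is, so the
        # number of centers is free to change between calls.
        d = nearest_center_distance(point, centers)
        self.window.append(d)
        return d

    def sum_distance(self):
        # Idea (1): sum of nearest-center distances over the window.
        return sum(self.window)

    def max_distance(self):
        # Idea (2): largest nearest-center distance over the window.
        return max(self.window) if self.window else 0.0
```

Each new point costs one pass over the current centers plus a pass over a window of size K, so nothing already clustered has to be re-evaluated. A rising sum or max would suggest that recent points are fitting the existing clusters less well.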

answered Feb 04 at 18:06


Alexandre Passos ♦

I'm using a simple online k-means where the number of centers is not fixed (it may increase as new data are clustered).

(Mar 02 at 10:55) Shna

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.