|
Are there any efficient internal evaluation measures for clustering (i.e. without ground truth) that can be computed incrementally each time new data are clustered, without redoing the entire evaluation to account for the newly clustered data? Can such a measure be derived from existing internal evaluation measures? |
|
I think it depends on the clustering algorithm you're using. Some natural ideas are: (1) the sum, over the K most recent points, of each point's distance to its closest cluster, measured before the point is added to the clustering; (2) the maximum of those same distances over the K most recent points; and so on. If you are more specific about which clustering model you're using, I'm sure there will be some natural quality measure.
Using a simple online k-means where the number of centers is not fixed (it may increase as new data are clustered).
(Mar 02 at 10:55)
Shna
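As a rough illustration of the two window measures suggested in the answer, here is a minimal sketch in Python. The class name `RecentDistanceMonitor` and the parameter `window_size` (the K of the answer, not the number of centers) are hypothetical, and Euclidean distance is an assumption. The sketch is independent of how the online k-means updates or adds centers: it only needs the current centers at the moment a point arrives.

```python
from collections import deque
import math


class RecentDistanceMonitor:
    """Hypothetical helper: tracks, for the K most recent points, the distance
    from each point to its nearest center, measured before the point is added."""

    def __init__(self, window_size):
        # A bounded deque drops the oldest distance automatically when full.
        self.distances = deque(maxlen=window_size)
        self.total = 0.0

    def observe(self, point, centers):
        # Call this BEFORE assigning the point or moving/adding any center.
        d = min(math.dist(point, c) for c in centers)  # Euclidean (assumed)
        if len(self.distances) == self.distances.maxlen:
            self.total -= self.distances[0]  # remove the oldest term in O(1)
        self.distances.append(d)
        self.total += d
        return d

    def window_sum(self):
        # Measure (1): sum of nearest-center distances over the window.
        return self.total

    def window_max(self):
        # Measure (2): largest nearest-center distance over the window.
        # This scans the window, so it costs O(K) per call.
        return max(self.distances) if self.distances else 0.0


# Usage with two fixed centers and a window of K = 2 points.
centers = [(0.0, 0.0), (10.0, 0.0)]
monitor = RecentDistanceMonitor(window_size=2)
for p in [(3.0, 4.0), (10.0, 3.0), (0.0, 1.0)]:
    monitor.observe(p, centers)
    # ...the online k-means update for p would go here...
print(monitor.window_sum(), monitor.window_max())
```

The running sum is updated in constant time per point, so nothing is recomputed over previously clustered data. With an unbounded stream the floating-point running total can drift slightly, so recomputing it from the window now and then is a reasonable precaution.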
|