Hello to everyone

I'm new at using the Mean Shift Clustering (MSC) algorithm. I am running the Matlab implementation available at http://www.mathworks.com/matlabcentral/fileexchange/10161-mean-shift-clustering

I am currently clustering 2000 objects on a 2-dimensional feature space. The thing is that depending on the value of the bandwith parameter, I encounter convergence problems. In fact, if I increase the bandwidth, the algorithm gets stuck and (apparently) would keep on running forever.

Regardless of the implementation, my question is whether MSC is prone to this type of convergence problems. Any comments are welcome.

Xavier

asked Oct 31 '11 at 11:08

Xavier's gravatar image

Xavier
1112

Are you sure it's a convergence issue? afaik mean shift is guaranteed to converge. Your problem is very small so there should really be no problem. I would expect that if you set the bandwidth way to high, you'd just get one big cluster centered on the mean.

(Oct 31 '11 at 11:36) Andreas Mueller

Hello Andreas

thanks for your quick response.

I agree with you that MSC should have no problems in converging for my problem. But I don't know what other thing other than a convergence issue could be happening. Let me give you some insight into my problem.

The two dimensions of my feature space are the latitude and longitude of 2000 locations worldwide, and I my intention is to create clusters of a certain extension (the MSC's bandwidth).

If I cluster them using a 150 kilometers bandwidth, MSC runs in around 30 seconds, and I obtain what I expect.

However, if I use a 200 km bandwidth instead (there shouldn't be a big difference in the clustering results, as my locations are spread worldwide), the algorithm keeps running for hours. I stopped it when it had been running for 11 hours (!). I've tried it more than once and I always get the same behaviour.

I am not able to see why such a small variation in the bandwidth value can lead to such a big change in performance. Maybe it has to do with the implementation I use, I don't know, but before blaming it I wanted to know whether MSC can show this type of behaviour.

Thanks!

Xavier

(Oct 31 '11 at 13:20) Xavier

One Answer:

Try printing norm(myMean-myOldMean) every iteration to see the convergence progress .

By the way, Meanshift implementation that you are using is a brute-force one, and it is not very efficient. It could be that it is just taking too long to evaluate. AFAIK gradient descent based meanshift implementations are much faster.

answered Nov 01 '11 at 14:48

Dmitry%20Chichkov's gravatar image

Dmitry Chichkov
1653614

Your answer
toggle preview

powered by OSQA

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.