|
When we use a Dirichlet process (DP) for a mixture of distributions, we often end up with clusters that have very few elements (1 or 2) out of a population of 4000. My usual approach is to discard those clusters and assign their elements to the nearest remaining cluster. Is there a more formal way to deal with these clusters? I'm guessing some kind of heuristic or model selection on the overall number of clusters. Thanks |
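Samples from the Dirichlet process will in general look like that, with some number of very small clusters. This has to happen because of exchangeability: every new cluster begins as a singleton, and any observation can be treated as if it were the last one drawn. So if very small clusters only occurred with very small probability, then starting a new cluster would have to be very unlikely as well.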
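To see this for yourself, here is a minimal simulation sketch of the Chinese restaurant process, which is the clustering a DP induces. The function name `sample_crp`, the concentration value `alpha=2.0`, and the seed are my own choices for illustration; only the population size of 4000 comes from the question.

```python
import numpy as np

def sample_crp(n, alpha, rng):
    """Draw cluster sizes for n points from a Chinese restaurant process."""
    sizes = []  # sizes[k] = number of points currently in cluster k
    for _ in range(n):
        # An existing cluster k has weight sizes[k]; a new cluster has weight alpha.
        weights = np.array(sizes + [alpha], dtype=float)
        k = rng.choice(len(weights), p=weights / weights.sum())
        if k == len(sizes):
            sizes.append(1)   # start a new cluster as a singleton
        else:
            sizes[k] += 1     # join an existing cluster
    return np.array(sizes)

rng = np.random.default_rng(0)          # seed chosen arbitrarily
sizes = sample_crp(4000, alpha=2.0, rng=rng)
n_small = int((sizes <= 2).sum())
print(len(sizes), "clusters,", n_small, "with at most 2 points")
```

Rerunning with different seeds or values of `alpha` should show that a handful of tiny clusters is a typical feature of the prior rather than a sign that the fit went wrong.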
For practical purposes, though, yeah, getting rid of those might be worth it.
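If you do go the "get rid of them" route, the reassignment described in the question could be sketched roughly like this. Note the assumptions: `merge_small_clusters` and `min_size` are hypothetical names, and "nearest" is taken to mean squared Euclidean distance to the centroid of each remaining cluster. With a fitted mixture you might prefer to reassign by posterior responsibility under the remaining components instead.

```python
import numpy as np

def merge_small_clusters(X, labels, min_size=3):
    """Reassign points in clusters smaller than min_size to the nearest large cluster."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels).copy()
    ids, counts = np.unique(labels, return_counts=True)
    large = ids[counts >= min_size]
    if len(large) == 0:
        return labels  # no cluster is big enough to merge into
    # Centroid of each cluster we keep (assumption: Euclidean geometry is meaningful).
    centroids = np.stack([X[labels == k].mean(axis=0) for k in large])
    small_mask = ~np.isin(labels, large)
    # Squared distance from each small-cluster point to each kept centroid.
    d = ((X[small_mask][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels[small_mask] = large[d.argmin(axis=1)]
    return labels

# Toy data: two clusters of three points plus one singleton near the second cluster.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
              [4.8, 4.9]])
labels = [0, 0, 0, 1, 1, 1, 2]
merged = merge_small_clusters(X, labels, min_size=3)
print(merged.tolist())  # [0, 0, 0, 1, 1, 1, 1]
```

The threshold `min_size` is a judgment call; this is a post-processing heuristic, not part of the DP model itself.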