|
I wish to perform a fast (possibly approximate) soft-assign of high-dimensional data points to a large Gaussian mixture with diagonal covariance (this is related to the E step of the EM algorithm). Assume that the mixture is given by ($w_k$, $mu_k$, $sigma_k$), that the number of mixture components is K (say K=1000) and that the data dimensionality is D (say D=100). Then, for each input pattern x we wish to recover the soft-assigns p_k(i|x), k=1,...,K_1 to the top K_1 components responsible for generating x (assume K_1 << K, say K_1=10). If the number of patterns is n (say n=100000) the complexity of the straight-forward approach is O(nKD). Is there some known work on reducing this complexity by e.g. exploiting the structure of the Gaussian mixture? |