|
I want to calculate the Gramian matrix, the matrix of inner products of a set of vectors. Is there a good algorithm to do this when the vectors are dense that's faster than O(n^3)? Is there a good algorithm when the vectors are sparse that's faster than O(n^2 m) (where m is the average number of non-zero entries per vector)? Alternately, is there a fast way to cluster these vectors more directly, or to reduce their dimensionality?
|
Maybe you can use the SVD of the matrix containing the vectors (as columns or rows). If you have n vectors of dimension m and you put them as columns, you get an m x n matrix A, and the Gram matrix you want is G = A'A. Using the SVD decomposition A = USV', you get G = V S^2 V'. (You don't need the left singular vectors U, and you can often request that your library not compute them, as is the case with R.) Of course this ultimately requires the multiplication of V (n x r), S^2 (r x r), and V' (r x n), where r is the rank of A (the number of linearly independent vectors you started with), but the SVD could additionally be used to cluster the vectors and reduce their dimensionality through PCA (i.e. instead of all r singular values, keep only the first two or three). Computing the SVD will be slower than computing A'A directly; both have complexity O(mn^2).
(Mar 28 '12 at 21:43)
Mark Borgerding
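A minimal sketch of the identity above in NumPy (with made-up example data; the shapes m = 8, n = 5 are arbitrary): the Gram matrix from the thin SVD matches the direct product A'A.

```python
import numpy as np

# Hypothetical example: n = 5 vectors of dimension m = 8, stored as columns of A.
rng = np.random.default_rng(0)
m, n = 8, 5
A = rng.standard_normal((m, n))

# Direct Gram matrix: G[i, j] = <a_i, a_j>.
G_direct = A.T @ A

# Via the SVD A = U S V': G = A'A = V S^2 V'.
# full_matrices=False gives the thin SVD; U is discarded since G doesn't need it.
_, s, Vt = np.linalg.svd(A, full_matrices=False)
G_svd = Vt.T @ np.diag(s**2) @ Vt

assert np.allclose(G_direct, G_svd)
```

Keeping only the first two or three columns of V (scaled by the corresponding singular values) gives the PCA-style dimensionality reduction mentioned above.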
Except that you can truncate the SVD computation with randomized algorithms and make it faster, implicitly clustering the data: http://www.stanford.edu/group/mmds/slides2010/Martinsson.pdf
(Mar 29 '12 at 09:11)
Alexandre Passos ♦
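The randomized truncation can be sketched in a few lines of NumPy, following the range-finder scheme in the linked slides (this is an illustrative sketch, not a tuned implementation; the function name, the oversampling parameter, and the test data are all made up here):

```python
import numpy as np

def randomized_svd(A, k, oversample=5, seed=None):
    """Approximate rank-k SVD: sample the range of A with a Gaussian
    test matrix, orthonormalize, then SVD the small projected matrix."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)            # orthonormal basis for range(A Omega)
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vt[:k, :]

# Test on a matrix of exact rank 3, so the rank-3 truncation is exact.
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 3)) @ rng.standard_normal((3, 50))
U, s, Vt = randomized_svd(A, k=3)

# The truncated V S^2 V' then recovers the full Gram matrix A'A.
G_approx = Vt.T @ np.diag(s**2) @ Vt
```

For data that is only approximately low-rank, the same code gives an approximate Gram matrix at roughly O(mnk) cost instead of O(mn^2).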
|
If you arrange the vectors as rows of a matrix V, then V V' is the Gramian matrix, so you can use fast algorithms for dense/sparse matrix multiplication.
This approach is probably great, since most libraries implement matrix multiplication far faster than I could. Thanks!
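The row arrangement above is a one-liner in NumPy (the 4 x 6 example matrix is arbitrary):

```python
import numpy as np

# Vectors as rows of V (here 4 hypothetical vectors in R^6); the Gram
# matrix of all pairwise inner products is the single product V V'.
V = np.arange(24, dtype=float).reshape(4, 6)
G = V @ V.T

# Entry (i, j) is the inner product of row i and row j.
assert np.isclose(G[1, 2], V[1] @ V[2])
```

For sparse data, the same expression `V @ V.T` works on `scipy.sparse` matrices and exploits the sparsity during the multiplication.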