The tutorial from Gunnar Martinsson on randomized methods for computing the SVD (there is also a video tutorial): has anyone implemented any of these algorithms? In the talk it is said that they are very fast, with very little loss of accuracy compared to the exact SVD. The code below is very simple. I tested it on an A of size 1000x5000, but I don't get results very close to the SVD, i.e. the SVD is way better. There is some improvement from the RSVD though, so I don't think it's completely broken. What could be the problem?

    from numpy import dot
    from numpy.random import standard_normal
    from scipy.linalg import qr, svd

    def rsvd(A, k):
        N = A.shape[1]
        # Omega = random((N, k))             # uniform version, see the comments below
        Omega = standard_normal((N, k))      # Gaussian test matrix, mean 0, variance 1
        Y = dot(A, Omega)                    # sample the range of A
        Q, R = qr(Y, mode='economic')        # orthonormal basis for that sample
        B = dot(Q.T, A)                      # project A onto the basis
        Uhat, Sigma, Vt = svd(B, full_matrices=False)   # note: svd returns V transposed
        U = Q.dot(Uhat)
        return U, Sigma, Vt
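As a quick way to sanity-check an implementation like the one above, one can compare it against the exact truncated SVD on a matrix with a decaying spectrum. This is only an illustrative sketch (the synthetic test matrix and the rank are made up, and it assumes the rsvd function above plus numpy/scipy):

    import numpy as np

    # build a synthetic matrix with a quickly decaying spectrum (illustrative only)
    m, n, k = 1000, 5000, 20
    U0, _ = np.linalg.qr(np.random.standard_normal((m, k)))
    V0, _ = np.linalg.qr(np.random.standard_normal((n, k)))
    A = np.dot(U0 * np.logspace(0, -3, k), V0.T)        # rank k, singular values 1 .. 1e-3
    A += 1e-4 * np.random.standard_normal((m, n))       # small noise floor

    U, s, Vt = rsvd(A, k)                                # rsvd as defined above
    Ue, se, Vte = np.linalg.svd(A, full_matrices=False)

    err_rsvd = np.linalg.norm(A - np.dot(U * s, Vt))
    err_svd = np.linalg.norm(A - np.dot(Ue[:, :k] * se[:k], Vte[:k, :]))
    print(err_rsvd, err_svd)                             # should be of the same order

On a matrix like this the two errors should be close; the gap opens up when the spectrum decays slowly, which is what the answers below get at.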
Oliver, if you're interested in exploring the stochastic decomposition algorithm for yourself, you can get inspiration from ready implementations such as redsvd (C++, built on eigen3) or the stochastic SVD in gensim.
As for accuracy, it depends on the spectral properties of your problem. In natural language processing the spectrum declines very gradually (there is no sharp drop-off), so you'd need many power iterations for the stochastic algorithm to be reasonably accurate, and even massive oversampling (like ...) may not be enough. Performance-wise, people just use a standard Gaussian for the random matrix.

Thanks a lot for pointing me to these great packages, redsvd and eigen3. It would be great if I could use them from Python; the eigen3 Python API is in a pre-alpha state, but I hope it will work for my needs. gensim is already in Python, which is great. I'll check their Matlab implementation to see why my code doesn't work as expected.
(Nov 06 '10 at 23:45)
Oliver Mitevski
Can anyone give a detailed explanation of the Matlab code by the authors themselves? Their Matlab code takes a power iteration approach, which is slightly different from the algorithm titled "PROTOTYPE FOR RANDOMIZED SVD" in their paper.
(Feb 15 '14 at 20:23)
Shawn
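For what it's worth, here is a rough sketch of the power (subspace) iteration idea that the answer above and Shawn's comment refer to. This is not the authors' Matlab code, just an illustration of the scheme described in the Halko/Martinsson/Tropp paper: take a few extra passes over A, re-orthonormalizing in between, which sharpens the computed basis when the spectrum decays slowly. The parameter names and defaults (n_iter, the oversampling p) are illustrative:

    import numpy as np
    from scipy.linalg import qr, svd

    def rsvd_power(A, k, p=10, n_iter=3):
        """Randomized SVD with power iterations (sketch, not the authors' code)."""
        m, n = A.shape
        Omega = np.random.standard_normal((n, k + p))    # oversampled Gaussian test matrix
        Q, _ = qr(np.dot(A, Omega), mode='economic')
        for _ in range(n_iter):                          # extra passes over A
            Q, _ = qr(np.dot(A.T, Q), mode='economic')   # re-orthonormalize each time
            Q, _ = qr(np.dot(A, Q), mode='economic')
        B = np.dot(Q.T, A)
        Uhat, s, Vt = svd(B, full_matrices=False)
        U = np.dot(Q, Uhat)
        return U[:, :k], s[:k], Vt[:k, :]                # truncate back to rank k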
Have you had a look at this Python code? I am not sure it does exactly what you want, but it may be a source of inspiration.

Thanks Gael, and especially Alexandre, for sharing that piece of code. I tested it and it works fine. That's exactly what I was looking for.
(Nov 25 '10 at 11:28)
Oliver Mitevski
There's an updated version of it in scikits.learn that works with scipy.sparse matrices as well.
(Nov 25 '10 at 11:30)
Alexandre Passos ♦
Hm, I couldn't find it in the scikits.learn package. Where exactly is it? Or is it under a different name?
(Nov 25 '10 at 12:19)
Oliver Mitevski
It was committed a few days ago and I guess hasn't been released yet. It's in scikits.learn.utils.extmath, as fast_svd.
(Nov 25 '10 at 12:20)
Alexandre Passos ♦
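For anyone looking for it later, a hedged usage sketch based on Alexandre's description. I'm assuming fast_svd takes the matrix and the number of requested components as its first two arguments; the exact signature may differ between versions, and in later scikit-learn releases the equivalent function was renamed randomized_svd:

    # assumption: fast_svd(M, k) -> (U, S, V), with the right factor returned transposed
    from scikits.learn.utils.extmath import fast_svd

    U, S, V = fast_svd(A, k)   # per the comment above, scipy.sparse matrices should work too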
What random numbers are you using? They should be Gaussian with mean 0 and variance 1, or something similar; if they are uniform random numbers on [0, 1], the method gets a lot worse. Also, you should allow for some numerical error by computing the truncated SVD of the first k+5 or so dimensions instead of k, and then truncating.

Thanks a lot, Alexandre. I was using the uniform random function, so I changed that to the standard_normal function, which gives zero-mean, unit-variance numbers, but I did not get any better results. I really hoped that this was the bug; I will check their Matlab implementation.
(Nov 06 '10 at 23:27)
Oliver Mitevski
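To make the k+5 suggestion concrete, a tiny sketch using the rsvd from the question: ask for a few extra dimensions and keep only the leading k components at the end (the value 5 comes from the comment above; any small number works similarly):

    # oversampling trick: compute a slightly larger decomposition, then truncate
    U, s, Vt = rsvd(A, k + 5)
    U, s, Vt = U[:, :k], s[:k], Vt[:k, :]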
Suppose the matrix A has rank k. When you use Gunnar Martinsson's method to obtain an approximate SVD whose rank is lower than k, the accuracy can become very bad; the computation does become faster, though. In summary, Gunnar Martinsson's method is meant to be a fast approximation to the exact SVD of the same rank. You can verify this very easily by writing a program.
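In the spirit of "verify this by writing a program", here is a minimal sketch (sizes and ranks are made up): build a matrix of known rank k, then compare the error of the randomized approximation at rank k and at a rank below k, using the rsvd from the question:

    import numpy as np

    m, n, k = 500, 800, 30
    A = np.dot(np.random.standard_normal((m, k)),
               np.random.standard_normal((k, n)))     # A has exact rank k

    for r in (k, k // 2):                             # requested rank: k, then below k
        U, s, Vt = rsvd(A, r)
        err = np.linalg.norm(A - np.dot(U * s, Vt)) / np.linalg.norm(A)
        print(r, err)                                 # near zero at r = k, large below k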
I have the same question here. I implemented the randomized method for computing the SVD based on the tutorial from Gunnar Martinsson, and my code is similar to the code given above. However, the approximation is very poor compared to the exact SVD result. The answers that follow this post do not explain why this happens. Is the theory incorrect? Can anyone explain why?