
See the tutorial from Gunnar Martinsson on randomized methods for computing the SVD, or the corresponding video tutorial.

Has anyone implemented any of these algorithms? In the talk it is said that they are very fast, with very little compromise in accuracy compared to the exact SVD. The code below is very simple. I tested it on a matrix A of size 1000x5000, but the results are not nearly as good as the SVD's, i.e. the SVD is way better. The RSVD does give some improvement, though, so I don't think it's completely broken. What could be the problem?

from numpy import dot
from numpy.linalg import qr, svd
from numpy.random import standard_normal

def rsvd(A, k):
    # Gaussian random test matrix (originally a uniform random() was used here)
    n = A.shape[1]
    Omega = standard_normal((n, k))
    # Sample the range of A
    Y = dot(A, Omega)
    # Orthonormalize the samples; numpy's qr gives the reduced ("economic") factorization by default
    Q, R = qr(Y)
    # Project A onto the subspace spanned by Q
    B = dot(Q.T, A)
    # SVD of the small k x n matrix; note that numpy's svd returns V transposed
    Uhat, Sigma, Vt = svd(B, full_matrices=False)
    # Map the left singular vectors back to the original space
    U = dot(Q, Uhat)
    return U, Sigma, Vt

asked Nov 05 '10 at 02:31

Oliver Mitevski

edited Nov 06 '10 at 23:48

I have the same question. I implemented the randomized method for computing the SVD based on the tutorial from Gunnar Martinsson. My code is similar to the code given above. However, the approximation is very poor compared to the exact SVD result. The answers that follow this post do not explain why this happens. Is the theory incorrect? Can anyone explain why?

(Feb 11 at 15:57) Shawn

4 Answers:

Oliver, if you're interested in exploring the stochastic decomposition algorithm for yourself, you can get inspiration from existing implementations:

  1. The authors themselves implemented their algorithm in MATLAB.
  2. Daisuke Okanohara's redsvd, a clean, in-core implementation on top of the beautiful eigen3 C++ template library.
  3. For large-scale problems (streamed input, won't fit in RAM), have a look at gensim's implementation.

As for accuracy, it depends on the spectral properties of your problem. In Natural Language Processing, the spectrum declines very gradually (no sharp drop-off), so you'd need many power iterations for the stochastic algorithm to be reasonably accurate. Even massive oversampling (like 2*k) won't help you much.
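To illustrate, here is a minimal sketch of such power iterations bolted onto the prototype from the question. This is not the authors' own code; the names rsvd_pi, q, and p are chosen here for illustration, with q the number of power iterations and p an oversampling margin:

import numpy as np
from numpy import dot
from numpy.linalg import qr, svd
from numpy.random import standard_normal

def rsvd_pi(A, k, q=2, p=5):
    # Oversample by p extra random directions; truncate back to k at the end
    n = A.shape[1]
    Omega = standard_normal((n, k + p))
    Q, _ = qr(dot(A, Omega))
    # Power iterations drive the sample towards the dominant singular
    # subspace; re-orthonormalizing at each step keeps it numerically stable
    for _ in range(q):
        Z, _ = qr(dot(A.T, Q))
        Q, _ = qr(dot(A, Z))
    B = dot(Q.T, A)
    Uhat, Sigma, Vt = svd(B, full_matrices=False)
    U = dot(Q, Uhat)
    # Keep only the top-k factors
    return U[:, :k], Sigma[:k], Vt[:k, :]

Each power iteration sharpens the relative weight of the large singular values in the sample, which is exactly what a slowly decaying spectrum needs.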

Performance-wise, people just use a standard Gaussian for the random matrix Omega -- at least I have never seen an implementation use the SRFT (subsampled random Fourier transform) suggested in section 4.6 of the original paper. My guess is that performance would actually suffer in practice, due to the more complex memory access patterns.

answered Nov 06 '10 at 08:29

Radim

edited Nov 06 '10 at 12:41

Thanks a lot for pointing me to these great packages, redsvd and eigen3. It would be great if I could use them from Python. The eigen3 Python API is in a pre-alpha state, but I hope it will work for my needs. gensim is already in Python, which is great. I'll check their MATLAB implementation to see why my code doesn't work as expected.

(Nov 06 '10 at 23:45) Oliver Mitevski

Can anyone give a detailed explanation of the MATLAB code by the authors themselves? Their MATLAB code takes a power iteration approach, which is slightly different from the algorithm titled "PROTOTYPE FOR RANDOMIZED SVD" in their paper.

(Feb 15 at 20:23) Shawn

Have you had a look at this Python code? I am not sure it does exactly what you want, but it may be a source of inspiration.

answered Nov 24 '10 at 10:23

Gael Varoquaux

Thanks Gael, and especially to Alexandre for sharing that piece of code. I tested it and it works fine. That's exactly what I was looking for.

(Nov 25 '10 at 11:28) Oliver Mitevski

There's an updated version of it in scikits.learn that works with scipy.sparse matrices as well.

(Nov 25 '10 at 11:30) Alexandre Passos ♦

Hm, I couldn't find it in the scikits.learn package. Where exactly is it? Or is it under a different name?

(Nov 25 '10 at 12:19) Oliver Mitevski

It was committed a few days ago and I guess hasn't been released yet. It's in scikits.learn.utils.extmath, as fast_svd.

(Nov 25 '10 at 12:20) Alexandre Passos ♦

What random numbers are you using? They should be Gaussian with mean 0 and variance 1, or something similar. If they are uniform random numbers on [0, 1], the method gets a lot worse. Also, you should allow for some numerical error by computing the truncated SVD of the first k+5 or so dimensions instead of k, and then truncating back to k; see the sketch below.
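As a minimal sketch of that oversample-and-truncate suggestion (assuming the rsvd prototype from the question above; the wrapper name and the margin p=5 are just illustrations of the "k+5 or so" advice):

def rsvd_truncated(A, k, p=5):
    # Run the randomized SVD at a slightly higher rank k + p ...
    U, Sigma, Vt = rsvd(A, k + p)
    # ... and then keep only the top-k factors
    return U[:, :k], Sigma[:k], Vt[:k, :]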

answered Nov 05 '10 at 06:42

Alexandre Passos ♦

Thanks a lot, Alexandre. I was using the uniform random function, so I changed it to the standard_normal function, which gives zero-mean, unit-variance numbers, but I did not get any better results. I really hoped that this was the bug; I will check their MATLAB implementation.

(Nov 06 '10 at 23:27) Oliver Mitevski

Suppose matrix A has rank k. The method given by Gunnar Martinsson is supposed to produce a fast and accurate rank-k approximation to the exact SVD of A.

When you use Gunnar Martinsson's method to obtain an approximate SVD whose rank is lower than k, the accuracy can become very bad. However, the computation becomes faster.

In summary, Gunnar Martinsson's method is supposed to be a fast approximation to the exact SVD of the same rank. You can verify this very easily by writing a short program, as sketched below.
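As a minimal sketch of such a check (assuming the corrected rsvd from the question above; the sizes and the rank k are arbitrary choices):

import numpy as np

# Build a test matrix of exact rank k
m, n, k = 1000, 500, 20
A = np.dot(np.random.standard_normal((m, k)),
           np.random.standard_normal((k, n)))

# Exact rank-k SVD for reference
U, S, Vt = np.linalg.svd(A, full_matrices=False)
A_svd = np.dot(U[:, :k] * S[:k], Vt[:k, :])

# Randomized approximation at the same rank k
Ur, Sr, Vtr = rsvd(A, k)
A_rsvd = np.dot(Ur * Sr, Vtr)

# Both errors should be near machine precision, since rank(A) = k
print(np.linalg.norm(A - A_svd))
print(np.linalg.norm(A - A_rsvd))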

answered Feb 11 at 18:41

Shawn
