I'm interested in implementing Ruslan Salakhutdinov and Geoffrey Hinton's RBM-based topic modeling method described here: http://www.mit.edu/~rsalakhu/papers/repsoft.pdf Has anyone tried implementing this? Is it possible to distribute the algorithm across processes or machines? Thanks, Timmy Wilson
I am that student. Here it is: http://www.fylance.de/rsm/
I am interested in this too. Thank you.
(Nov 11 '11 at 04:54)
Visarga
Excellent paper; I too have an interest in this. If only we could learn from millions of unlabeled documents (the internet is full of text) and then use that to improve classification, clustering, and other text tasks.
I have started to implement it, but parallelizing it across many machines in a cluster will not be trivial. For large vocabularies, a small degree of parallelism is fairly easy to get with multiple processes on an SMP machine by splitting the weights among them. For small vocabularies (10,000 words or so), a straightforward GPU implementation can probably run quite quickly. I don't currently have finished code that I can release, but you might try contacting Dr. Salakhutdinov and asking whether he will make MATLAB code available at some point.
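To make the weight-splitting idea a bit more concrete, here is a minimal NumPy sketch, not a finished implementation. It assumes the hidden-unit conditional as I read it from the paper, p(h_j = 1 | v) = sigmoid(D * a_j + sum_k v_k * W_kj), where v holds per-document word counts and D is the document length. The function names, the chunk count, and the toy sizes are all my own choices for illustration. Each partial product in the list comprehension touches only one slice of the vocabulary rows of W, so in principle each could be computed by a separate worker process and the results summed afterwards; the sketch runs them serially just to show that the split gives the same answer.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probs(v, W, a):
    """Hidden activations computed with the full weight matrix.

    v: (n_docs, K) word-count matrix, W: (K, F) weights, a: (F,) hidden biases.
    The hidden bias is scaled by document length D (assumed from the paper).
    """
    D = v.sum(axis=1, keepdims=True)
    return sigmoid(v @ W + D * a)

def hidden_probs_chunked(v, W, a, n_chunks=4):
    """Same quantity, but with W split by vocabulary rows into n_chunks pieces.

    Each partial product only needs its own slice of v and W, so it could be
    handed to a separate worker; here they are evaluated one after another.
    """
    K = W.shape[0]
    chunks = np.array_split(np.arange(K), n_chunks)
    partials = [v[:, idx] @ W[idx, :] for idx in chunks]
    D = v.sum(axis=1, keepdims=True)
    return sigmoid(sum(partials) + D * a)

# Toy data (hypothetical sizes): 8 documents, 2000-word vocabulary, 50 topics.
rng = np.random.default_rng(0)
K, F = 2000, 50
W = 0.01 * rng.standard_normal((K, F))
a = np.zeros(F)
v = rng.poisson(0.05, size=(8, K)).astype(float)

full = hidden_probs(v, W, a)
split = hidden_probs_chunked(v, W, a, n_chunks=4)
print(np.allclose(full, split))
```

The reduction over chunks is just a sum of (n_docs, F) matrices, which is cheap to communicate when F is small relative to the vocabulary size. The visible-side softmax needs a normalizer over the whole vocabulary, so splitting that step takes an extra reduction that this sketch does not cover.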
A student of mine has a Python implementation that more or less reproduces the paper's results. Drop me a line if you are interested, and I'll arrange the rest.
Yes, definitely! Python is preferred. Thanks, osdf