Hi all, I'm a newbie to LDA. Maybe this question is very stupid, but in my experiments LDA is usually much slower than PLSA... I can't work out whether it is worth researching?

Many thanks!

asked Apr 03 '11 at 21:51

ylqfp

edited Apr 03 '11 at 21:53


3 Answers:

The replicated softmax is an undirected topic model that dramatically outperforms LDA in terms of log probability and allows fast deterministic inference using a single forward pass. Since the probabilistic model is a lot better and the inference is easy, you should consider it. Depending on the requirements of your application, it may or may not be appropriate. For instance, it does not yield easily interpretable topics by itself (you would have to put LDA or something on top of it for that), but if you care more about having a good model and less about reading topic tea leaves, it might be useful.

answered May 07 '11 at 15:46

gdahl ♦

More generally, researching a method isn't necessarily a bad idea even if it isn't computationally feasible yet. 20-25 years ago, people were researching algorithms like the Boltzmann machine: an algorithm that, while powerful, wasn't computationally feasible on the hardware generally available at the time, particularly with the training algorithms that were known then.

Yet some people studied them and were able to build on the idea, and at some point it became a viable algorithm.

This answer is marked "community wiki".

answered Apr 06 '11 at 14:24

Brian Vandenberg

The speed of LDA depends on the implementation. I think the implementation in vowpal-wabbit, for example, should be faster than most implementations of PLSA.

LDA is more interesting as a building block than as a thing in itself. In this it is not so different from PLSA (it is PLSA with priors), but it is more useful in a larger context. See for example this list of interesting papers that are based on and use LDA.
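
To make the "PLSA with priors" remark concrete, here is a minimal sketch in plain Python. It runs the usual PLSA EM updates and optionally adds Dirichlet pseudo-counts in the M-step, which gives a MAP point estimate. The names `fit_topics`, `alpha` and `beta` are made up for this illustration, and this is not the variational or sampling inference normally used for LDA; with `alpha = beta = 1` it reduces to ordinary PLSA.

```python
import math
import random


def normalize(weights):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def fit_topics(docs, n_topics, vocab_size, alpha=1.0, beta=1.0,
               n_iter=50, seed=0):
    """EM for PLSA, with optional Dirichlet pseudo-counts (MAP estimate).

    docs: list of non-empty dicts {word_id: count}.
    alpha, beta >= 1: Dirichlet parameters for the document-topic and
    topic-word distributions; 1.0 means "no prior", i.e. plain PLSA.
    Returns (theta, phi) with theta[d][k] = p(topic k | doc d) and
    phi[k][w] = p(word w | topic k).
    """
    rng = random.Random(seed)
    theta = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in docs]
    phi = [normalize([rng.random() + 0.1 for _ in range(vocab_size)])
           for _ in range(n_topics)]
    for _ in range(n_iter):
        # Expected counts start at the prior pseudo-counts (zero for PLSA).
        doc_topic = [[alpha - 1.0] * n_topics for _ in docs]
        topic_word = [[beta - 1.0] * vocab_size for _ in range(n_topics)]
        for d, doc in enumerate(docs):
            for w, count in doc.items():
                # E-step: posterior over topics for this (doc, word) pair.
                resp = normalize([theta[d][k] * phi[k][w]
                                  for k in range(n_topics)])
                for k in range(n_topics):
                    doc_topic[d][k] += count * resp[k]
                    topic_word[k][w] += count * resp[k]
        # M-step: renormalize the expected counts.
        theta = [normalize(row) for row in doc_topic]
        phi = [normalize(row) for row in topic_word]
    return theta, phi


def log_likelihood(docs, theta, phi):
    """Log-likelihood of the observed word counts under the mixture model."""
    total = 0.0
    for d, doc in enumerate(docs):
        for w, count in doc.items():
            p = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            total += count * math.log(p)
    return total
```

Calling `fit_topics(docs, 2, 4)` fits plain PLSA, while `fit_topics(docs, 2, 4, alpha=1.1, beta=1.1)` smooths both distributions, so no word ends up with zero probability under a topic.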

answered Apr 03 '11 at 22:05

Alexandre Passos ♦

Yep, but this paper seems to use a variational method rather than sampling methods. It's said that variational methods are less accurate and more prone to local minima than sampling methods. Is a variational method the only choice to gain speed?

(Apr 06 '11 at 01:58) ylqfp

I don't understand: what part of LDA is slow? The building part is a bit slow, yes, but inference is freakishly fast if you have an in-memory model. I exposed a 500-topic Wikipedia model as a web service in node.js and it was very fast. I feel another great advantage is that if new docs are added it is easy to update the model. Here is the benchmark I performed: http://mnesia.wikispaces.com/Organixe

(May 09 '11 at 04:56) kpx
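
As a rough illustration of why inference on a new document can be cheap once the topics are in memory, here is a minimal sketch in plain Python. It is not the code behind the web service mentioned above. It assumes the topic-word matrix `phi` has already been trained and only re-estimates the topic proportions of one document, using a simple MAP fold-in; `infer_topics`, `phi` and `alpha` are hypothetical names chosen for the example.

```python
def infer_topics(doc, phi, alpha=1.1, n_iter=30):
    """Estimate topic proportions for one document with the topics held fixed.

    doc: dict {word_id: count} for the new document.
    phi: phi[k][w] = p(word w | topic k), already trained and kept in memory.
    alpha: Dirichlet parameter (> 1 here, so every topic keeps some mass).
    """
    n_topics = len(phi)
    theta = [1.0 / n_topics] * n_topics
    for _ in range(n_iter):
        # Start from the prior pseudo-counts.
        counts = [alpha - 1.0] * n_topics
        for w, count in doc.items():
            weights = [theta[k] * phi[k][w] for k in range(n_topics)]
            total = sum(weights)
            if total == 0.0:
                continue  # word has zero probability under every topic
            for k in range(n_topics):
                counts[k] += count * weights[k] / total
        norm = sum(counts)
        theta = [c / norm for c in counts]
    return theta
```

Each pass costs on the order of (distinct words in the document) x (number of topics), and nothing in `phi` is updated, which is why serving a trained model is so much cheaper than building it.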

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.