Hi all, I'm a newbie with LDA, so maybe this question is very stupid, but in my experiments LDA is usually much slower than PLSA... I can't decide whether it is worth researching. Many thanks!
The Replicated Softmax is an undirected topic model that dramatically outperforms LDA in terms of log probability and allows fast deterministic inference with a single forward pass. Since the probabilistic model is a lot better and the inference is easy, you should consider it. Depending on the requirements of your application, it may or may not be appropriate. For instance, it does not yield easily interpretable topics by itself (you would have to put LDA or something on top of it for that), but if you care more about having a good model and less about reading topic tea leaves, it might be useful.
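To make the "single forward pass" concrete, here is a minimal sketch of Replicated Softmax inference, assuming already-learned weights `W` and hidden biases `b_hidden` (names are mine, not from any particular library). The topic representation of a document is just a sigmoid of a weighted sum of its word counts, with the hidden biases scaled by document length:

```python
import numpy as np

def rsm_hidden_probs(counts, W, b_hidden):
    """Single forward pass of a Replicated Softmax model (sketch).

    counts:   (vocab_size,) word-count vector for one document
    W:        (vocab_size, num_hidden) learned visible-to-hidden weights
    b_hidden: (num_hidden,) learned hidden biases

    Returns p(h_j = 1 | v): a deterministic topic representation.
    """
    doc_len = counts.sum()                     # D: number of words in the doc
    # Hidden biases are shared across the D replicated softmax units,
    # hence the doc_len scaling.
    activation = counts @ W + doc_len * b_hidden
    return 1.0 / (1.0 + np.exp(-activation))   # element-wise sigmoid
```

No iteration or sampling is needed at inference time, which is why it is fast compared to fitting a posterior per document.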
More generally, researching a method even when it isn't computationally feasible isn't necessarily a bad idea. 20-25 years ago, people were researching algorithms like the Boltzmann Machine -- an algorithm that, while powerful, wasn't computationally feasible on the hardware generally available at the time, particularly with the training algorithms then known. Yet some people studied and researched them and were able to build on the idea. At some point it became a viable algorithm.
The speed of LDA depends on the implementation. I think the implementation in vowpal-wabbit, for example, should be faster than most implementations of PLSA. LDA is more interesting as a building block than as a thing in itself. In this it is not so different from PLSA (it is PLSA with priors), but it is more useful in a larger context. See for example this list of interesting papers that are based on and use LDA.

Yep, but this paper seems to use a variational method rather than sampling methods. It's said that variational methods are less accurate and more prone to local minima than sampling methods. Are variational methods the only choice to gain speed?
(Apr 06 '11 at 01:58)
ylqfp
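For reference, the sampling alternative mentioned above is usually collapsed Gibbs sampling. Here is a minimal, unoptimized sketch, just to show the full conditional that gets resampled per token; real implementations add many speedups:

```python
import numpy as np

def gibbs_lda(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Minimal collapsed Gibbs sampler for LDA (sketch).

    docs: list of token-id lists, e.g. [[0, 3, 3], [1, 2], ...]
    Returns (doc-topic counts, topic-word counts).
    """
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), num_topics))   # doc-topic counts
    n_kw = np.zeros((num_topics, vocab_size))  # topic-word counts
    n_k = np.zeros(num_topics)                 # per-topic totals
    z = [rng.integers(num_topics, size=len(d)) for d in docs]

    for d, doc in enumerate(docs):             # initialise the counts
        for i, w in enumerate(doc):
            k = z[d][i]
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                    # remove current assignment
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                # full conditional p(z = k | all other assignments)
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) \
                    / (n_k + vocab_size * beta)
                k = rng.choice(num_topics, p=p / p.sum())
                z[d][i] = k                    # add it back
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    return n_dk, n_kw
```

Sampling avoids the variational approximation but typically needs many sweeps over the corpus, which is the usual speed/accuracy trade-off between the two families.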
I don't understand what part of LDA is slow. The building part is a bit slow, yes, but inference is freakishly fast if you have an in-memory model. I exposed a 500-topic Wikipedia model as a web service in node.js and it was very fast. I feel another great advantage is that if new docs are added it is easy to update the model. Here is the benchmark I performed: http://mnesia.wikispaces.com/Organixe
(May 09 '11 at 04:56)
kpx
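For a sense of why inference against a fixed in-memory model is fast: once the topic-word distributions are frozen, scoring a new document reduces to a small fixed-point iteration over that document alone. A rough sketch, assuming a trained topic matrix `phi` (a hypothetical name) and an EM-style fold-in; this is one common approach, not necessarily what the benchmark above used:

```python
import numpy as np

def fold_in(counts, phi, alpha=0.1, iters=20):
    """Estimate the topic mixture of a new document against fixed topics.

    counts: (vocab_size,) word counts of the unseen document
    phi:    (num_topics, vocab_size) trained topic-word distributions
    Returns theta: (num_topics,) approximate topic proportions.
    """
    num_topics = phi.shape[0]
    theta = np.full(num_topics, 1.0 / num_topics)   # uniform start
    for _ in range(iters):
        # E-step: responsibility of each topic for each word type
        r = theta[:, None] * phi                    # (topics, vocab)
        r /= r.sum(axis=0, keepdims=True) + 1e-12
        # M-step: re-estimate the mixture from expected counts,
        # smoothed by the Dirichlet hyperparameter alpha
        theta = r @ counts + alpha
        theta /= theta.sum()
    return theta
```

The cost is proportional to the length of the one new document, not the training corpus, which is consistent with the fast web-service numbers reported above.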