|
Implementing the gradients for an LSTM by hand is really hard, so I'm looking for either: 1) a paper with clear derivative formulas, or 2) a ready-to-use package that eases the pain of handcrafting gradients. I have tried Theano, but it takes surprisingly long to precompute the gradients, and I found no trick for caching the result of the gradient computation. Thanks! |
|
I don't see what takes long with the LSTM gradients. I used it quite successfully a while back. That being said, the PhD thesis of Alex Graves has them all, I think. Also, PyBrain has code here: It should be noted that Alex Graves's thesis uses BPTT rather than the original LSTM algorithm, which is based on RTRL.
(Dec 28 '13 at 15:26)
Max
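For anyone who wants to see what the handcrafted version looks like, here is a rough sketch of BPTT for a single scalar LSTM unit, together with a finite-difference check of the gradients. It is not taken from Graves's thesis or from PyBrain: it assumes no peephole connections, a squared-error loss on the hidden state, and zero initial state, and every name in it (`forward`, `backward`, `max_grad_error`, `PARAMS`, and so on) is made up for illustration.

```python
import math

GATES = ("i", "f", "o", "g")  # input, forget, output gates and cell candidate

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(p, xs, ys):
    """Run the scalar LSTM over xs; return (loss, cache for the backward pass)."""
    h, c, loss, cache = 0.0, 0.0, 0.0, []
    for x, y in zip(xs, ys):
        # Pre-activation of each gate: input weight, recurrent weight, bias.
        a = {k: p["w" + k] * x + p["u" + k] * h + p["b" + k] for k in GATES}
        i, f, o = sigmoid(a["i"]), sigmoid(a["f"]), sigmoid(a["o"])
        g = math.tanh(a["g"])
        c_new = f * c + i * g
        h_new = o * math.tanh(c_new)
        cache.append((x, y, h, c, i, f, o, g, c_new, h_new))
        h, c = h_new, c_new
        loss += 0.5 * (h - y) ** 2
    return loss, cache

def backward(p, cache):
    """Backpropagation through time; returns d(loss)/d(param) for every param."""
    grads = {name: 0.0 for name in p}
    dh_next, dc_next = 0.0, 0.0  # gradients flowing back from step t+1
    for x, y, h_prev, c_prev, i, f, o, g, c, h in reversed(cache):
        dh = (h - y) + dh_next
        tc = math.tanh(c)
        dc = dh * o * (1.0 - tc * tc) + dc_next
        # Gradients with respect to the gate pre-activations.
        da = {
            "i": dc * g * i * (1.0 - i),
            "f": dc * c_prev * f * (1.0 - f),
            "o": dh * tc * o * (1.0 - o),
            "g": dc * i * (1.0 - g * g),
        }
        for k in GATES:
            grads["w" + k] += da[k] * x
            grads["u" + k] += da[k] * h_prev
            grads["b" + k] += da[k]
        dh_next = sum(da[k] * p["u" + k] for k in GATES)
        dc_next = dc * f
    return grads

def numeric_grad(p, xs, ys, name, eps=1e-6):
    """Central finite difference of the loss with respect to one parameter."""
    hi = dict(p)
    hi[name] += eps
    lo = dict(p)
    lo[name] -= eps
    return (forward(hi, xs, ys)[0] - forward(lo, xs, ys)[0]) / (2 * eps)

def max_grad_error(p, xs, ys):
    """Largest absolute gap between analytic and numeric gradients."""
    _, cache = forward(p, xs, ys)
    analytic = backward(p, cache)
    return max(abs(analytic[n] - numeric_grad(p, xs, ys, n)) for n in p)

# Arbitrary illustrative parameters and data (not from any real model).
PARAMS = {
    "wi": 0.5, "ui": -0.3, "bi": 0.1,
    "wf": -0.4, "uf": 0.6, "bf": 0.2,
    "wo": 0.7, "uo": 0.1, "bo": -0.1,
    "wg": 0.9, "ug": -0.5, "bg": 0.05,
}
XS = [0.5, -1.0, 0.25, 0.8]
YS = [0.1, -0.2, 0.3, 0.0]
```

Calling `max_grad_error(PARAMS, XS, YS)` compares the hand-derived gradients against finite differences, which is probably the quickest way to catch a sign or index mistake before moving on to vectors and matrices.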
|