It is really hard to implement the gradients for an LSTM by hand.

So I'm looking for either (1) a paper with clear derivative formulas or (2) a ready-to-use package that eases the pain of handcrafting the gradients.

I have tried Theano, but it takes a surprisingly long time to precompute the gradients, and I found no trick for caching the result of the gradient computation.

Thanks!

asked Dec 20 '13 at 03:58

Konstantin


2 Answers:

I don't see what takes so long with the LSTM gradients. I used it quite successfully a while back.

That being said, Alex Graves's PhD thesis has them all, I think. Also, PyBrain has code here:

answered Dec 23 '13 at 15:59

Justin Bayer

It should be noted that Alex Graves's thesis uses BPTT rather than the original LSTM algorithm, which is based on RTRL.

(Dec 28 '13 at 15:26) Max
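
For anyone who wants to see what the BPTT gradients look like when written out by hand, here is a minimal sketch in plain Python. It is not taken from the thesis or from PyBrain. It assumes a single-unit LSTM with scalar input, no peephole connections, and a squared-error loss on the hidden state at every step. All names (`forward`, `backward`, `numeric_grads`, the `w`/`u`/`b` parameter keys) are made up for this illustration. The finite-difference check at the end is the usual way to catch mistakes in handcrafted gradients.

```python
import math

GATES = ("i", "f", "o", "g")  # input, forget, output gates and cell candidate


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, xs, ys):
    """Run the scalar LSTM over the sequence; return the loss and a per-step cache."""
    h, c = 0.0, 0.0
    cache, loss = [], 0.0
    for x, y in zip(xs, ys):
        h_prev, c_prev = h, c
        # Pre-activation of each gate: input weight, recurrent weight, bias.
        pre = {k: params["w" + k] * x + params["u" + k] * h_prev + params["b" + k]
               for k in GATES}
        i, f, o = sigmoid(pre["i"]), sigmoid(pre["f"]), sigmoid(pre["o"])
        g = math.tanh(pre["g"])
        c = f * c_prev + i * g
        h = o * math.tanh(c)
        loss += 0.5 * (h - y) ** 2
        cache.append((x, y, h_prev, c_prev, i, f, o, g, c, h))
    return loss, cache


def backward(params, cache):
    """Backpropagation through time: walk the cache in reverse, accumulating gradients."""
    grads = {name: 0.0 for name in params}
    dh_next, dc_next = 0.0, 0.0  # gradients flowing in from step t+1
    for x, y, h_prev, c_prev, i, f, o, g, c, h in reversed(cache):
        dh = (h - y) + dh_next
        tc = math.tanh(c)
        dc = dh * o * (1.0 - tc * tc) + dc_next
        # Gradients with respect to each gate's pre-activation.
        d_pre = {
            "i": dc * g * i * (1.0 - i),
            "f": dc * c_prev * f * (1.0 - f),
            "o": dh * tc * o * (1.0 - o),
            "g": dc * i * (1.0 - g * g),
        }
        for k in GATES:
            grads["w" + k] += d_pre[k] * x
            grads["u" + k] += d_pre[k] * h_prev
            grads["b" + k] += d_pre[k]
        dh_next = sum(d_pre[k] * params["u" + k] for k in GATES)
        dc_next = dc * f
    return grads


def numeric_grads(params, xs, ys, eps=1e-6):
    """Central finite differences, used only to check the analytic gradients."""
    out = {}
    for name in params:
        plus, minus = dict(params), dict(params)
        plus[name] += eps
        minus[name] -= eps
        out[name] = (forward(plus, xs, ys)[0] - forward(minus, xs, ys)[0]) / (2 * eps)
    return out


# Arbitrary fixed parameters and a toy sequence (illustrative values only).
names = [p + k for p in "wub" for k in GATES]
params = {name: 0.1 * (idx + 1) * (-1) ** idx for idx, name in enumerate(names)}
xs = [0.5, -0.3, 0.8, 0.1]
ys = [0.2, 0.1, -0.4, 0.3]

loss, cache = forward(params, xs, ys)
analytic = backward(params, cache)
numeric = numeric_grads(params, xs, ys)
max_err = max(abs(analytic[n] - numeric[n]) for n in params)
print("max |analytic - numeric| =", max_err)
```

If the printed difference is not tiny, one of the backward formulas is wrong. The same check carries over to the vector case, where the scalar products become matrix products.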

You can find a tutorial on how to implement an LSTM with Theano: here

answered Dec 18 '14 at 13:29

Christian H


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.