Hi All,

I’m interested in applying deep learning architectures to the problem of source code author identification. I thought using an existing deep learning library would be easier and more reliable than writing one from scratch. Can somebody help me find a good deep learning library?

EDIT: I'm planning to use a stacked auto-encoder for this application.

Thanks

asked Feb 29 '12 at 03:35

Upul Bandara

edited Feb 29 '12 at 23:16

Unfortunately (or fortunately depending on your point of view) deep learning comprises a large collection of methods, algorithms, and models. Therefore, there isn't a single "deep learning" library. If you know what particular models and algorithms you want, you should put them into your question.

(Feb 29 '12 at 19:26) gdahl ♦

@gdahl I updated my question.

(Feb 29 '12 at 23:17) Upul Bandara

3 Answers:

At some point in the next few weeks, I plan to release my MATLAB code to train various kinds of RBMs and DBNs. I have to make the necessary modifications to change it from "usable by me" to "usable by everyone" and then it's good to go.

answered Feb 29 '12 at 08:19

Nicolas Le Roux

edited Feb 29 '12 at 08:19

Theano is a Python library that is optimized for speed and easy to use. There is a large body of tutorials based on Theano. The websites seem to be down at the moment, though :-/

This answer is marked "community wiki".

answered Feb 29 '12 at 03:46

Andreas Mueller

They seem to be having DNS issues. These URLs still work in the meantime: http://132.204.24.80/tutorial/ http://132.204.24.80/software/theano/

I guess it's also worth mentioning that Theano is actually a general purpose 'math expression compiler', and not so much a 'deep learning library', which just happens to be used for deep learning a lot because that is what its authors mainly use it for. Theano does not give you a 'DeepBeliefNetwork' class that does everything you need, but rather the means to implement it yourself without too much hassle.
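To make the "math expression compiler" idea concrete, here is a toy sketch in plain Python. It is not Theano's API: the classes `Var`, `Const`, `Add` and `Mul` are hypothetical names invented for this illustration. It only shows the general pattern of building an expression symbolically, deriving a gradient expression from it, and evaluating both later with concrete values.

```python
# Toy symbolic-expression sketch. NOT Theano's API: all class names
# here are hypothetical and exist only to illustrate the idea.

class Const:
    def __init__(self, value):
        self.value = value

    def eval(self, env):
        return self.value

    def grad(self, wrt):
        # The derivative of a constant is zero.
        return Const(0.0)


class Var:
    def __init__(self, name):
        self.name = name

    def eval(self, env):
        # Look up the concrete value bound to this variable.
        return env[self.name]

    def grad(self, wrt):
        # d(var)/d(wrt) is 1 for the same variable, 0 otherwise.
        return Const(1.0) if self is wrt else Const(0.0)


class Add:
    def __init__(self, a, b):
        self.a, self.b = a, b

    def eval(self, env):
        return self.a.eval(env) + self.b.eval(env)

    def grad(self, wrt):
        # Sum rule.
        return Add(self.a.grad(wrt), self.b.grad(wrt))


class Mul:
    def __init__(self, a, b):
        self.a, self.b = a, b

    def eval(self, env):
        return self.a.eval(env) * self.b.eval(env)

    def grad(self, wrt):
        # Product rule.
        return Add(Mul(self.a.grad(wrt), self.b),
                   Mul(self.a, self.b.grad(wrt)))


# Build y = w * x + w * w symbolically; nothing is computed yet.
x, w = Var("x"), Var("w")
y = Add(Mul(w, x), Mul(w, w))

# Derive the gradient expression dy/dw = x + 2w, again symbolically.
dy_dw = y.grad(w)

# Only now plug in numbers.
env = {"x": 3.0, "w": 2.0}
y_value = y.eval(env)
grad_value = dy_dw.eval(env)
```

A real expression compiler would also simplify the graph and generate fast native or GPU code from it, which this sketch does not attempt.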

There's also pylearn2: http://132.204.24.80/software/pylearn2/. I haven't used it, but it has some Theano-based implementations of deep learning models and algorithms, so maybe it's worth looking into.

(Feb 29 '12 at 06:04) Sander Dieleman

I forgot to mention that my own Theano-based RBM implementation is also on Github. It's specifically aimed at being able to try out different unit types or combinations of unit types, different parameterisations and different training strategies without too much overhead. For now it's only for RBMs, though: there is no code to facilitate stacking them into DBNs, although I do plan to write it.

(Feb 29 '12 at 08:53) Sander Dieleman

If you want to train convolutional neural nets on NVIDIA GPUs, Alex Krizhevsky's software is quite good: http://code.google.com/p/cuda-convnet/ According to my colleague, it is about 10 times faster than the Theano implementation of the same model that he wrote. Hopefully future versions of Theano will reach this level of performance.

(Feb 29 '12 at 19:21) gdahl ♦

There's also Torch7 (code). It uses Lua instead of Python, but if you know one, the other is incredibly easy to learn. Its speed is supposed to be on par with Theano's. It can automatically use CUDA as well as OpenMP (though I haven't tried either out yet).

Which library you end up using most likely depends on your personal taste. However, I can say what attracted me to Torch:

  1. While I really like Python, I never loved the indentation issue. Lua uses the more traditional 'end'. This means refactoring code is much easier.
  2. Torch uses the module-scheme for specifying the network's architecture (PyBrain is similar in this regard). After rewriting my code many times over, I've come to appreciate the simplicity of this style. Not only does it keep your code cleaner, but it's much easier to debug.

Of course, Torch's ecosystem is not nearly as mature as the SciPy community, so there are a ton of libraries that aren't available. If you are interested in stacking autoencoders or RBMs, I think it should fit your needs. You can always go back to Python or Matlab for plotting.
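For readers who haven't seen the module style mentioned in point 2, here is a rough sketch of the pattern, written in plain Python rather than Lua. It is not Torch's or PyBrain's actual API: `Linear`, `Sigmoid` and `Sequential` are hypothetical classes made up for this example, and the weights are arbitrary placeholders rather than trained values. The point is only that each layer is a small object with a `forward` method, and stacking encoder layers amounts to listing them in a container.

```python
import math

# Module-style network sketch. NOT the Torch or PyBrain API: the class
# names and weights below are hypothetical, for illustration only.


class Linear:
    """Fully connected layer: y = W x + b, with W given as a list of rows."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def forward(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.weights, self.bias)]


class Sigmoid:
    """Element-wise logistic non-linearity."""

    def forward(self, x):
        return [1.0 / (1.0 + math.exp(-v)) for v in x]


class Sequential:
    """Container that feeds the output of each module into the next one."""

    def __init__(self, *modules):
        self.modules = list(modules)

    def add(self, module):
        self.modules.append(module)
        return self

    def forward(self, x):
        for module in self.modules:
            x = module.forward(x)
        return x


# Two stacked encoder layers, 3 -> 2 -> 1 units (untrained placeholder weights).
encoder = Sequential(
    Linear([[0.0, 0.0, 0.0], [1.0, -1.0, 0.0]], [0.0, 0.0]),
    Sigmoid(),
    Linear([[1.0, 1.0]], [-1.0]),
    Sigmoid(),
)

code = encoder.forward([1.0, 1.0, 5.0])
```

Because every layer is a separate object, you can inspect or swap one layer at a time, which is where the debugging benefit comes from. Training (for example greedy layer-wise pretraining of each encoder) is left out of this sketch.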

answered Mar 02 '12 at 10:48

nop


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.