
I'm looking for a paper that could serve as a guideline for choosing the hyperparameters of a deep architecture, such as stacked auto-encoders or deep belief networks. There are many hyperparameters and I'm very confused about how to choose them. Cross-validation is not an option, since training takes a very long time!

asked Apr 28 '14 at 08:49


Alex Twain


One Answer:

James Bergstra's work is a good place to start on this topic:

http://www.eng.uwaterloo.ca/~jbergstr/research.html#modelsearch

http://jaberg.github.io/hyperopt/

If you have too much training data to explore many different configurations of hyper-parameters, try running the hyper-parameter search on a random sample of your training data. Once you have found a good set of hyper-parameters, use them to train on the full training set. With luck, the hyper-parameters that performed best on the random sample will also give good results on the full training set.
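The subsample strategy above can be sketched in a few lines of plain Python. This is only an illustrative outline, not code from the answer: `train_and_score`, `sample_params`, and the particular hyper-parameters (`lr`, `n_hidden`, `dropout`) are hypothetical stand-ins you would replace with your own training routine and search space.

```python
import random

def train_and_score(data, params):
    """Placeholder: train a model with `params` on `data` and return a
    validation score (higher is better). Replace with real training code.
    The toy objective below just lets the sketch run end-to-end."""
    return -abs(params["lr"] - 0.01) + 0.001 * params["n_hidden"]

def sample_params(rng):
    """Draw one random hyper-parameter configuration (hypothetical space)."""
    return {
        "lr": 10 ** rng.uniform(-4, -1),          # log-uniform learning rate
        "n_hidden": rng.choice([128, 256, 512]),  # hidden-layer width
        "dropout": rng.uniform(0.0, 0.5),
    }

def random_search(full_data, n_trials=50, subsample_frac=0.1, seed=0):
    rng = random.Random(seed)
    # Search on a small random subsample so each trial is cheap.
    k = max(1, int(len(full_data) * subsample_frac))
    subsample = rng.sample(full_data, k)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        score = train_and_score(subsample, params)
        if score > best_score:
            best_params, best_score = params, score
    # Final fit: reuse the best configuration on the full training set.
    final_score = train_and_score(full_data, best_params)
    return best_params, final_score

best_params, final_score = random_search(list(range(1000)))
print(best_params)
```

The same loop structure applies if you swap the inner search for a library such as hyperopt; only the objective function and search-space definition change.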

answered Apr 29 '14 at 18:28


bandini

edited Apr 29 '14 at 18:30


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.