Revision history
Revision n. 1

Jun 30 '10 at 18:07


Ian Goodfellow

How do you choose layers to stack on?

This applies to any deep network model where you're using layer-by-layer pretraining for some kind of MLP. Each layer has multiple hyperparameters, but measures like validation set classification performance are only available once you've constructed the whole model. If you want to try k different hyperparameter settings per layer and the network has depth D, you would end up having to train k^D networks, which is far too expensive. I can think of a few ways around this, but I'm curious which ones other people are using in practice:

  • Randomly sample k sets of hyperparameters for the whole network, and train those k networks.
  • Randomly sample k sets of hyperparameters for layer N, train layer N, and before building layer N+1, pick the single best set of hyperparameters for layer N. Criteria for choosing the best could be validation set error of an MLP trained on only the layers built so far, or some measure of the invariance properties of the features learned by the layer.
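The second strategy above (fix the best hyperparameters for layer N before building layer N+1) can be sketched as follows. This is a minimal illustration, not a working pretraining pipeline: `train_layer`, `validation_score`, and `sample_hparams` are hypothetical stand-ins for pretraining a layer (e.g. an RBM or autoencoder) and scoring the partial stack on held-out data. The point is the search structure, which trains only k·D candidate layers instead of k^D full networks.

```python
import random

# Hypothetical stand-in: in practice this would pretrain one layer
# (e.g. an RBM or autoencoder) on the previous layer's outputs.
def train_layer(input_dim, hparams):
    return {"dim": hparams["hidden_units"], "hparams": hparams}

# Hypothetical stand-in: in practice, validation error of a shallow
# classifier trained on top of the partial stack, or an invariance measure.
def validation_score(stack):
    return -abs(sum(layer["hparams"]["hidden_units"] for layer in stack) - 1000)

def sample_hparams(rng):
    # Toy hyperparameter space for illustration only.
    return {"hidden_units": rng.choice([128, 256, 512]),
            "learning_rate": rng.choice([0.1, 0.01, 0.001])}

def greedy_layerwise_search(depth, k, input_dim, seed=0):
    """Greedily pick the best of k sampled hyperparameter sets per layer."""
    rng = random.Random(seed)
    stack, dim = [], input_dim
    for _ in range(depth):
        candidates = []
        for _ in range(k):
            hp = sample_hparams(rng)
            layer = train_layer(dim, hp)
            # Score the partial stack with this candidate layer on top.
            candidates.append((validation_score(stack + [layer]), layer))
        _, best_layer = max(candidates, key=lambda c: c[0])
        stack.append(best_layer)
        dim = best_layer["dim"]
    return stack

stack = greedy_layerwise_search(depth=3, k=5, input_dim=784)
print([layer["hparams"]["hidden_units"] for layer in stack])
```

For depth D = 3 and k = 5, this trains 15 candidate layers rather than 125 full networks; the trade-off is that a layer which looks best in isolation is not guaranteed to be the best foundation for the layers above it.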

Revision n. 2

Jul 05 '10 at 16:19


Joseph Turian


Revision n. 3

Jul 05 '10 at 16:20


Joseph Turian



User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.