Hi, I just finished watching Joseph Turian's talk about deep belief networks at the BIL conference: http://www.bilconference.com/videos/deep-learning-artificial-intelligence-joseph-turian/ and I have a question regarding the second trick:
From what I have read, Geoffrey Hinton uses RBMs to train one layer at a time, but I don't know much about Boltzmann machines yet. I have a well-working backpropagation implementation running on a multi-GPU system, and I was wondering whether I could use it to train a deep belief network's features one layer at a time, exactly as explained in the video. It seems similar to training an auto-associative network and just sliding it up as the layer level increases. I would train N layers of features, and at the end I would add a small fully connected three-layer network so it could decide which features to use for the corresponding output. Will this technique work, or do these principles only apply to Boltzmann machines? Thank you very much in advance.
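To make the idea concrete, here is a rough, self-contained NumPy sketch of the kind of layer-wise autoencoder training I have in mind; the layer sizes, learning rate, epoch count, and toy data are made-up placeholders, not anything from the talk.

```python
# Sketch: greedy layer-wise training of autoencoder layers with plain backprop.
# Each layer learns to reconstruct its input; its hidden activations then become
# the input of the next layer ("sliding" the autoencoder up the stack).
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def train_autoencoder_layer(X, n_hidden, lr=0.1, epochs=50):
    """Train one tied-weight autoencoder layer; return (W, b_h) and the encoded data."""
    n_visible = X.shape[1]
    W = rng.normal(0, 0.01, (n_visible, n_hidden))
    b_h = np.zeros(n_hidden)
    b_v = np.zeros(n_visible)
    for _ in range(epochs):
        h = sigmoid(X @ W + b_h)            # encode
        X_rec = sigmoid(h @ W.T + b_v)      # decode with tied weights
        # backprop of the squared reconstruction error
        d_rec = (X_rec - X) * X_rec * (1 - X_rec)
        d_h = (d_rec @ W) * h * (1 - h)
        W -= lr * (X.T @ d_h + d_rec.T @ h) / len(X)
        b_h -= lr * d_h.mean(axis=0)
        b_v -= lr * d_rec.mean(axis=0)
    return W, b_h, sigmoid(X @ W + b_h)

# Greedy layer-wise training over three feature layers on toy data.
X = rng.random((256, 64))
layer_sizes = [32, 16, 8]          # made-up sizes
features, stack = X, []
for n_hidden in layer_sizes:
    W, b, features = train_autoencoder_layer(features, n_hidden)
    stack.append((W, b))
# `features` would then feed the small fully connected classifier trained with backprop.
```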
Backpropagation can be used without modification to train both the unsupervised and supervised phases of stacked denoising autoencoders. The book-length paper Learning Deep Architectures for AI by Yoshua Bengio explains how this can be done. You can find example code in the Theano deep learning tutorial.
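For illustration only (this is not the Theano tutorial code), here is a minimal NumPy sketch of the denoising twist in the unsupervised phase: corrupt the input, but backpropagate the reconstruction error against the clean input. The masking-noise corruption, tied weights, and hyper-parameters are assumptions made for the sketch; the supervised phase is then ordinary backpropagation through the stacked encoders plus an output layer.

```python
# Sketch: one backprop step on a single denoising-autoencoder layer (tied weights).
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def denoising_step(X, W, b_h, b_v, lr=0.1, corruption=0.3):
    """One gradient step: encode a corrupted input, reconstruct the clean one."""
    X_noisy = X * (rng.random(X.shape) > corruption)   # randomly zero out inputs
    h = sigmoid(X_noisy @ W + b_h)
    X_rec = sigmoid(h @ W.T + b_v)
    d_rec = (X_rec - X) * X_rec * (1 - X_rec)          # error against the clean X
    d_h = (d_rec @ W) * h * (1 - h)
    W -= lr * (X_noisy.T @ d_h + d_rec.T @ h) / len(X)
    b_h -= lr * d_h.mean(axis=0)
    b_v -= lr * d_rec.mean(axis=0)
    return W, b_h, b_v

# usage on toy data with made-up sizes
X = rng.random((128, 64))
W = rng.normal(0, 0.01, (64, 32)); b_h = np.zeros(32); b_v = np.zeros(64)
for _ in range(50):
    W, b_h, b_v = denoising_step(X, W, b_h, b_v)
```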
Indeed. Stacked denoising autoencoders share some of the nice properties of deep belief networks. The major difference is that there is no natural way to sample from the model: it is more useful as an (un-/semi-)supervised pre-trained discriminative model, while DBNs can be used both generatively and discriminatively.
(Nov 28 '10 at 12:07)
ogrisel
DBNs can be 'fine-tuned' using backpropagation as well. The RBMs are generally trained one at a time with an algorithm called contrastive divergence (or one of its variants); a rough CD-1 sketch is given below this comment. Every layer is trained on the output of the previously trained module. At the very end you can choose to train the system discriminatively by adding a layer of classification/regression nodes and interpreting the whole stack as a standard multilayer perceptron while doing backpropagation. When using autoencoders, you want to be sure they are not just copying the input, by choosing smart starting weights, adding regularization, or applying some form of noise to the inputs.
(Nov 28 '10 at 16:11)
Philemon Brakel
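For readers who have not seen it before, here is a rough, self-contained NumPy sketch of a single contrastive-divergence (CD-1) update on one binary RBM layer; the shapes, hyper-parameters, and sampling details are simplified assumptions rather than a reference implementation. The hidden probabilities of a trained layer would then serve as the input for the next RBM, and the finished stack can be fine-tuned with backpropagation as described above.

```python
# Sketch: one CD-1 update for a single binary RBM layer.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b_h, b_v, lr=0.05):
    """One contrastive-divergence (CD-1) update on a batch of visible vectors v0."""
    # positive phase: hidden probabilities and a sample given the data
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # negative phase: one Gibbs step back down and up again
    p_v1 = sigmoid(h0 @ W.T + b_v)
    p_h1 = sigmoid(p_v1 @ W + b_h)
    # parameter updates from the difference of the two phases
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(v0)
    b_h += lr * (p_h0 - p_h1).mean(axis=0)
    b_v += lr * (v0 - p_v1).mean(axis=0)
    return W, b_h, b_v

# usage on toy binary data with made-up sizes
v = (rng.random((128, 64)) > 0.5).astype(float)
W = rng.normal(0, 0.01, (64, 32)); b_h = np.zeros(32); b_v = np.zeros(64)
for _ in range(50):
    W, b_h, b_v = cd1_update(v, W, b_h, b_v)
```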