All or nearly all of the papers using dropout are using it for supervised learning. It seems that it could just as easily be used to regularize deep autoencoders, RBMs and DBNs. So why isn't dropout used in unsupervised learning?
Asked: Oct 29 '13 at 12:45
Seen: 1,215 times
Last updated: Oct 31 '13 at 04:43
I don't see any reason why dropout couldn't be used with these models. I suspect the reason we're mainly seeing the technique used in a supervised context is that some of these recent innovations (dropout, rectified linear units, Nesterov accelerated gradient, etc.) have made unsupervised pretraining somewhat obsolete when enough data is available: it doesn't help anymore, or at least not as much as it used to. The focus of deep learning research seems to have shifted more towards the supervised side over the last two years or so.
It can be used, and it helps learn better features as well. It's a bit like the sparsity penalties used in RBMs/autoencoders, but dropout is easier to implement and, I believe, a bit better. I haven't done extensive comparisons, though, and personally I use a different regularization technique.
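To make that concrete, here is a minimal sketch (my own toy example, not from any particular paper) of dropout applied to the hidden layer of a tied-weight autoencoder in plain numpy. The layer sizes, drop probability and learning rate are illustrative assumptions; at test time you would run the encoder without the mask, since the "inverted" scaling is already applied during training.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

n_vis, n_hid, p_drop, lr = 784, 256, 0.5, 0.1      # assumed hyperparameters
W = rng.normal(0.0, 0.01, size=(n_vis, n_hid))
b_h, b_v = np.zeros(n_hid), np.zeros(n_vis)

def train_step(x):
    """One SGD step on a batch x of shape (batch, n_vis), values in [0, 1]."""
    global W, b_h, b_v
    h = sigmoid(x @ W + b_h)                        # encoder
    mask = rng.random(h.shape) > p_drop             # drop each hidden unit w.p. p_drop
    h_d = h * mask / (1.0 - p_drop)                 # inverted dropout keeps the expected scale
    r = sigmoid(h_d @ W.T + b_v)                    # decoder with tied weights

    d_r = (r - x) * r * (1.0 - r)                   # grad of 0.5*||r - x||^2 wrt decoder pre-activation
    d_h = (d_r @ W) * mask / (1.0 - p_drop) * h * (1.0 - h)  # backprop through the same mask
    grad_W = d_r.T @ h_d + x.T @ d_h                # decoder part + encoder part (tied weights)

    W   -= lr * grad_W / len(x)
    b_v -= lr * d_r.sum(axis=0) / len(x)
    b_h -= lr * d_h.sum(axis=0) / len(x)
    return float(((r - x) ** 2).mean())

# toy usage: one step on a random binary batch
x = (rng.random((32, n_vis)) > 0.7).astype(float)
print(train_step(x))
```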
Most papers using dropout have done dropout finetuning of non-dropout pretrained DBNs/DBMs, however. I'd imagine you might be able to eke out a little more performance if you incorporated dropout into your pretraining, but the dropout finetuning probably negates much of the advantage it would have.
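For what it's worth, a rough sketch of that usual recipe (dropout only at finetuning time, on top of features pretrained without it) might look like this in numpy; the encoder weights below are randomly initialized stand-ins for whatever the pretraining produced, and all sizes and probabilities are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

n_vis, n_hid, n_classes, p_drop = 784, 256, 10, 0.5
W_pre = rng.normal(0.0, 0.01, size=(n_vis, n_hid))   # stand-in for pretrained (no-dropout) weights
b_pre = np.zeros(n_hid)
V = rng.normal(0.0, 0.01, size=(n_hid, n_classes))   # classifier layer added for finetuning
b_c = np.zeros(n_classes)

def forward(x, train=True):
    """Class probabilities; dropout hits the hidden layer only while training."""
    h = sigmoid(x @ W_pre + b_pre)                    # features learned without dropout
    if train:
        h = h * (rng.random(h.shape) > p_drop) / (1.0 - p_drop)
    logits = h @ V + b_c
    logits -= logits.max(axis=1, keepdims=True)       # numerically stable softmax
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

probs = forward(rng.random((4, n_vis)))               # toy batch
print(probs.shape, probs.sum(axis=1))                 # (4, 10), each row sums to 1
```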
Denoising auto-encoders are very similar to dropout networks: http://deeplearning.net/tutorial/dA.html
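As a quick illustration of that similarity (my own toy example, probabilities assumed): masking-noise corruption in a denoising autoencoder and dropout on the input layer both zero out a random subset of inputs at every update; the main difference is what the network is then asked to do with them.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.random(10)                  # one toy input vector
p = 0.3                             # corruption / drop probability (assumed)
keep = rng.random(x.shape) >= p     # the same kind of random mask in both cases

x_dae = x * keep                    # denoising AE: masking noise, no rescaling,
                                    # network is trained to reconstruct the *clean* x
x_drop = x * keep / (1.0 - p)       # inverted dropout on the input layer:
                                    # zero and rescale, regularizes whatever loss follows
print(x_dae)
print(x_drop)
```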