All or nearly all of the papers using dropout are using it for supervised learning. It seems that it could just as easily be used to regularize deep autoencoders, RBMs and DBNs. So why isn't dropout used in unsupervised learning?

asked Oct 29 '13 at 12:45


Max


I don't see any reason why dropout couldn't be used with these models. I suspect the reason we're mainly seeing the technique used in a supervised context is that some of these recent innovations (dropout, rectified linear units, Nesterov's accelerated gradient, etc.) have made unsupervised pretraining somewhat obsolete when enough data is available (it doesn't help anymore, or not as much as it used to). The focus of deep learning research seems to have shifted a bit more towards the supervised learning side over the last two years or so.

(Oct 29 '13 at 18:29) Sander Dieleman

It can be used, and it helps learn better features as well. It's a bit like the sparsity penalties used in RBMs/autoencoders, but dropout is easier to implement and, I believe, a bit better; I haven't done extensive comparisons, though, and personally I use a different regularization technique.

Most papers using dropout have done dropout finetuning of DBNs/DBMs that were pre-trained without dropout, however. I'd imagine you might be able to eke out a little more performance by incorporating dropout into the pretraining as well, but the dropout finetuning probably negates a lot of the advantage it would have.
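For concreteness, here is a minimal NumPy sketch of what "dropout during pretraining" could look like for a one-hidden-layer autoencoder trained on reconstruction error, with inverted dropout on the hidden layer. The layer sizes, keep probability and learning rate are arbitrary illustrative choices, not taken from any of the papers discussed here:

    import numpy as np

    rng = np.random.default_rng(0)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    def pretrain_step(X, W1, b1, W2, b2, keep=0.8, lr=0.1):
        """One gradient step on the mean squared reconstruction error of a
        one-hidden-layer autoencoder, with inverted dropout on the hidden layer."""
        n = X.shape[0]
        H = sigmoid(X @ W1 + b1)                           # encoder
        M = rng.binomial(1, keep, size=H.shape) / keep     # inverted dropout mask
        Hd = H * M
        Xhat = sigmoid(Hd @ W2 + b2)                       # decoder

        # Backprop through the sigmoid output and the dropout mask.
        dZ2 = (Xhat - X) * Xhat * (1.0 - Xhat) / n
        gW2, gb2 = Hd.T @ dZ2, dZ2.sum(axis=0)
        dZ1 = (dZ2 @ W2.T) * M * H * (1.0 - H)
        gW1, gb1 = X.T @ dZ1, dZ1.sum(axis=0)

        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
        return np.mean((Xhat - X) ** 2)

    # Toy usage: 20-dimensional inputs, 10 hidden units, random data.
    d, h = 20, 10
    W1 = rng.standard_normal((d, h)) * 0.1
    W2 = rng.standard_normal((h, d)) * 0.1
    b1, b2 = np.zeros(h), np.zeros(d)
    X = rng.random((64, d))
    for _ in range(100):
        loss = pretrain_step(X, W1, b1, W2, b2)
    print("reconstruction MSE:", loss)

Dropout finetuning of a network pre-trained without dropout would instead apply the mask only during the later supervised phase, leaving the unsupervised objective untouched.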

(Oct 29 '13 at 20:05) Newmu

Denoising auto-encoders are very similar to dropout networks: http://deeplearning.net/tutorial/dA.html
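To illustrate the similarity: masking noise in a denoising autoencoder and dropout are both multiplicative Bernoulli masks, applied to the input in one case and to the hidden activations in the other. A rough sketch, where the function names and the 0.5 corruption/drop rates are illustrative rather than taken from the linked tutorial:

    import numpy as np

    rng = np.random.default_rng(0)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    def dae_encode(x, W, b, corruption=0.5):
        # Denoising autoencoder: Bernoulli mask applied to the *input*.
        x_tilde = x * rng.binomial(1, 1.0 - corruption, size=x.shape)
        return sigmoid(x_tilde @ W + b)

    def dropout_encode(x, W, b, drop=0.5):
        # Dropout: Bernoulli mask applied to the *hidden* activations.
        h = sigmoid(x @ W + b)
        return h * rng.binomial(1, 1.0 - drop, size=h.shape) / (1.0 - drop)

    x = rng.random((4, 8))                         # toy minibatch
    W, b = rng.standard_normal((8, 5)) * 0.1, np.zeros(5)
    print(dae_encode(x, W, b).shape, dropout_encode(x, W, b).shape)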

(Oct 31 '13 at 04:43) alfa
