|
Let's say I use an RBM or an autoencoder for unsupervised pretraining of a neural net and then use the learned weights to initialize the net for supervised training (as in the examples from the LISA lab Deep Learning tutorials). But I use different activations, e.g. sigmoids during pretraining and tanh or ReLU during supervised training. Do I need to scale the weights, and if so, by how much? Or must I use the same activations in both phases?
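
To make the setup concrete, here is a minimal NumPy sketch of what I mean. The shapes, initial values, and variable names are just placeholders (in the LISA tutorials the weights would come from the trained RBM/autoencoder, not random arrays):

    import numpy as np

    # Hypothetical weights/biases learned by the sigmoid RBM or autoencoder
    # during unsupervised pretraining (placeholder values, not tutorial code)
    W_pre = np.random.randn(784, 500) * 0.01
    b_pre = np.zeros(500)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def relu(x):
        return np.maximum(0.0, x)

    x = np.random.rand(10, 784)               # small batch of inputs

    h_pretrain = sigmoid(x @ W_pre + b_pre)   # hidden activations as seen during pretraining
    h_finetune = relu(x @ W_pre + b_pre)      # same weights, but the activation I want for fine-tuning

    # Question: should W_pre / b_pre be rescaled (and by what factor) before the
    # supervised phase, or do the pretraining and fine-tuning activations have to match?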