|
I have just started using the autoencoder package in R: http://cran.r-project.org/web/packages/autoencoder/index.html. Inputs to the autoencode() function include lambda, beta, rho and epsilon. What are the bounds for these values? Do they vary with the activation function? Are these parameters called "hyperparameters"? Assuming a sparse autoencoder, is rho=.01 a good choice for the logistic activation function and rho=-.9 a good choice for the hyperbolic tangent activation function? Why does the manual set epsilon to .001? If I remember correctly, "Efficient BackProp" by LeCun et al. recommends starting values that are not so close to zero. How much does a "good" value for beta matter? Is there a "rule of thumb" for choosing the number of neurons in the hidden layer? For example, if the input layer has N nodes, is it reasonable to have 2N neurons in the hidden layer? Can you recommend some literature on the practical use of autoencoders? |