I am trying to implement a Linear-NReLU RBM prototype in C#. The visible units have linear activation functions; the hidden units are NReLUs with Gaussian noise.
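For concreteness, by NReLU I mean the noisy rectified linear unit of Nair & Hinton (2010): add zero-mean Gaussian noise with variance sigmoid(x) to the pre-activation x, then rectify. A minimal NumPy sketch (my actual code is C#; the function name is just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nrelu_sample(x, rng):
    """NReLU: rectify after adding Gaussian noise whose
    variance is sigmoid(x) (Nair & Hinton, 2010)."""
    noise = rng.normal(0.0, np.sqrt(sigmoid(x)))
    return np.maximum(0.0, x + noise)

x = np.array([-2.0, 0.0, 3.0])   # example pre-activations
h = nrelu_sample(x, rng)         # strongly negative inputs stay at zero
```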
I've decided to start experimenting with the MNIST dataset. During training I looked at the reconstructed images (v -> h -> v) and realized that they contain a gray background, with different intensity in different images:
The WEIGHTS (not features!) look strange too (showing the first 100 of 500 hidden neurons):
The histogram of weights and biases is centered around zero and looks normal. I previously implemented a binary-binary RBM and this artifact did not appear there, so it's a 'feature' of my Linear-NReLU RBM implementation. I have no idea where to go from here. Is it a bug? Any ideas for debugging? Is there enough information to guess what's wrong, or do you need something more? Thanks, and sorry for my bad English! Vyacheslav

Added later: with hidden biases = 2.5f, the weights completely lose sparsity (showing all 500 of 500 hidden neurons),
but the reconstruction error drops by about 30%.
I guess this is the wrong way to go.
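For reference, here is how I understand the v -> h -> v reconstruction being measured: sample the NReLU hiddens, then take the linear visibles' mean as the reconstruction. This is a sketch under my assumptions (NumPy instead of C#; `W`, `b_vis`, `b_hid` are hypothetical names, small random weight init):

```python
import numpy as np

rng = np.random.default_rng(1)

n_vis, n_hid = 784, 500
W = rng.normal(0.0, 0.01, size=(n_vis, n_hid))  # small random init (assumed)
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruct(v):
    """One v -> h -> v pass: sampled NReLU hiddens, linear visible mean."""
    x = v @ W + b_hid
    h = np.maximum(0.0, x + rng.normal(0.0, np.sqrt(sigmoid(x))))
    return h @ W.T + b_vis   # linear visibles: reconstruction is the mean

v = rng.normal(0.0, 1.0, size=n_vis)          # normalized MNIST-like input
err = np.mean((v - reconstruct(v)) ** 2)      # per-pixel squared error
```

Note that with linear visibles the reconstruction is `h @ W.T + b_vis`, so any constant offset in the hidden activations shows up directly as a uniform shift of the reconstructed image, which is consistent with a gray background.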
Those hidden features look fine (sparser than I'd expect, though, which some people would consider good) but have too much noisiness/speckling. My guess is that some of your NReLUs are getting stuck at always zero and never training. Try initializing the biases to a small positive value to fix this. Alternatively, in a few videos Hinton has said that people often use too high a learning rate for NReLUs in RBMs, so try lowering it by a factor of 10; I think 0.001 usually works. If the learning rate is too high, the units can get stuck positive or negative in a similar fashion, creating speckling.

Thanks for your reply, Newmu. I've found misinformation in my original post: the image shows WEIGHTS, not features (100 hidden neurons out of 500, with 784 weights on each neuron). My mistake, sorry. I also forgot to mention that the MNIST dataset was normalized to mean = 0, variance = 1 before training. I am using LR = 0.00002f; larger values result in instabilities (the reconstruction error rises during training). I've tried playing with the hidden biases, as you advise. By default I was using zero hidden biases (Hinton: A Practical Guide to Training Restricted Boltzmann Machines, section 8.1). Setting the hidden biases to a small positive value gives me nothing, but setting them to a large value (2.5f) dramatically reduces the reconstruction error; unfortunately the sparsity of the weights is completely lost. That looks like the wrong way to go (not sure). I've already played with biases, weights, the Gaussian noise (from zero to sigmoid), the learning rate, and the number of steps in the contrastive divergence algorithm (1 and 10). Nothing helps (except very high hidden biases). Maybe the MNIST dataset is just not suitable for modeling with a Linear-NReLU RBM? Can anyone who has this type of RBM reproduce these troubles? That would let me make sure there is no implementation error.
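In case anyone wants to compare against their own code: a minimal single-example CD-1 update for a Gaussian-visible / NReLU-hidden RBM with unit-variance visibles, sketched in NumPy (the actual implementation is C#; `cd1_step` and the sampling choices here are my assumptions, not necessarily what the poster's code does):

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nrelu(x):
    """NReLU sample: rectified pre-activation plus sigmoid-variance noise."""
    return np.maximum(0.0, x + rng.normal(0.0, np.sqrt(sigmoid(x))))

def cd1_step(v0, W, b_vis, b_hid, lr=2e-5):
    """One CD-1 update, assuming unit-variance Gaussian visibles."""
    h0 = nrelu(v0 @ W + b_hid)                                    # positive phase
    v1 = (h0 @ W.T + b_vis) + rng.normal(0.0, 1.0, size=v0.shape) # Gaussian visible sample
    h1 = nrelu(v1 @ W + b_hid)                                    # negative phase
    W += lr * (np.outer(v0, h0) - np.outer(v1, h1))
    b_vis += lr * (v0 - v1)
    b_hid += lr * (h0 - h1)
    return W, b_vis, b_hid

n_vis, n_hid = 784, 500
W = rng.normal(0.0, 0.01, size=(n_vis, n_hid))
b_vis, b_hid = np.zeros(n_vis), np.zeros(n_hid)
v0 = rng.normal(0.0, 1.0, size=n_vis)   # one normalized MNIST-like example
W, b_vis, b_hid = cd1_step(v0, W, b_vis, b_hid)
```

The small default learning rate matches the 0.00002f mentioned above; with Gaussian visibles the gradients are much larger than in the binary-binary case, which is why Hinton's guide recommends reducing the learning rate by one or two orders of magnitude.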
(Jul 24 '13 at 06:55)
VAvd





