I am implementing Bernoulli-Bernoulli RBM and trying to debug it. I want RBM to learn uniform distribution over two binary variables. So RBM has 2 visible and 4 hidden units. It seem that RBM will end up with zero weight matrix, zero bias vector for both visible and hidden units. I use learning rate = 0.1, contrastive-divergence k = 3000 and full batch update. As this is a toy task I can compute marginal distribution over visible units. So after random initialization from normal distribution with mean = 0 and standart deviation = 0.01 I have: (b for visible bias, c for hidden bias)
After 1000 iterations
After 10000 iterations
I used neither weight decay nor momentum. So my question is: why does this happen? Is it a bug in my implementation, or can this behavior be explained somehow?
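For concreteness, the training loop can be sketched like this (a minimal NumPy version of CD-k for a Bernoulli-Bernoulli RBM; the function name `cd_k` and the exact sampling details are illustrative, not my exact code):

```python
import numpy as np

rng = np.random.default_rng(0)

n_vis, n_hid = 2, 4
W = rng.normal(0.0, 0.01, size=(n_vis, n_hid))  # weights
b = np.zeros(n_vis)                             # visible bias
c = np.zeros(n_hid)                             # hidden bias

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Target: uniform distribution over two binary variables,
# i.e. all four visible configurations equally often (full batch).
data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

def cd_k(v0, k, lr=0.1):
    """One full-batch CD-k update (illustrative sketch)."""
    global W, b, c
    ph0 = sigmoid(v0 @ W + c)          # positive-phase hidden probs
    v, ph = v0, ph0
    for _ in range(k):                 # run the Gibbs chain k steps
        h = (rng.random(ph.shape) < ph).astype(float)   # sample hiddens
        pv = sigmoid(h @ W.T + b)
        v = (rng.random(pv.shape) < pv).astype(float)   # sample visibles
        ph = sigmoid(v @ W + c)
    # data statistics minus chain-end (model) statistics
    W += lr * (v0.T @ ph0 - v.T @ ph) / len(v0)
    b += lr * (v0 - v).mean(axis=0)
    c += lr * (ph0 - ph).mean(axis=0)
```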
Why must the weights and biases be zero if you aren't using weight decay? There are many settings of the RBM parameters that implement a uniform distribution. Also, are you using CD-1? Remember that you aren't doing maximum-likelihood training, and CD doesn't really give you the true gradient.
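In fact, you can check by exact enumeration that the all-zero parameter setting already implements the uniform distribution over the visibles: with W = 0, every hidden unit contributes the same factor (1 + e^0) to every visible configuration, so nothing breaks the symmetry. A small NumPy sketch (the helper `visible_marginal` is my own illustration, not from your code):

```python
import itertools
import numpy as np

def visible_marginal(W, b, c):
    """Exact p(v) for a Bernoulli-Bernoulli RBM: enumerate the visible
    states and sum out the hidden units analytically,
    p(v) proportional to exp(b.v) * prod_j (1 + exp(c_j + v.W[:, j]))."""
    n_vis = len(b)
    vs = np.array(list(itertools.product([0, 1], repeat=n_vis)), dtype=float)
    # unnormalized log-probability via softplus
    logp = vs @ b + np.log1p(np.exp(c + vs @ W)).sum(axis=1)
    p = np.exp(logp - logp.max())
    return vs, p / p.sum()

# your converged solution: everything zero
W = np.zeros((2, 4))
b = np.zeros(2)
c = np.zeros(4)
vs, p = visible_marginal(W, b, c)
# every one of the four visible configurations gets probability 1/4
```

So converging to zeros is consistent with having learned the uniform target, not evidence of a bug.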