I'm having some trouble implementing a Gaussian-rectified RBM. Below is how I would implement it; I'd appreciate it if you could point out errors or comment on the implementation.

These are the key parts in how I would implement this RBM (in a Matlab-like fashion):


hiddenUnitActivation:

pre = v * W + hbias;
h = max(0, pre);

hiddenUnitSample:

pre = v * W + hbias;
h = max(0, pre + randn(size(pre)) .* sigmoid(pre));

visibleUnitActivation:

v = h * W' + vbias;

visibleUnitSample:

pre = h * W' + vbias;
v = pre + randn(size(pre)) * sigma;  % which sigma to use here (???)

CD-1 Learning:

posVisProbs = batch;
posHidProbs = hiddenUnitActivation(posVisProbs);
posHidStates = hiddenUnitSample(posVisProbs);

negVisProbs = visibleUnitActivation(posHidStates);
negHidProbs = hiddenUnitActivation(negVisProbs);

Weight Updates:

dW = (posVisProbs' * posHidProbs - negVisProbs' * negHidProbs) / batchSize;
dhbias = mean(posHidProbs) - mean(negHidProbs);
dvbias = mean(posVisProbs) - mean(negVisProbs);

W += dW;
hbias += dhbias;
vbias += dvbias;
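The steps above can be collected into a single CD-1 update. The following is only a minimal sketch in plain Python (standard library only), not a reference implementation. Several things in it are assumptions that do not appear in the pseudocode: the learning rate `lr`, the helper names (`affine`, `relu`, `col_mean`, `cd1_step`), the row-per-example batch layout, and the choice to treat `sigmoid(pre)` as the noise variance (so the standard deviation is its square root) when sampling the hidden states.

```python
import math
import random

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def matmul(a, b):
    # Product of two matrices stored as lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def affine(x, W, bias):
    # x * W + bias, with the bias added to every row.
    return [[s + b for s, b in zip(row, bias)] for row in matmul(x, W)]

def relu(m):
    return [[max(0.0, x) for x in row] for row in m]

def col_mean(m):
    # Mean over the batch (rows) for every unit (column).
    return [sum(col) / len(m) for col in zip(*m)]

def cd1_step(batch, W, hbias, vbias, lr=0.01, rng=random):
    """One CD-1 update; batch is a list of rows, one row per example."""
    n = len(batch)
    pos_pre = affine(batch, W, hbias)
    pos_hid = relu(pos_pre)
    # Assumption: noise variance is sigmoid(pre), hence std = sqrt(sigmoid(pre)).
    pos_states = [[max(0.0, p + rng.gauss(0.0, math.sqrt(sigmoid(p)))) for p in row]
                  for row in pos_pre]
    # Noise-free Gaussian reconstruction, then noise-free hidden activations.
    neg_vis = affine(pos_states, transpose(W), vbias)
    neg_hid = relu(affine(neg_vis, W, hbias))
    # Correlations have shape (numVisible x numHidden), hence the transposes.
    pos_corr = matmul(transpose(batch), pos_hid)
    neg_corr = matmul(transpose(neg_vis), neg_hid)
    W_new = [[w + lr * (p - q) / n for w, p, q in zip(wr, pr, qr)]
             for wr, pr, qr in zip(W, pos_corr, neg_corr)]
    hbias_new = [b + lr * (p - q)
                 for b, p, q in zip(hbias, col_mean(pos_hid), col_mean(neg_hid))]
    vbias_new = [b + lr * (p - q)
                 for b, p, q in zip(vbias, col_mean(batch), col_mean(neg_vis))]
    return W_new, hbias_new, vbias_new
```

A call such as `cd1_step(batch, W, hbias, vbias, lr=0.001, rng=random.Random(0))` returns the updated parameters without modifying the inputs. With unbounded rectified hidden units, a small `lr` is likely to matter, since nothing else in the update limits the size of the activations.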

Most of all, I wonder:

  • if I have to normalize the data
  • if I need to consider the variance/standard deviation of the data in any way
  • if the gaussian visible units are missing some variance term in the computation of some variables (the same as the data variance?)
  • what changes when I switch to a rectified-rectified or rectified-binary RBM
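On the first three points, one commonly cited recommendation (for example in Hinton's practical guide to training RBMs) is to standardize each input dimension to zero mean and unit variance, fix the visible standard deviation to 1, and use noise-free reconstructions. The sketch below shows only the standardization step, in plain Python; the function name `standardize` and the `eps` guard against constant features are assumptions made for illustration.

```python
import math

def standardize(data, eps=1e-8):
    """Scale every feature (column) to zero mean and unit variance.

    data is a list of rows, one row per example. Returns the scaled data
    together with the per-feature means and standard deviations, so that
    reconstructions can be mapped back to the original units.
    """
    n = len(data)
    cols = list(zip(*data))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n)
            for c, m in zip(cols, means)]
    # eps avoids division by zero for features that never vary.
    scaled = [[(x - m) / (s + eps) for x, m, s in zip(row, means, stds)]
              for row in data]
    return scaled, means, stds
```

With data prepared this way, `sigma` in visibleUnitSample would simply be 1, and no separate variance term would need to appear in the weight updates. A reconstruction `v` can be mapped back to the original scale with `v * std + mean` per feature.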

I implemented it that way, and it seems to work well for dimensionality reduction/feature learning, but only when I don't sample at all, i.e. when using this

hiddenUnitSample:

pre = v * W + hbias;
h = max(0, pre);

instead of this

pre = v * W + hbias;
h = max(0, pre + randn(size(pre)) .* sigmoid(pre));

The noise term in the rectified-unit sampling formula seems to be way too big for accurate reconstruction.
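For reference, in the noisy rectified linear unit described by Nair and Hinton, `sigmoid(pre)` is the variance of the added Gaussian noise, so the standard deviation is its square root. The small sketch below compares the two scalings; the function names are made up for illustration. Since the sigmoid lies strictly between 0 and 1, the square root is the larger of the two, so this difference alone would not shrink the noise. In both forms the standard deviation stays below 1, which suggests that whether the noise dominates depends mostly on how large the pre-activations are.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def noise_std_as_posted(pre):
    # Scaling used in the pseudocode: unit noise multiplied by sigmoid(pre).
    return sigmoid(pre)

def noise_std_from_variance(pre):
    # Scaling when sigmoid(pre) is read as the noise variance.
    return math.sqrt(sigmoid(pre))

if __name__ == "__main__":
    for pre in (-2.0, 0.0, 2.0, 10.0):
        print(pre, noise_std_as_posted(pre), noise_std_from_variance(pre))
```

If the hidden pre-activations are mostly of order 1 or smaller, noise with a standard deviation close to 1 is comparable in size to the signal itself, which may be one reason the sampled states reconstruct poorly.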

I also tried adding labels to the topmost RBM to enable some kind of "selective generation". When I clamp the label neuron and cycle in the top RBM, the inference of the missing modality is unstable and blows up to practically ±infinity.

This does not happen when the hidden units are bounded by a sigmoid function like in gaussian-binary or rectified-binary RBMs. Any ideas how to solve this?

asked Jun 17 '14 at 07:55

SoufianJ

edited Jun 23 '14 at 04:09

