Hi,

I have been trying to understand Dr. Lee's original code for convolutional RBMs (see here).

What is a bit confusing is what is done during inference; see below for the code of the inference function.

In the code, imdata is just a minibatch of images, W and hbias_vec are the Conv-RBM weights and hidden biases, and pars is a MATLAB structure holding the parameters of the network.

I have trouble understanding why pars.std_gaussian is used the way it is below (the 1/(pars.std_gaussian^2) factor in the second loop). pars.std_gaussian starts at 0.2 and drops to 0.1 over about 70 epochs, so it sort of 'magnifies' the effect of the positive activities reaching the hidden units, by up to 100 times (1/0.1^2).

If we had Gaussian visible units, this would correspond to a sort of 'unlearned' (fixed) variance for the Gaussians, right?

I am working with MNIST, which has been shown to be learnable with binary units. So if I want to use this code with binary units, would it be sensible to set pars.std_gaussian to 1?

However, in that case I am unable to get stroke detectors even after 1000 epochs of training, so the role of std_gaussian in magnifying the activities seems to be important. I am really confused about how to justify this theoretically. Can anyone help?

Thanks


function [poshidexp2] = tirbm_inference(imdata, W, hbias_vec, pars)
% Positive-phase inference for the convolutional RBM.
ws = sqrt(size(W,1)); numbases = size(W,3); numchannel = size(W,2);

% note: poshidprobs2 is computed below but only poshidexp2 is returned
poshidprobs2 = zeros(size(imdata,1)-ws+1, size(imdata,2)-ws+1, numbases);
poshidexp2   = zeros(size(imdata,1)-ws+1, size(imdata,2)-ws+1, numbases);

% Accumulate the 'valid' convolution of each input channel with its
% (flipped) filters to get the bottom-up input to every hidden unit.
for c = 1:numchannel
    H = reshape(W(end:-1:1, c, :), [ws, ws, numbases]);
    poshidexp2 = poshidexp2 + conv2_mult(imdata(:,:,c), H, 'valid');
end

for b = 1:numbases
    % The part I am asking about: scale (bottom-up input + bias) by 1/std_gaussian^2
    poshidexp2(:,:,b) = 1/(pars.std_gaussian^2) .* (poshidexp2(:,:,b) + hbias_vec(b));
    poshidprobs2(:,:,b) = 1 ./ (1 + exp(-poshidexp2(:,:,b)));
end
return
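
For reference, here is how I currently read the 1/sigma^2 factor, assuming the usual Gaussian-visible energy (as in the CDBN paper); I may be misreading the exact parameterization:

E(v,h) = \frac{1}{2\sigma^2}\sum_i v_i^2 - \frac{1}{\sigma^2}\Big(\sum_{i,j} v_i W_{ij} h_j + \sum_j b_j h_j + \sum_i c_i v_i\Big)

which gives

P(h_j = 1 \mid v) = \mathrm{sigmoid}\Big(\frac{1}{\sigma^2}\big(\sum_i W_{ij} v_i + b_j\big)\Big)

i.e. exactly the 1/(pars.std_gaussian^2) * (conv + hbias) computation above, and it reduces to the ordinary binary-unit sigmoid when sigma = 1.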


asked Feb 25 '14 at 22:47

AS1


3 Answers:

Perhaps poshidprobs2 should be used for MNIST, which is approximately binary. Also, whitening may not be needed. Can you try it and let us see the learnt weights?
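
Something like this rough sketch is what I have in mind (not from the original code: tirbm_inference_binary is just an illustrative name, conv2_mult is the helper from Lee's code, std_gaussian is effectively fixed to 1, and the sigmoid probabilities are returned instead of the pre-sigmoid activations):

% Illustrative sketch only, not part of the original code.
function [poshidprobs2] = tirbm_inference_binary(imdata, W, hbias_vec)
ws = sqrt(size(W,1)); numbases = size(W,3); numchannel = size(W,2);
poshidexp2 = zeros(size(imdata,1)-ws+1, size(imdata,2)-ws+1, numbases);
for c = 1:numchannel
    H = reshape(W(end:-1:1, c, :), [ws, ws, numbases]);   % flipped filters, as in the original
    poshidexp2 = poshidexp2 + conv2_mult(imdata(:,:,c), H, 'valid');
end
poshidprobs2 = zeros(size(poshidexp2));
for b = 1:numbases
    % no 1/sigma^2 scaling (sigma = 1): plain sigmoid(conv + bias)
    poshidprobs2(:,:,b) = 1 ./ (1 + exp(-(poshidexp2(:,:,b) + hbias_vec(b))));
end
return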

answered Feb 26 '14 at 22:43

Ng0323

Hello, do you train the second-layer CRBM?

For the second layer, the visible units should be binary, so I modified the crbm_reconstruct function as follows:

function negdata = crbm_reconstruct(S, W, vbias_vec, pars)
% Reconstruct the visible layer from the hidden states S.
ws = sqrt(size(W,1)); patch_M = size(S,1); patch_N = size(S,2);
numchannels = size(W,2); numbases = size(W,3);

S2 = S;
negdata2 = zeros(patch_M+ws-1, patch_N+ws-1, numchannels);

% Top-down pass: 'full' convolution of each hidden group with its filter.
for b = 1:numbases
    H = reshape(W(:,:,b), [ws, ws, numchannels]);
    negdata2 = negdata2 + conv2_mult(S2(:,:,b), H, 'full');
end

% Binary visible units: squash (top-down input + bias) through a sigmoid.
for c = 1:numchannels
    negdata2(:,:,c) = 1/(pars.std_gaussian^2) .* (negdata2(:,:,c) + vbias_vec(c));
    negdata2(:,:,c) = 1 ./ (1 + exp(-negdata2(:,:,c)));
end
negdata = negdata2;
end

But I am not sure whether this is right or not?

answered Apr 16 '14 at 07:17

xue

I have tried it on MNIST keeping sigma_start and sigma_stop at 1 each, so as to nullify their effect in the inference function. This led to filters that probably weren't learnt very well after 1400 epochs (here). (learning rate = 0.00001, num_epochs = 1400)

If I increase the learning rate to 0.0001, I get filters that are clones of each other (here).

Removing whitening was a good suggestion, but when I did that, I got blobbish filters (here). (learning rate = 0.0001, num_epochs = 1400) This is strange.

However, keeping the whitening, as well as sigma_start = 0.2 and sigma_stop = 0.1, gave very good stroke detectors (here). (learning rate = 0.001, num_epochs = 1400)

Apart from the parameters mentioned here, num_bases = 40 and receptive_fields = 10x10; the rest are as in the original code.

So this gets weird. Would anyone be willing to try out these parameters on MNIST using this code? I want to be sure I didn't do anything odd while doing this.

In particular, just setting sigma_start and sigma_stop to 1.

answered Feb 27 '14 at 06:36

AS1

edited Feb 27 '14 at 07:12

Did you try function [poshidprobs2] = .... instead of [poshidexp2]? Similarly for the reconstruction function. I think MNIST should use a binary RBM, so both inference and reconstruction should have sigmoid activations.

(Feb 28 '14 at 06:56) Ng0323

The learning rate is low. Please try 0.05; within 500 epochs you can get some patterns. I got good features with natural images, though I didn't try MNIST.

(Feb 28 '14 at 21:09) Sharath Chandra