Hi,

I implemented the convolutional DBN using the base code from Honglak Lee's website.

However, I am unable to reproduce the results of his ICML 2009 paper, where the first layer learns edges, the second layer learns object parts, and the third layer represents whole objects.

I have two questions in this regard:

  1. When we say object parts are learned, is it the weights that learn these parts (i.e., does visualising the weights give rise to the different object parts?), or is it the layer-2 activations that give rise to the learned object parts? I understand from the paper that the weights themselves depict the learned object parts, but I wanted to confirm.

  2. I am able to get layer-1 edges quite similar to those shown in the paper. But when learning layer 2, the weights turn out to be a 3D array with multiple channels. So even though my weight shape is 10x10 and the number of bases is 40, I see many more weights when I visualise the W variable.

The pictures are attached:

layer 1

layer 2

My parameters are as follows:

Weight size (Nw): 12x12 and 10x10 for layers 1 and 2 respectively. Number of groups of weights (number of hidden bases, K): 24 and 40 respectively. Target sparsity: 3. Learning rate for the sparsity update: 3. Number of epochs: 1000 and 500 respectively. Spacing (C): 2 and 2 respectively.
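For reference, the parameters above written out per layer (an illustrative sketch; the key names describe the quantities in the post, not necessarily the variable names in the code):

```python
# Per-layer hyperparameters as described in the post (illustrative names).
layer_params = [
    dict(ws=12, num_bases=24, target_sparsity=3, sparsity_lr=3,
         epochs=1000, spacing=2),  # layer 1
    dict(ws=10, num_bases=40, target_sparsity=3, sparsity_lr=3,
         epochs=500, spacing=2),   # layer 2
]
```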

asked Feb 17 '14 at 21:05

Sharath Chandra

edited Feb 17 '14 at 21:07

  1. Yes, that's my understanding too: the weights (filters) act as feature detectors.
  2. I think you'll get 24*40 sets of weights in layer 2.

Which code are you using? I can't seem to find any sparsity parameters in the CRBM code (which you linked in another post).

(Feb 19 '14 at 23:11) Ng0323

It's the same code posted on the other page. 2. The weights have dimensions (weight_shape^2, num. of channels, num. of bases). So, given a weight shape of 10, 24 channels (in layer 2), and 40 bases, that makes 100x24x40!

The parameters for this code are: ws, num_bases, pbias, pbias_lb, pbias_lambda, spacing, epsilon, l2reg, batch_size. Here, pbias is the target sparsity, pbias_lambda is the learning rate for the sparsity update, and pbias_lb isn't used for anything.
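In NumPy terms, the shape bookkeeping being discussed works out like this (an illustrative sketch; the actual code is MATLAB, and the array names here are made up):

```python
import numpy as np

# Layer 1: filters see a single-channel (grayscale) image.
W1 = np.zeros((12 * 12, 1, 24))   # (ws^2, num_channels, num_bases)

# Layer 2: each filter spans all 24 layer-1 bases as input channels.
W2 = np.zeros((10 * 10, 24, 40))  # (ws^2, num_channels, num_bases)

# Visualising W2 naively therefore shows 24 * 40 = 960 single-channel
# slices, even though there are only 40 layer-2 bases.
print(W2.shape)                    # (100, 24, 40)
print(W2.shape[1] * W2.shape[2])   # 960
```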

(Feb 20 '14 at 01:17) Sharath Chandra

Check out the paper "Sparse deep belief net model for visual area V2" by Honglak Lee, C. Ekanadham, and Andrew Ng. I think the answer is in there.

(Feb 20 '14 at 02:49) Ng0323

I got the answer to the first question: the weights in the second layer have to be viewed in combination with the first-layer weights. It's working now.

Still looking for the second question's answer (the right parameters, at least for the Caltech 101 faces dataset).

(Feb 21 '14 at 01:21) Sharath Chandra

Hi, have you used the features learned by the CDBN for recognition? Because of the sparsity constraint, most of the hidden units are updated to 0. It seems irrational to use the 0 activations ('poshidprobs') as features. So what should we use as features for recognition?

Thanks in advance.

(Apr 21 '14 at 09:01) xue

3 Answers:

I have tried the code for training a convolutional RBM from Honglak Lee's homepage. The code trains a one-layer CRBM.

I attempted to use the activations (denoted 'poshidprobs' in the code) of the first-layer CRBM as input to train a second-layer CRBM.

However, I find that the values of 'poshidprobs' are almost all updated to 0 after a round of iteration.

I am confused about this problem.

I would very much appreciate your help in figuring it out, and in showing me how to train the second-layer CRBM.

Looking forward to your reply.

answered Apr 12 '14 at 10:13

xue

You can have a look at my answer on: http://metaoptimize.com/qa/questions/15359/convolutional-rbm-features#15364

Let me know if you have any issues later on.

(Apr 13 '14 at 01:44) Sharath Chandra

I have read the answer at the link. Maybe it is rational that the activations ('poshidprobs') of the CRBM are almost all updated to 0, because of the additional constraint imposed on the hidden units. I don't know whether my understanding is right. But I am still not certain whether it's right to use these 0 activations for training the next layer.

(Apr 13 '14 at 04:19) xue

You should actually be taking poshidstates. It goes like this:

1. You train a layer of weights.
2. You trim the inputs to get them suited for convolution.
3. You pass them through inference to get poshidexp.
4. You get poshidstates by passing poshidexp through tirbm_sample_multrand2.
5. These are your convolved layer-1 features.
6. Apply max pooling on them and you get your input for the next layer.

If you study the paper, the states are inferred from the probabilities, so it is the states that we need to be passing. Moreover, most of them will be 0 because of the constraint of probabilistic max-pooling: in a group of hidden units, we want at most 1 unit to be on.
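As a rough sketch of that sampling step (along the lines of what tirbm_sample_multrand2 does in the MATLAB code, though this is not a transcription of it): within each C x C pooling block, unit i turns on with probability exp(E_i) / (1 + sum_j exp(E_j)), so at most one unit per block is on.

```python
import numpy as np

def sample_pooled_states(poshidexp, C=2, rng=None):
    """Sample binary hidden states under probabilistic max-pooling:
    within each C x C block, at most one unit turns on.
    Assumes both dimensions of poshidexp are divisible by C."""
    rng = rng or np.random.default_rng(0)
    H, W = poshidexp.shape
    states = np.zeros((H, W))
    for i in range(0, H, C):
        for j in range(0, W, C):
            block = poshidexp[i:i + C, j:j + C]
            m = block.max()
            e = np.exp(block - m)   # shifted by the max for numerical stability
            off = np.exp(-m)        # weight of the "all units off" outcome
            probs = np.append(e.ravel(), off) / (e.sum() + off)
            k = rng.choice(len(probs), p=probs)
            if k < block.size:      # one unit on; the last index means "all off"
                states[i + k // C, j + k % C] = 1.0
    return states
```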

Hope this helps :)

(Apr 13 '14 at 06:12) Sharath Chandra

Thanks for the explanation. It helps me a lot. I will try following your advice.

(Apr 13 '14 at 07:50) xue

Using weight shape 10, 24 channels (in layer 2), and 40 bases, I get a weight W of dimension 100x24x40! When visualising W with the function "display_network", I cannot get the picture shown in the paper, which consists of 40 bases. Instead, I get a picture of 24*40 bases. I see you also ran into this problem and got it working! I checked out the paper "Sparse deep belief net model for visual area V2" mentioned by Ng0323, but I am still not clear how the weights in the second layer are viewed in combination with the first-layer weights to show the 40 bases.

(Apr 14 '14 at 02:55) xue

See this image on how to visualize layer 2 weights: https://www.flickr.com/photos/sharathchandra/13844120203/ Other methods to do this are mentioned in this paper: http://www.iro.umontreal.ca/~lisa/publications2/index.php/publications/show/247
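The combination being described can be sketched as follows (illustrative Python; it ignores the pooling/subsampling between the layers that the spacing parameter introduces, so a faithful visualization needs more care):

```python
import numpy as np

def conv2_full(a, b):
    """Plain full 2-D convolution, so the sketch needs only NumPy."""
    ha, wa = a.shape
    hb, wb = b.shape
    out = np.zeros((ha + hb - 1, wa + wb - 1))
    for i in range(hb):
        for j in range(wb):
            out[i:i + ha, j:j + wa] += b[i, j] * a
    return out

def visualize_layer2_basis(W1, W2_k):
    """Project one layer-2 basis into pixel space by summing, over the
    layer-1 channels, the convolution of each layer-1 filter with the
    matching channel slice of the layer-2 basis.
    W1:   (ws1, ws1, K1) layer-1 filters
    W2_k: (ws2, ws2, K1) one layer-2 basis."""
    ws1, _, K1 = W1.shape
    ws2 = W2_k.shape[0]
    out = np.zeros((ws1 + ws2 - 1, ws1 + ws2 - 1))
    for c in range(K1):
        out += conv2_full(W1[:, :, c], W2_k[:, :, c])
    return out
```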

It is very difficult to tune the parameters. I could not fully succeed in doing so, but I got them to some extent. You can read the journal version of the ICML CDBN paper for more details.

(Apr 14 '14 at 06:19) Sharath Chandra

Yeah, that is the paper in Comm. of the ACM. They gave the parameters in that paper. In layer 2 you get parts of faces. I could get a decent visualization, though not as smooth as in the paper. For layer-3 faces, I could get the overall picture, but the details of eyes, nose etc. were misplaced.

(Apr 15 '14 at 02:50) Sharath Chandra

Could you please send me a copy of your code, so that I can check what is wrong with my own? I want to make it work urgently. Or could I send my code to you and trouble you to check it for me?

Thanks in advance. My email is [email protected]

(Apr 15 '14 at 05:34) xue

I use the face images in the Multi-PIE dataset. Before training, I resize them so the longer side is 150. Target sparsity pbias = 0.002 for layer 1, 0.005 for layer 2. I am not sure how to set pbias_lambda, the learning rate for the sparsity update. Could you tell me how to set it? For now, I set it to 5 and 2 for layers 1 and 2. I still can't get parts-of-faces-like bases. Moreover, I find that the visualization of the layer-2 weights from my code looks like the layer-1 weights pasted onto the layer-2 weights: local parts of the layer-2 weights picture are just the same as the layer-1 weights picture. They don't show parts of faces.

(Apr 15 '14 at 05:48) xue

Can you show me some of the weights you are getting in layers 1 and 2 for different values of the parameters? You can share the pics on Flickr or any other site.

(Apr 15 '14 at 05:59) Sharath Chandra

Please re-upload the weight images. It says page not found.

(Apr 15 '14 at 11:15) Sharath Chandra

Could you open the link to the images: http://photo.blog.sina.com.cn/u/2745449403

I uploaded the code as images and put it there. I made the modifications numbered 1 to 4 for training the 2-layer CRBM using the ICML 09 code.

Or could you give me your email, so that I can send it to you?

(Apr 15 '14 at 21:14) xue

A few clarifications:

1. Layer 1 is trained on natural images.
2. Activations for layer 1 are obtained using the faces dataset.
3. You will have to play around with the parameters a little more. Make note of how the features change as you change the parameters, and keep a track record of it. This will give you a rough range for the dataset you are using. For the Caltech faces dataset, the parameters in the Communications of the ACM paper will work (of course, even there we still had to tune one parameter). I could get fairly good bases for layer 2, but layer 3 was tough. So even for your dataset you should at least be able to get layer-2 bases.
4. If you code the convolution according to the equation I showed you in the image, it will work. Please verify that both match.

(Apr 16 '14 at 01:07) Sharath Chandra

OK, thanks for the explanations. I just train layer 1 on the 10 images provided with the ICML 09 code. I can't reach the link in the paper (http://www.cnbc.cmu.edu/cplab/data kyoto.html) to get the whole Kyoto natural image dataset. Do you have that dataset?

Another problem is how to set the parameter "batch_ws" when I resize the input face image so that the longer side is 150.

rowidx = ceil(rand*(rows-2*ws-batch_ws)) + ws + [1:batch_ws];
colidx = ceil(rand*(cols-2*ws-batch_ws)) + ws + [1:batch_ws];
imdata_batch = imdata(rowidx, colidx);

The "imdata_batch" sampled from the original image of size (150/130) is used as input for training CRBM.

(Apr 16 '14 at 02:34) xue

When training the second layer, where do I have to modify the ICML 09 code? My results still seem wrong.

Could you attach the weights you learned? I want to see how they should look.

(Apr 16 '14 at 23:25) xue

I replied on your question's thread. Have a look.

(Apr 17 '14 at 01:25) Sharath Chandra

In my understanding, poshidprobs are mostly 0 because of weight regularization (dW_total2 in the code) or sparsity (dW_total3, but this is set to 0 when finding dW_total). So it's a sparsity thing. Max pooling just chooses which unit to turn on or off. The paper mentions that the activations are used as the input to the next layer (sec. 3.5). I interpret activations as poshidprobs, but I could be wrong.
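The sparsity mechanism being discussed can be sketched like this (illustrative Python; the variable names mirror the discussion, not the exact MATLAB implementation): the hidden biases are nudged so that each basis's mean activation approaches the target sparsity pbias, which is what drives most of poshidprobs toward 0.

```python
import numpy as np

pbias = 0.002        # target sparsity (mean activation we want per basis)
pbias_lambda = 5.0   # learning rate for the sparsity update

# Fake activations standing in for poshidprobs: H x W x num_bases.
poshidprobs = np.random.default_rng(0).random((20, 20, 24))
mean_act = poshidprobs.mean(axis=(0, 1))   # per-basis mean activation

# Sparsity update: since mean_act >> pbias here, every bias is pushed down,
# which lowers future activations toward the target.
hbias = np.zeros(24)
hbias += pbias_lambda * (pbias - mean_act)
```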

(May 13 '14 at 06:13) Ng0323

Here you go: http://ai.stanford.edu/~hllee/softwares/icml09.htm

answered Feb 21 '14 at 01:19


Sharath Chandra

Hi, I have been trying to obtain a verified implementation of convolutional RBMs for some time now. Could you please link to the base code?

answered Feb 21 '14 at 00:59


AS1
