Hi, I implemented the convolutional DBN using the base code from Honglak Lee's website. However, I am unable to reproduce the results of his ICML 2009 paper, where the first layer learns edges, the second layer learns object parts, and the third layer represents whole objects. I have two questions in this regard:
The pictures are attached. My parameters are as follows: weight size (Nw): 12x12 and 10x10 for layers 1 and 2 respectively; number of groups of weights (number of hidden bases, K): 24 and 40 respectively; target sparsity: 3; learning rate for the sparsity update: 3; number of epochs: 1000 and 500 respectively; spacing (C): 2 for both layers.
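In code form, my settings are roughly the following (field names ws, num_bases, pbias, pbias_lambda, spacing follow the ICML'09 code; num_epochs is just my own label):

    % Hypothetical: my settings as MATLAB structs. Field names ws, num_bases,
    % pbias, pbias_lambda, spacing follow the ICML'09 code; num_epochs is my label.
    layer1 = struct('ws', 12, 'num_bases', 24, 'pbias', 3, ...
                    'pbias_lambda', 3, 'spacing', 2, 'num_epochs', 1000);
    layer2 = struct('ws', 10, 'num_bases', 40, 'pbias', 3, ...
                    'pbias_lambda', 3, 'spacing', 2, 'num_epochs', 500);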
I have tried the code for training a convolutional RBM from Honglak Lee's homepage. The code trains a one-layer CRBM. I attempted to use the activations (denoted 'poshidprobs' in the code) of the first-layer CRBM as input to train a second-layer CRBM, but I find that the values of 'poshidprobs' are almost all updated to 0 after a round of iteration. I am confused by this problem. I would very much appreciate your help in figuring it out, and any guidance on how to train the second-layer CRBM. Looking forward to your reply.

You can have a look at my answer at: http://metaoptimize.com/qa/questions/15359/convolutional-rbm-features#15364 Let me know if you have any issues later on.
(Apr 13 '14 at 01:44)
Sharath Chandra
I have read the answer in the link. Maybe it is reasonable that the activations ('poshidprobs') of the CRBM are almost all updated to 0 because of the additional constraint imposed on the hidden units. I don't know whether my understanding is right, but I am still not certain whether it is right to use these 0 activations for training the next layer.
(Apr 13 '14 at 04:19)
xue
You should actually be taking poshidstates. It goes like this:
1. You train a layer of weights.
2. You trim the inputs to make them suitable for convolution.
3. You pass them through inference to get poshidexp.
4. You get poshidstates by passing poshidexp through tirbm_sample_multrand2 (see the sketch below).
5. These are your convolved layer 1 features.
6. Apply max pooling on them and you get your input for the next layer.
If you study the paper, the states are inferred from the probabilities, so it is the states that we need to be passing. Moreover, most of them will be 0 because of the constraint of probabilistic max-pooling: in a group of hidden units we want at most one unit to be on. Hope this helps :)
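Here is a minimal sketch of what the sampling in step 4 does (my own simplified version, not the actual tirbm_sample_multrand2; it assumes poshidexp is rows x cols x K with both spatial sizes divisible by the spacing C, and omits the overflow guard the real code would need before exp):

    function poshidstates = sample_pool_states(poshidexp, C)
    % Simplified stand-in for tirbm_sample_multrand2: in each C x C pooling
    % block of each basis k, at most one unit turns on. P(unit on) is
    % exp(I) / (1 + sum of exp(I) over the block); the leftover mass is "all off".
    [rows, cols, K] = size(poshidexp);
    poshidstates = zeros(rows, cols, K);
    for k = 1:K
        for i = 1:C:rows
            for j = 1:C:cols
                blk = exp(poshidexp(i:i+C-1, j:j+C-1, k));
                probs = blk(:) ./ (1 + sum(blk(:)));  % on-probability per unit
                idx = find(rand <= cumsum(probs), 1); % empty => block stays off
                if ~isempty(idx)
                    s = zeros(C, C);
                    s(idx) = 1;                       % column-major, matches blk(:)
                    poshidstates(i:i+C-1, j:j+C-1, k) = s;
                end
            end
        end
    end
    end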
(Apr 13 '14 at 06:12)
Sharath Chandra
Thanks for the explanation. It helps a lot. I will try following your advice.
(Apr 13 '14 at 07:50)
xue
Using weight shape 10, 24 channels (in layer 2), and 40 bases, I get a weight matrix W of dimension 100x24x40. When visualizing W with the function "display_network", I cannot get the picture shown in the paper, which consists of 40 bases. Instead, I get a picture of 24*40 bases. I see you also ran into this problem and have it working now. I checked the paper "Sparse deep belief net model for visual area V2" mentioned by Ng0323, but I am still not clear how the second-layer weights are viewed in combination with the first-layer weights to show the 40 bases.
(Apr 14 '14 at 02:55)
xue
See this image on how to visualize layer 2 weights: https://www.flickr.com/photos/sharathchandra/13844120203/ Other methods to do this are mentioned in this paper: http://www.iro.umontreal.ca/~lisa/publications2/index.php/publications/show/247 It is very difficult to tune the parameters. I could not fully succeed in doing so, but I got them to work to some extent. You can read the journal version of the ICML CDBN paper for more details.
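As a rough sketch of the linear-combination idea in that image (my own simplification: it ignores the pooling/subsampling between layers and assumes W2 has been reshaped from 100x24x40 to 10 x 10 x 24 x 40):

    function img = show_layer2_basis(W1, W2r, k2)
    % Project layer-2 basis k2 back to pixel space by summing shifted copies
    % of the layer-1 filters weighted by the layer-2 coefficients.
    % W1: nw1 x nw1 x K1 layer-1 filters; W2r: nw2 x nw2 x K1 x K2.
    [nw1, ~, K1] = size(W1);
    nw2 = size(W2r, 1);
    img = zeros(nw2 + nw1 - 1);
    for k1 = 1:K1
        img = img + conv2(W2r(:, :, k1, k2), W1(:, :, k1), 'full');
    end
    img = (img - min(img(:))) / (max(img(:)) - min(img(:)));  % rescale to [0,1]
    end

Usage would be something like: W2r = reshape(W2, [10 10 24 40]); imagesc(show_layer2_basis(W1, W2r, 5)); axis image; colormap gray;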
(Apr 14 '14 at 06:19)
Sharath Chandra
Yeah, that is the paper in Communications of the ACM. They gave the parameters in that paper. In layer 2 you get parts of faces. I could get a decent visualization, though not as smooth as in the paper. For layer 3 faces, I could get the overall picture, but the details of the eyes, nose, etc. were misplaced.
(Apr 15 '14 at 02:50)
Sharath Chandra
Could you please send me a copy of your code so that I can check what is wrong with my own? I need to make it work urgently. Or could I send my code to you and trouble you to check it for me? Thanks in advance. My email is [email protected]
(Apr 15 '14 at 05:34)
xue
I use the face images in the Multi-PIE dataset. Before training, I resize them so the longer side is 150. Target sparsity pbias = 0.002 for layer 1 and 0.005 for layer 2. I am not sure how to set pbias_lambda, the learning rate for the sparsity update; could you indicate how to set it? For now, I set it to 5 and 2 for layers 1 and 2. I still can't get the part-of-face-like bases. Moreover, I find that the visualization of the layer-2 weights from my code looks like the layer-1 weights pasted onto the layer-2 weights: local parts of the layer-2 weight picture are just the same as the layer-1 weight picture. They don't show parts of faces.
(Apr 15 '14 at 05:48)
xue
Can you show me some of the weights you are getting in layers 1 and 2 for different parameter settings? You can share the pics on Flickr or any other site.
(Apr 15 '14 at 05:59)
Sharath Chandra
Please re-upload the weight images. It says page not found.
(Apr 15 '14 at 11:15)
Sharath Chandra
Could you open the link to the images: http://photo.blog.sina.com.cn/u/2745449403 I posted the code as images and put it there. I made the modifications, numbered 1 to 4, for training the second-layer CRBM using the ICML 09 code. Or could you give me your email so that I can send it to you?
(Apr 15 '14 at 21:14)
xue
A few clarifications:
1. Layer 1 is trained on natural images.
2. Activations for layer 1 are obtained using the faces dataset.
3. You will have to play around with the parameters a little more. Make note of how the features change when you change the parameters and keep a track record of it. This will give you a rough range for the dataset you are using. For the Caltech faces dataset, the parameters in the Communications of the ACM paper will work (of course, even there we still had to tune one parameter). I could get fairly good bases for layer 2, but layer 3 was tough. So even for your dataset you should at least be able to get layer 2 bases.
4. If you code the convolution according to the equation I showed you in the image, it will work. Please verify that both match (see the sketch after this list).
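For point 4, a minimal sketch of the inference step I mean, assuming a single-channel image v, filters W of size nw x nw x K, and hidden biases hbias (K x 1); whether the filter needs flipping depends on the convolution convention your generation step uses:

    % Compute poshidexp (the exponent fed to the pooling sampler): 'valid'
    % correlation of the image with each filter, plus the hidden bias.
    % rot90(..., 2) flips the kernel so conv2 acts as correlation.
    K = size(W, 3);
    poshidexp = zeros(size(v,1) - size(W,1) + 1, size(v,2) - size(W,2) + 1, K);
    for k = 1:K
        poshidexp(:, :, k) = conv2(v, rot90(W(:, :, k), 2), 'valid') + hbias(k);
    end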
(Apr 16 '14 at 01:07)
Sharath Chandra
OK, thanks for the explanations. I just trained layer 1 on the 10 images provided with the ICML 09 code. I can't reach the link in the paper (http://www.cnbc.cmu.edu/cplab/data_kyoto.html) to get the whole Kyoto natural image dataset. Do you have that dataset? Another problem is how to set the parameter "batch_ws" when I resize the input face image so the longer side is 150:
    rowidx = ceil(rand*(rows - 2*ws - batch_ws)) + ws + [1:batch_ws];
    colidx = ceil(rand*(cols - 2*ws - batch_ws)) + ws + [1:batch_ws];
    imdata_batch = imdata(rowidx, colidx);
The "imdata_batch" sampled from the original image (of size 150x130) is used as input for training the CRBM.
(Apr 16 '14 at 02:34)
xue
When training the second layer, where do I have to modify the ICML 09 code? My results still seem wrong. Could you attach the weights you learned? I want to see what they should look like.
(Apr 16 '14 at 23:25)
xue
I replied on your question's thread. Have a look.
(Apr 17 '14 at 01:25)
Sharath Chandra
In my understanding, poshidprobs are mostly 0 because of weight regularization (dW_total2 in the code) or sparsity (dW_total3, but this is set to 0 when computing dW_total). So it's a sparsity thing. Max pooling just chooses which unit to turn on or off. The paper mentions that activations are used as the input to the next layer (sec. 3.5). I interpret activations as poshidprobs, but I could be wrong.
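For reference, this is how I read the composition of the weight update (reconstructed from the variable names, not copied from the source, so treat it as a sketch):

    dW_total1 = (posprods - negprods) / numcases;  % contrastive-divergence term
    dW_total2 = -l2reg * W;                        % L2 weight decay
    dW_total3 = 0;                                 % weight-space sparsity term, zeroed;
                                                   % sparsity acts via the hidden biases
    dW_total  = dW_total1 + dW_total2 + dW_total3;
    W = W + epsilon * dW_total;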
(May 13 '14 at 06:13)
Ng0323
Here you go: http://ai.stanford.edu/~hllee/softwares/icml09.htm
Hi, I have been trying to obtain verified code for convolutional RBMs for some time now. Could you please link to the base code?
Which code are you using? I can't seem to find any sparsity parameters in the CRBM code (which you linked in another post).
It's the same code posted on the other page. 2. The weights are of dimension (weight_shape^2, num. of channels, num. of bases). So, given weight shape 10, 24 channels (in layer 2), and 40 bases, that makes it 100x24x40.
The parameters for this code are: ws, num_bases, pbias, pbias_lb, pbias_lambda, spacing, epsilon, l2reg, batch_size. Here, pbias is the target sparsity, pbias_lambda is the learning rate for the sparsity update, and pbias_lb isn't used for anything.
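A sketch of the update those two parameters drive, following the sparsity regularization in the V2 paper (poshidprobs is rows x cols x K, hbias is K x 1):

    % Nudge each basis's hidden bias so its mean activation moves toward pbias.
    mean_act = squeeze(mean(mean(poshidprobs, 1), 2));  % K x 1 mean per basis
    hbias = hbias + pbias_lambda * (pbias - mean_act);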
Check out the paper "Sparse deep belief net model for visual area V2" by Honglak Lee, C. Ekanadham, and Andrew Ng. I think the answer is in there.
I got the answer to the first question. The weights in the second layer have to be viewed in combination with the first-layer weights. It's working now.
Still looking for the second question's answer (the right parameters, at least for the Caltech 101 faces dataset).
Hi, have you used the features learned by the CDBN for recognition? Because of the sparsity constraint, most of the hidden units are updated to 0. It does not seem sensible to use these 0 activations ('poshidprobs') as features. So what should we use as features for recognition?
Thanks in advance.