I am implementing the convolutional RBM from Dr. Honglak Lee in C++, and I have successfully trained the first-layer filters. But while training the second layer, I ran into the following questions:

1. The pooling maps used as the second-layer input are binary units {0,1}, but the first-layer input is whitened. Do I need to whiten the pooling maps, or preprocess the second-layer input in some other way?

2. I have tried whitening the pooling maps and training the second layer the same way as the first layer. When I visualize the second-layer filters, I don't get facial parts; I only get contours and corners, like the ones Lee obtained when training the second layer on natural images. Does the error come from the training part or from the visualization part?

Here is my implementation:

1). Take a 128x128 greyscale image as input. The first layer has 32 8x8x1 filters (x1 means 1 channel) and the pooling ratio is 2, so the output pooling map is 32x64x64 (I add some padding so that each feature map is 128x128 and each pooling map is 64x64).

2). The second-layer input, as above, is 64x64x32 (32 is the channel count, equal to the number of first-layer filters). The second layer has 64 8x8x32 filters and the pooling ratio is 2; training is the same as above.

3). Taking one 8x8x32 second-layer filter as an example for visualization: I unpool the filter to 32x16x16 (each channel treated as a single filter), full-convolve each channel with the corresponding 8x8 first-layer filter to get 32x23x23, then sum all 32 channels into one, which gives the final 23x23 filter image.
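
For reference, step 3 can be sketched in C++ roughly as follows. This is only a minimal illustration of the procedure as described, not the method from the paper. The `Image` type and the function names `unpool`, `fullConvolve`, and `visualizeSecondLayerFilter` are hypothetical, and unpooling is assumed to replicate each weight over a 2x2 block; other unpooling schemes are possible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major 2-D image. Hypothetical helper type, not from the original post.
struct Image {
    std::size_t rows, cols;
    std::vector<double> data;
    Image(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    double at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Unpool by copying each weight into a factor x factor block
// (8x8 -> 16x16 for factor 2). Replication is an assumption here.
Image unpool(const Image& in, std::size_t factor) {
    Image out(in.rows * factor, in.cols * factor);
    for (std::size_t r = 0; r < out.rows; ++r)
        for (std::size_t c = 0; c < out.cols; ++c)
            out.at(r, c) = in.at(r / factor, c / factor);
    return out;
}

// Full 2-D convolution: output is (a.rows + b.rows - 1) x (a.cols + b.cols - 1),
// so 16x16 convolved with 8x8 gives 23x23.
Image fullConvolve(const Image& a, const Image& b) {
    Image out(a.rows + b.rows - 1, a.cols + b.cols - 1);
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t j = 0; j < a.cols; ++j)
            for (std::size_t k = 0; k < b.rows; ++k)
                for (std::size_t l = 0; l < b.cols; ++l)
                    out.at(i + k, j + l) += a.at(i, j) * b.at(k, l);
    return out;
}

// w2: the channels of ONE second-layer filter (32 images of 8x8).
// w1: the first-layer filters (32 images of 8x8), one per channel.
// Each channel is unpooled, convolved with its first-layer filter, and summed.
Image visualizeSecondLayerFilter(const std::vector<Image>& w2,
                                 const std::vector<Image>& w1,
                                 std::size_t poolFactor) {
    assert(!w2.empty() && w2.size() == w1.size());
    Image result(w2[0].rows * poolFactor + w1[0].rows - 1,
                 w2[0].cols * poolFactor + w1[0].cols - 1);
    for (std::size_t ch = 0; ch < w2.size(); ++ch) {
        Image proj = fullConvolve(unpool(w2[ch], poolFactor), w1[ch]);
        assert(proj.rows == result.rows && proj.cols == result.cols);
        for (std::size_t i = 0; i < result.data.size(); ++i)
            result.data[i] += proj.data[i];
    }
    return result;
}
```

With 32 channels of 8x8 weights, 32 first-layer 8x8 filters, and a pooling ratio of 2, `visualizeSecondLayerFilter(w2, w1, 2)` returns a 23x23 image, matching the sizes in step 3.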

This question is marked "community wiki".

asked Jun 09 '14 at 23:11

niki

I don't think you need to whiten the second layer's input; it's not mentioned in the paper. There is another post here that describes how to obtain the second-layer visualizations, which is some sort of weighted linear combination with the first layer (I am not sure if this is your method too). However, I can't seem to find the part of the paper that describes this in detail, so if anyone can supply this info, that would be great.

(Jun 10 '14 at 04:32) Ng0323