Hi all,

I'm a graduate student implementing a CDBN for a face recognition task, and I have some problems with my implementation. The first is understanding the data pre-processing. After a long investigation I found that, for large images (150px * 150px or more), 1/f whitening is preferred, followed by local contrast normalization (Bruno Olshausen et al.). However, I'm not entirely sure my method is correct, and some sources recommend ZCA whitening instead. Another problem is that there are too many heuristic issues in learning meaningful features with a CRBM. I'm expecting a set of well-oriented Gabor filters like the ones H. Lee et al. show in their paper.
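For what it's worth, here is how I understand ZCA whitening (one of the options mentioned above) in NumPy; this is a minimal sketch, and the `eps` regularizer is a choice I made for numerical stability, not a value from any paper:

```python
import numpy as np

def zca_whiten(X, eps=1e-2):
    """ZCA-whiten data. X: (n_samples, n_features), e.g. flattened patches."""
    X = X - X.mean(axis=0)                           # remove per-feature mean
    cov = X.T @ X / X.shape[0]                       # feature covariance
    U, S, _ = np.linalg.svd(cov)                     # eigendecomposition (cov is symmetric PSD)
    W = U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T    # symmetric ZCA transform
    return X @ W
```

Unlike PCA whitening, the ZCA transform is symmetric, so the whitened patches stay as close as possible to the originals in the least-squares sense, which is why whitened images still look like images.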


The authors used sparse coding and grouping of nearby hidden units to obtain a set of well-oriented Gabor filters from natural images. Here are some ambiguous parts of my implementation:

  1. Block-processing by grouping nearby hidden units with a pooling ratio. I used a softmax to infer p(h|v) using the equation in the ICML 2009 paper, and sampled the hidden states as one normally does in an RBM: double(p(h|v) > rand(size(h))). After pretraining the 1st-layer CRBM, I will freeze its parameters, use them to get the hidden-layer activations, and then do max-pooling to feed the pooled layer as input to the 2nd layer. I'm not quite sure whether to use the probabilities themselves or to sample the states before shrinking the hidden layer by a factor of 1/c. (I used the sum of the hidden activations in block 'A' as the activation of the corresponding pixel 'a' of the pooling layer.)

  2. I don't know whether my parameters are being learned properly. I have no idea what typical magnitudes the pretrained weights and biases of a sparse CRBM should have. I realize this depends on many other things, but at least I need some intuition that I'm doing the right thing. I am using weight decay in the weight update: I simply added a
    -(weight.decay * weight) term to the weight gradient.

  3. Deciding when to stop the training. Is there any reasonable way to choose the stopping point for training a CRBM?

  4. Is it possible to do backpropagation (fine-tuning) on a CDBN, and does it work well? Can somebody explain the fine-tuning of a CRBM to me?
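To make point 1 concrete, here is my reading of the probabilistic max-pooling softmax from Lee et al. (ICML 2009), sketched in NumPy for a single feature map; the function and variable names are mine, and the sampling step is one way to draw at most one ON unit per block:

```python
import numpy as np

def prob_max_pool_sample(I, c, rng):
    """Probabilistic max-pooling for one feature map (after Lee et al., ICML 2009).

    I: (H, W) bottom-up inputs to the hidden units; c: pooling ratio.
    Returns (h, p_h, p_pool): sampled hidden states, hidden probabilities
    (both (H, W)), and pooling-unit ON probabilities of shape (H/c, W/c).
    """
    H, W = I.shape
    # group hidden units into non-overlapping c x c blocks
    blocks = np.exp(I).reshape(H // c, c, W // c, c).transpose(0, 2, 1, 3)
    blocks = blocks.reshape(H // c, W // c, c * c)
    denom = 1.0 + blocks.sum(axis=-1, keepdims=True)   # "+1" = all-off outcome
    p_h = blocks / denom                               # softmax over block + off
    p_pool = 1.0 - 1.0 / denom[..., 0]                 # P(pooling unit = 1)
    # sample one multinomial outcome per block (c*c units + "all off")
    u = rng.random(p_h.shape[:2])[..., None]
    cum = np.cumsum(p_h, axis=-1)
    h = ((u < cum) & (u >= cum - p_h)).astype(float)   # at most one ON per block
    # un-block back to (H, W)
    h = h.reshape(H // c, W // c, c, c).transpose(0, 2, 1, 3).reshape(H, W)
    p_h = p_h.reshape(H // c, W // c, c, c).transpose(0, 2, 1, 3).reshape(H, W)
    return h, p_h, p_pool
```

Note that the softmax constraint itself guarantees at most one active unit per block, which is what makes the pooling-unit probability simply one minus the all-off probability.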
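For point 2, the term I added corresponds to the standard L2 weight decay penalty; a minimal sketch of the update I mean (the learning rate and decay values here are illustrative, not prescribed):

```python
import numpy as np

def update_weights(W, cd_grad, lr=0.005, weight_decay=0.01):
    """One gradient step with L2 weight decay.

    cd_grad is the contrastive-divergence gradient estimate (positive phase
    minus negative phase); the -weight_decay * W term shrinks weights
    toward zero each step.
    """
    return W + lr * (cd_grad - weight_decay * W)
```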
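On point 3, the best answer I have found so far is a heuristic rather than a principled criterion: track reconstruction error on held-out data and stop when it plateaus. This is my own practice, not something from the paper; a sketch:

```python
def should_stop(recon_errors, patience=5, min_delta=1e-4):
    """Stop when held-out reconstruction error hasn't improved by at
    least min_delta for `patience` epochs.

    Caveat: reconstruction error is only a rough proxy for the RBM
    likelihood, so it's worth also inspecting the learned filters visually.
    """
    if len(recon_errors) <= patience:
        return False
    best_recent = min(recon_errors[-patience:])
    best_before = min(recon_errors[:-patience])
    return best_recent > best_before - min_delta
```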

Thank you :)

asked Mar 02 '12 at 12:05


Junyoung Jeong

edited Mar 11 '12 at 01:37

One Answer:

This was asked a long time ago, but I thought answering it might help others and clear up my own understanding!

  1. After learning the layer-1 weights, you calculate the activations over the entire images by convolving these weights with the images, and then apply pooling to each feature map. So you sample the states instead of using the probabilities.
  2. The size of the weights (filters) is mentioned in the paper; you can follow that. If you are talking about other hyperparameters like the learning rate, sparsity, etc., you can start with the following values for the face dataset. Layer one worked well for me; I'm not sure about layer 2 yet, as I am still running the experiments: num_bases: 24, block_shape: (2, 2) [for max pooling], pbias: 0.01, pbias_lambda: 5, regL2: 0.01, epsilon: 0.005.
  3. I am still trying to understand the difference between fine-tuning in a CDBN and a CDBM. If anybody can clarify, that would be great.

answered Feb 27 at 03:38


Sharath Chandra


