I've been reading up on convolutional neural networks and am in the process of implementing a very basic one, but I got stuck on one part. I'll use this network as an example:

[Image: network diagram]

My question is, how is the 2nd convolution layer calculated, that is "Feature maps 12@8x8"?

Let's say I'm interested in one of the 12 feature maps. I've come up with the following possibilities:

  1. 5x5 convolve with all connecting feature maps in the previous layer (4@12x12), sum results, apply non-linear function

  2. 5x5 convolve with all connecting feature maps in the previous layer (4@12x12), apply non-linear function, sum results

  3. 5x5 convolve with all connecting feature maps in the previous layer (4@12x12), sum weighted results (more params to learn), apply non-linear function

  4. 5x5xN 3D convolution ?

  5. none of the above :)

asked Apr 19 '13 at 12:08

Nghia


One Answer:

It is option 1 (kind of). For each of the 12 output feature maps in the second layer, use four 5x5 filters (I am assuming that is your filter size), one per input map: convolve each of the previous layer's outputs (4@12x12) with its own filter, sum the four results to get a single output map (1@8x8 when using valid convolution, since 12 - 5 + 1 = 8), and then apply the non-linearity. Note that in all you would have 12x4 = 48 filters in the second layer.

Interesting that you asked about the second layer, given that it is the same procedure as the first! Think of the input as an RGB image instead of a grayscale one. See Jarrett et al., 2009 for reference.
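To make the procedure concrete, here is a minimal NumPy sketch of option 1, not a definitive implementation. The function names (`correlate2d_valid`, `conv_layer`), the per-map bias, and the choice of `tanh` as the non-linearity are assumptions for illustration; the thread does not specify them. The sketch uses cross-correlation (no kernel flip), which differs from true convolution only by a flip of the learned weights.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding ("valid" mode)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def conv_layer(inputs, kernels, biases, activation=np.tanh):
    """Fully connected convolution layer (option 1 from the question).

    inputs:  (n_in, H, W) feature maps from the previous layer
    kernels: (n_out, n_in, kh, kw), one kernel per (output, input) pair
    biases:  (n_out,), one bias per output map (an assumption here)
    """
    n_out, n_in, kh, kw = kernels.shape
    out_h = inputs.shape[1] - kh + 1
    out_w = inputs.shape[2] - kw + 1
    outputs = np.empty((n_out, out_h, out_w))
    for j in range(n_out):
        total = np.zeros((out_h, out_w))
        for i in range(n_in):
            # Filter each input map with its own kernel, then sum.
            total += correlate2d_valid(inputs[i], kernels[j, i])
        # The non-linearity is applied once, after the sum.
        outputs[j] = activation(total + biases[j])
    return outputs

# Shapes from the question: 4@12x12 in, 12@8x8 out, 5x5 kernels.
rng = np.random.default_rng(0)
prev_maps = rng.standard_normal((4, 12, 12))
kernels = 0.1 * rng.standard_normal((12, 4, 5, 5))
biases = np.zeros(12)
feature_maps = conv_layer(prev_maps, kernels, biases)
print(feature_maps.shape)  # (12, 8, 8)
```

With `n_in = 3` and the image channels as `inputs`, the same function covers the first layer on an RGB image, which is why the two layers follow the same procedure.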

answered Apr 19 '13 at 14:33

Rakesh Chalasani

Option 1 is what I implemented, but I wasn't 100% sure; thanks for confirming it! I'm fully connecting the second layer to the previous layer, and I noticed that the outputs of the convolutions all look very similar. I guess this is why they randomly connect the layers.

(Apr 19 '13 at 20:00) Nghia

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.