|
I am confused about what "filter" and "feature map" mean when the architecture of a convolutional neural network is described. Some papers say "the layer has M feature maps" and others say "the layer convolves N filters." What is the relation between these two numbers, M (the number of feature maps) and N (the number of filters)? For example, if
then does the m-th layer convolve M1*M2 filters? Or M2 filters? As another example, if
then does the 1st hidden layer have 4 (=16/4) feature maps? Or 16? Is it possible that they are, in fact, 10x10x4 3D filters? |
|
The term 'feature map' usually means the result of convolving one filter with the layer's input, so in general the number of feature maps a layer produces equals its number of filters. For example, if you have 32x32 input images and a convolutional layer with 16 5x5 filters, you get 16 28x28 feature maps at the output of this layer (32 - 5 + 1 = 28, assuming stride 1 and no padding). So, to answer your questions: the m-th layer convolves M2 filters. In the usual case, each of these filters sees all M1 feature maps of the previous layer, so each filter is effectively 3D, with depth M1. Sometimes filters are restricted to see only some of the feature maps of the previous layer (this is called a 'sparse convolution layer' in cuda-convnet). The 1st hidden layer in your second example outputs 16 feature maps, because it has 16 filters. |
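
To make the shape bookkeeping concrete, here is a minimal NumPy sketch of a dense convolutional layer. It is an illustration only, not how any particular library implements it: the function name `conv_layer`, the single-channel input, and the second layer with 8 filters are assumptions made up for the example. Like most deep learning code, it computes a cross-correlation (no kernel flip), with stride 1 and no padding.

```python
import numpy as np

def conv_layer(inputs, filters):
    """Dense convolutional layer (cross-correlation, stride 1, no padding).

    inputs:  (M1, H, W)        M1 input feature maps
    filters: (M2, M1, kH, kW)  M2 filters, each spanning all M1 input maps
    returns: (M2, H-kH+1, W-kW+1), one output feature map per filter
    """
    m1, h, w = inputs.shape
    m2, depth, kh, kw = filters.shape
    assert depth == m1, "each filter must span every input feature map"
    out = np.zeros((m2, h - kh + 1, w - kw + 1))
    for n in range(m2):                  # one output map per filter
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                # Sum over all M1 input maps and the kH x kW window.
                out[n, i, j] = np.sum(inputs[:, i:i + kh, j:j + kw] * filters[n])
    return out

rng = np.random.default_rng(0)

# The example from the answer: a 32x32 image (assumed single-channel here)
# and a layer with 16 filters of size 5x5.
image = rng.standard_normal((1, 32, 32))
layer1 = rng.standard_normal((16, 1, 5, 5))
maps1 = conv_layer(image, layer1)
print(maps1.shape)  # (16, 28, 28)

# A hypothetical second layer with M2 = 8 filters. Each filter has depth
# M1 = 16 because it sees all 16 feature maps from the previous layer.
layer2 = rng.standard_normal((8, 16, 5, 5))
maps2 = conv_layer(maps1, layer2)
print(maps2.shape)  # (8, 24, 24)
```

The second layer holds 8 filters, not 16*8. Each of those 8 filters is a 16x5x5 block, which is why one could also count 16*8 separate 2D kernels if the 3D filters were split by input map.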