I'm following this link to implement convolution-based feature extraction:

Link: http://ufldl.stanford.edu/wiki/index.php/Exercise:Convolution_and_Pooling

Questions

1) Why is the autoencoder applied to 8x8 image patches to learn the optimal weights, from which 8x8 filters are extracted and then convolved over the 64x64 images?

2) On what criteria are the 8x8 image patches selected for the autoencoder? Say the only images given are 64x64 - how would the filters be created then? If we apply the autoencoder to the full 64x64 images, we get filters of the same size, and convolving a 64x64 filter over a 64x64 image yields just a single value, which defeats the purpose.

Thanks in advance...

asked Aug 15 '13 at 13:06

Issam Laradji

edited Aug 15 '13 at 14:28

One Answer:

The answers are there in the notes preceding the exercise.

  1. Natural images are stationary, so features can be learned on small patches rather than on the whole image, which is faster and less computationally expensive. To get a representation of the whole image, the learned filters are then convolved with the full image.
  2. In the exercise, the patches are selected randomly. The whole idea of convolution is to reduce the number of weights learned, so you learn features on small patches and convolve them with the bigger images (a short code sketch follows below).

Please correct me if I am wrong.
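Here is a minimal NumPy/SciPy sketch of that pipeline. Everything in it is illustrative: the image batch is random data, and the random `filters` array merely stands in for the 8x8 weights a trained sparse autoencoder would produce.

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)

# Hypothetical data: a batch of ten 64x64 grayscale images.
images = rng.random((10, 64, 64))

def sample_patches(images, patch_size=8, num_patches=1000):
    """Randomly sample small patches from the larger images
    (this is how the autoencoder's training set is built)."""
    n, h, w = images.shape
    patches = np.empty((num_patches, patch_size, patch_size))
    for i in range(num_patches):
        img = images[rng.integers(n)]
        r = rng.integers(h - patch_size + 1)
        c = rng.integers(w - patch_size + 1)
        patches[i] = img[r:r + patch_size, c:c + patch_size]
    return patches

patches = sample_patches(images)  # 1000 random 8x8 patches for the autoencoder

# Stand-in for the 8x8 filters a sparse autoencoder would learn from `patches`;
# here they are just random weights for illustration.
filters = rng.standard_normal((25, 8, 8))

# Convolving each 8x8 filter with a 64x64 image gives a
# (64 - 8 + 1) x (64 - 8 + 1) = 57x57 feature map per filter.
feature_maps = np.stack([convolve2d(images[0], f, mode='valid') for f in filters])
print(feature_maps.shape)  # (25, 57, 57)
```

The point of the sketch is the shapes: the autoencoder only ever sees 8x8 inputs, so it has few weights to learn, yet sliding those 8x8 filters over a 64x64 image still produces feature maps covering the whole image.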

answered Feb 05 '14 at 23:33

Sharath Chandra
