I'm wondering if someone could point me in the right direction here. I've been working on a stacked denoising autoencoder for generic object recognition for a while now, and I keep butting my head up against the same problem: the weights for my neurons, despite being initialized randomly, tend to converge to very similar results during pretraining. My cost function is the half mean squared error plus a weight decay term.

My first response when I saw this was to introduce an additional penalty determined by the degree of non-orthogonality between the weight vectors. To that end, I used the formula developed in http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.3455 (download the cached version). The problem I ran into is that random noise vectors in high dimensions are already nearly orthogonal, so adding this term tended to reinforce the noise used to initialize the weights rather than encourage diverse features.

What other approaches could I use to solve this? I've been testing on Caltech 256, which admittedly isn't a very good dataset, but there's next to no variation whatsoever in my pretrained weights.

Edit: If this non-orthogonality cost function still has merit, are there any more recommended papers?

Possibly related: http://metaoptimize.com/qa/questions/13401/deeplearningtutorials-using-the-stacked-autoencoder-results-in-almost-identical-semantic-hashes
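For concreteness, here is a minimal sketch of the cost described above (half mean squared error plus L2 weight decay). The function and variable names and the decay coefficient are illustrative placeholders, not the asker's actual code:

```python
import numpy as np

def reconstruction_cost(x, x_hat, W, weight_decay=1e-4):
    """Half mean squared error plus an L2 weight decay term.

    x      -- minibatch of inputs, shape (n_examples, n_features)
    x_hat  -- reconstructions from the autoencoder, same shape as x
    W      -- weight matrix being penalized
    """
    # Half MSE: sum squared error per example, averaged over the batch.
    half_mse = 0.5 * np.mean(np.sum((x_hat - x) ** 2, axis=1))
    # Standard L2 weight decay on the weights (biases usually excluded).
    decay = 0.5 * weight_decay * np.sum(W ** 2)
    return half_mse + decay
```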
Well, I have two suggestions: a) increase the standard deviation of the initial weights, and b) reduce the number of neurons. If you are able to get a good reconstruction with similar weights, it suggests you have too many.

Increasing the standard deviation seems to have had an effect (see the sketch after this comment). I'm still trying to improve the performance, however. Currently, I'm getting about five or six unique blobs, as opposed to one blob and its negative. I'll leave this open in the hopes of getting some more responses.
(Jul 15 '13 at 01:55)
Phox
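For anyone else reproducing this, a sketch of the kind of change being discussed. The layer sizes and the two standard deviation values are illustrative assumptions, not numbers from this thread:

```python
import numpy as np

rng = np.random.RandomState(1234)

def init_weights(n_visible, n_hidden, std):
    """Zero-mean Gaussian initialization; a larger std spreads the
    starting points further apart, which can help break symmetry
    between hidden units during pretraining."""
    return rng.normal(loc=0.0, scale=std, size=(n_visible, n_hidden))

W_narrow = init_weights(784, 500, std=0.01)  # units start nearly identical
W_wide   = init_weights(784, 500, std=0.1)   # more diverse starting points
```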
Not a lot of activity yet, but I did happen to come across the following paper, which has a good suggestion for the range of initial values for my weights. I'm putting it here, along with a sketch below the link, for the benefit of anyone else with a similar problem. (There are some other good suggestions in it I'd like to try out later, as well.)
http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf
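A minimal sketch of the initialization that paper recommends (Efficient BackProp suggests zero-mean weights with standard deviation scaled by the inverse square root of the fan-in); the layer sizes here are placeholders:

```python
import numpy as np

rng = np.random.RandomState(1234)

def lecun_init(n_visible, n_hidden):
    """Draw weights with std = 1/sqrt(fan-in), per LeCun et al.'s
    recommendation for keeping unit activations in the useful range
    of a sigmoid at the start of training."""
    std = 1.0 / np.sqrt(n_visible)  # fan-in of each hidden unit
    return rng.normal(loc=0.0, scale=std, size=(n_visible, n_hidden))

W = lecun_init(784, 500)
```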