I am trying to build a regression model using a neural network. The final cost measure is MAE (mean absolute error) on the output (one output unit, 200 input units).

Right now all my hidden units have rectifier activations, and the output unit is just a linear unit with pass-through activation. Is this an effective network? The network does not seem to learn efficiently; the error (even on the training set) oscillates. I tried lowering the learning rate, but I can't seem to find a value that makes the error go down monotonically.

I suspect the cost function (the L1 norm) might be the culprit. Right now, when computing the gradient, I pass either 1 or -1 depending on whether the predicted value is above or below the actual output value. Is this the right way? (Since the L1 norm is not smooth at 0, could this be the reason the learning is not smooth/effective?) What is the right way to handle an L1-norm cost function?
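For what it's worth, passing ±1 is exactly the subgradient of the absolute error, so that part is standard. A common smooth alternative is the Huber loss, which is quadratic near zero and linear in the tails. The sketch below (my own illustration, not code from the question; the function names `mae_subgrad` and `huber_grad` are made up here) shows the two gradients side by side:

```python
import numpy as np

def mae_subgrad(pred, target):
    # Subgradient of |pred - target| w.r.t. pred: sign of the residual.
    # At pred == target, np.sign returns 0, a valid subgradient choice.
    return np.sign(pred - target)

def huber_grad(pred, target, delta=1.0):
    # Gradient of the Huber loss: equals the residual when |residual| <= delta
    # (quadratic region), and saturates at ±delta beyond that (linear region).
    r = pred - target
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

pred = np.array([2.0, -0.5, 1.0])
target = np.array([1.0, 1.0, 1.0])
print(mae_subgrad(pred, target))   # piecewise constant: ±1 (0 at a tie)
print(huber_grad(pred, target))    # shrinks smoothly as pred -> target
```

Because the MAE subgradient never shrinks as predictions approach the targets, a fixed learning rate makes the weights overshoot back and forth near the optimum, which is one plausible source of the oscillation; the Huber gradient decays to zero there instead.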

Thanks, any help is appreciated!

asked May 04 '14 at 18:43

keithzhou


One Answer:

The obvious approach would be to try the L2 norm. For example, for ConvNets, an L2 SVM or L2+L1 SVM cost function outperforms an L1 SVM cost function. (The L2 SVM cost was introduced by Y. Tang, and my experience supports his conclusion.)
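To make the contrast concrete (a minimal sketch of my own, not from the answer; the helper names `l1_grad` and `l2_grad` are invented here): the squared-error gradient scales with the residual, while the absolute-error subgradient has constant magnitude no matter how small the error is.

```python
import numpy as np

def l1_grad(pred, target):
    # Subgradient of |pred - target|: magnitude 1 regardless of error size
    return np.sign(pred - target)

def l2_grad(pred, target):
    # Gradient of (pred - target)**2: proportional to the residual,
    # so update sizes shrink automatically as the fit improves
    return 2.0 * (pred - target)

pred = np.array([1.01, 2.0, 11.0])
target = np.array([1.0, 1.0, 1.0])
print(l1_grad(pred, target))  # [1. 1. 1.] even for a tiny residual
print(l2_grad(pred, target))  # [0.02 2.0 20.0], scales with the residual
```

This self-scaling is why L2-style costs often converge smoothly with a fixed learning rate where a pure L1 cost oscillates.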

answered May 08 '14 at 11:54

Sergey Ten

