It turns out that the update rules for regularized quasi-additive online algorithms are usually obtained by taking the convex dual of the regularizer (see Chapter 3 of Shai Shalev-Shwartz's PhD thesis). However, even if you don't have a regularizer (in the online setting), you can still obtain the update rule as the convex dual of some other quantity (for example, the regret incurred each round). Can anyone explain why this happens, or point me to relevant literature? The Prediction, Learning, and Games book touches on this (Chapters 2 and 12), but I was looking for something more comprehensive and expository. Any thoughts/pointers?
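To make the question concrete, here is the kind of update I have in mind, written in my own rough notation (so the scaling may differ from the thesis). A quasi-additive algorithm keeps a dual vector $\theta_t$, predicts through the gradient of the Fenchel conjugate $R^*$ of the regularizer $R$ (the "link function"), and updates $\theta$ additively with a loss (sub)gradient:

$$ w_t = \nabla R^*(\theta_t), \qquad \theta_{t+1} = \theta_t - \eta\, z_t, \quad z_t \in \partial \ell_t(w_t). $$

Taking $R(w) = \tfrac{1}{2}\|w\|_2^2$ gives $\nabla R^*(\theta) = \theta$, i.e. gradient-descent/perceptron-style additive updates, while the negative entropy $R(w) = \sum_i w_i \log w_i$ on the simplex gives $\nabla R^*(\theta) \propto e^{\theta}$ componentwise, i.e. EG/Winnow-style multiplicative updates.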
|
The Shai Shalev-Shwartz work you cited goes on to show that online learning algorithms can be seen this way too, i.e., as maximizing the dual objective (of the regularized loss function) instead of minimizing the primal objective. The primal-dual perspective is useful because it lets you see how the dual objective improves with each round and how that improvement relates to other quantities used to analyze online learning algorithms (such as the total number of mistakes), and it therefore also leads to ways of designing and analyzing new online learning algorithms.
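Schematically, and in my own notation (the scaling constants may differ from the thesis), the primal problem after $T$ rounds is

$$ P \;=\; \min_{w}\ \Big( c\, R(w) + \sum_{t=1}^{T} \ell_t(w) \Big), $$

and its Fenchel dual, with one dual vector $\lambda_t$ per round, is

$$ D(\lambda_1, \dots, \lambda_T) \;=\; -\, c\, R^*\!\Big( -\tfrac{1}{c} \sum_{t=1}^{T} \lambda_t \Big) \;-\; \sum_{t=1}^{T} \ell_t^*(\lambda_t). $$

The online algorithm only adjusts $\lambda_t$ at round $t$ and never revisits earlier dual variables, so each round can only increase $D$; weak duality ($D \le P$) then converts the total dual increase into a regret bound, or into a mistake bound in the mistake-driven setting.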
|
As far as I can see from the paper that introduced this idea, he is not talking about the convex dual of the regularizer, but of the regularized loss (the regularizer plus the cumulative loss). So, according to the paper, if you don't have a regularization term you just take the dual of the loss.
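To illustrate what "the dual of the loss" looks like in a concrete case (this is my recollection of the standard computation, so double-check the constants): for a hinge loss $\ell_t(w) = \max(0,\, 1 - y_t \langle x_t, w \rangle)$, the conjugate is finite only on the segment $\{-\alpha\, y_t x_t : \alpha \in [0,1]\}$, where $\ell_t^*(-\alpha\, y_t x_t) = -\alpha$. The dual thus amounts to one scalar $\alpha_t \in [0,1]$ per example, and mistake-driven algorithms like the perceptron can be read as flipping $\alpha_t$ from $0$ to $1$ on the rounds where they err, i.e. as coordinate-wise ascent on this dual, with or without a regularization term.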