Hello all,

In the classic book "The Elements of Statistical Learning", Chapter 7 describes the classical behavior of a ML system's error as a function of model complexity, depicted in Figure 7.1, page 220. It starts with an underfitting region, characterized by poor results on both the training and validation sets; this is followed by a zone of minimum validation error, and finally an overfitting zone, in which the training error continues to decrease while the validation error goes up.

My question is: in your practice, did you always encounter this succession of zones? Is it possible, for example, that in the region where one would normally expect underfitting, the error increases on both the training and validation sets? Is this succession something one must always encounter in a ML process?

Lucian

PS: Disclaimer - I asked the same question on quora.com
No, this is unfortunately not true. If it were, life would be much easier: one could then tune hyperparameters with convex optimization, which is much cheaper than the grid or random search techniques most commonly used. In practice the test loss can oscillate as a single hyperparameter varies, and as two or more vary the surface can get really wiggly. You can see a very benign example of a non-convex error surface on the libsvm website. Even with hyperparameters such as the number of training iterations, I've seen the test loss go down, then up, then down further as one optimizes the training objective with certain regularizers.
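To make this concrete, here is a minimal sketch (not from the original post) of how one can check the shape of the curve on a given problem: sweep a single hyperparameter and plot training and validation error side by side. It assumes scikit-learn, NumPy, and matplotlib; the digits dataset, the RBF-SVM, and the gamma range are illustrative choices only.

```python
# Minimal sketch: trace training and validation error across one
# hyperparameter to see whether the curve follows the textbook
# underfit -> optimum -> overfit shape, or wiggles instead.
# (Dataset, estimator, and parameter range are illustrative choices.)
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.model_selection import validation_curve
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Sweep the RBF kernel width as a rough proxy for model complexity.
param_range = np.logspace(-6, -1, 20)
train_scores, val_scores = validation_curve(
    SVC(kernel="rbf"), X, y,
    param_name="gamma", param_range=param_range,
    cv=5, scoring="accuracy", n_jobs=-1,
)

# Convert accuracy to error so the plot matches the book's convention.
train_err = 1.0 - train_scores.mean(axis=1)
val_err = 1.0 - val_scores.mean(axis=1)

plt.semilogx(param_range, train_err, label="training error")
plt.semilogx(param_range, val_err, label="validation error")
plt.xlabel("gamma (complexity proxy)")
plt.ylabel("error")
plt.legend()
plt.show()
```

On this toy setup the validation curve may come out fairly close to the textbook U-shape; repeating the sweep with other hyperparameters, datasets, or regularizers will often produce the bumpier, non-monotone curves described above.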