|
Often when training an ML system, you split the data into three parts: a larger training set and two smaller, roughly equal-sized sets for development and evaluation. After selecting features and tuning the system's parameters on the development set, can the development data safely be added to the training data? Are there any disadvantages to including the development data in the training data before performing the actual evaluation? The way I see it, this should be a simple and seemingly harmless way of increasing the size of your training set, but I might be wrong. :) |
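
To make the procedure being asked about concrete, here is a minimal sketch under stated assumptions. Everything in it is illustrative rather than taken from a real system: the synthetic two-blob data, the 300/100/100 split, the toy k-nearest-neighbour classifier, and the helper names `make_data`, `knn_predict`, and `accuracy` are all hypothetical. It tunes one hyperparameter on the development set only, then freezes it, merges development data into the training data, and evaluates once on the held-out set.

```python
import random

def make_data(n, rng):
    """Synthetic two-class data (an assumption): two overlapping 2-D Gaussian blobs."""
    data = []
    for _ in range(n):
        label = rng.randrange(2)
        center = 1.0 if label else -1.0
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((point, label))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    def sq_dist(item):
        (px, py), _ = item
        return (px - x[0]) ** 2 + (py - x[1]) ** 2
    nearest = sorted(train, key=sq_dist)[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def accuracy(train, eval_set, k):
    """Fraction of eval_set points classified correctly using train as the reference set."""
    correct = sum(knn_predict(train, x, k) == y for x, y in eval_set)
    return correct / len(eval_set)

rng = random.Random(0)
data = make_data(500, rng)  # samples are i.i.d., so slicing gives a random split
train, dev, test = data[:300], data[300:400], data[400:]

# Step 1: tune the hyperparameter k using the development set only.
candidates = [1, 3, 5, 7, 9]  # odd values avoid tied votes
best_k = max(candidates, key=lambda k: accuracy(train, dev, k))

# Step 2: freeze k and merge the development data into the training data.
merged = train + dev

# Step 3: evaluate on the held-out test set, which was never used for tuning.
acc_train_only = accuracy(train, test, best_k)
acc_merged = accuracy(merged, test, best_k)

print(f"best k chosen on dev: {best_k}")
print(f"test accuracy, train only:  {acc_train_only:.3f}")
print(f"test accuracy, train + dev: {acc_merged:.3f}")
```

In this sketch the test set is touched only after `best_k` is fixed, so merging does not leak any test information into tuning. What the sketch does not check is whether a value of k tuned against 300 training points is still a good choice for 400, since no development data is left over to verify that.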