When training an ML system, the data is often split into three parts: a larger training set, plus two smaller, roughly equal-sized sets for development and evaluation, respectively.

After selecting features and tuning the system's parameters on the development set, can the development data safely be added to the training data?

Are there any disadvantages to including the development data in the training data before performing the final evaluation?

The way I see it, this is a very simple and seemingly harmless way of increasing the size of the training set, but I might be wrong. :)
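To make the workflow concrete, here is a minimal sketch of what I mean, using only the Python standard library. Everything in it is an assumption for illustration: the toy 1-D data, the hand-rolled k-NN classifier, the split sizes, and the helper names (`make_toy_data`, `split`, `knn_predict`, `accuracy`) are all hypothetical and not taken from any real system. It tunes `k` on the development set, then scores the evaluation set twice: once with a model built on the training set alone, and once with the development data merged in.

```python
import random


def make_toy_data(n, rng):
    """Hypothetical toy data: label is 1 when x plus some noise is positive."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 1 if x + rng.gauss(0.0, 0.3) > 0 else 0
        data.append((x, y))
    return data


def split(data, rng, n_dev, n_test):
    """Shuffle a copy of the data and cut it into train / dev / test."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    test = shuffled[:n_test]
    dev = shuffled[n_test:n_test + n_dev]
    train = shuffled[n_test + n_dev:]
    return train, dev, test


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k is odd)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, held_out, k):
    """Fraction of held-out points classified correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in held_out)
    return correct / len(held_out)


rng = random.Random(0)
data = make_toy_data(400, rng)
train, dev, test = split(data, rng, n_dev=60, n_test=60)

# Step 1: tune the hyperparameter on the development set only.
candidates = [1, 3, 5, 9, 15]
best_k = max(candidates, key=lambda k: accuracy(train, dev, k))

# Step 2: evaluate on the untouched test set, with and without the dev data.
acc_train_only = accuracy(train, test, best_k)
acc_merged = accuracy(train + dev, test, best_k)

print("best k on dev:", best_k)
print("test accuracy, train only: ", acc_train_only)
print("test accuracy, train + dev:", acc_merged)
```

Note that in the merged variant `best_k` was still chosen against the smaller training set, and once the development data has been folded in there is nothing left to re-check that choice against. The evaluation set is never used for tuning in either variant.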

asked Sep 24 '13 at 05:13

Johannes Bjerva


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.