3 Comments

In a time series context, where we learn from the past to predict the future, we may have to make do with a single test set consisting of the most recent x% of the data. I wouldn't want to throw away a chunk of data every time I evaluate a model, so we should evaluate on the test set only when absolutely necessary; for example, to validate the final model before deployment.
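To make the idea concrete, here is a minimal sketch of such a chronological holdout. The function name `time_series_split` and the `test_fraction` parameter are illustrative, and it assumes the rows of `data` are already ordered by time:

```python
import numpy as np

def time_series_split(data: np.ndarray, test_fraction: float = 0.2):
    # Never shuffle: the test set must consist of observations that are
    # strictly later in time than everything in the training set.
    cutoff = int(len(data) * (1 - test_fraction))
    train, test = data[:cutoff], data[cutoff:]
    return train, test

# Example: reserve the most recent 20% of observations for testing.
series = np.arange(100)
train, test = time_series_split(series, test_fraction=0.2)
print(len(train), len(test))  # 80 20
```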


So when you merge the test set back into the training and validation sets and then create an entirely new split (yielding a new test set), can I use that test set the same way, even though the model was technically exposed to all three sets (training, val, test)?


Yes. The model would have been exposed to those examples in previous runs, but that has no influence here because you train a new model from scratch in every iteration.

Also, the recommendation to re-split the train, validation, and test sets has less to do with the model having been exposed to the data in previous training runs.

Instead, it is more about the fact that the person training the model will inevitably be influenced by the results if the same test set has been used before. Generating a new test set every time eliminates that bias.
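A minimal sketch of that re-splitting procedure, assuming scikit-learn's `train_test_split`; the function name `fresh_split`, the 60/20/20 proportions, and reusing the iteration index as the random seed are all illustrative choices:

```python
import numpy as np
from sklearn.model_selection import train_test_split

def fresh_split(X, y, iteration: int):
    # Pool the previous train/val/test data back together before calling this.
    # A different random_state per iteration yields a fresh test set each time.
    X_trainval, X_test, y_trainval, y_test = train_test_split(
        X, y, test_size=0.2, random_state=iteration
    )
    # 0.25 of the remaining 80% gives a 60/20/20 train/val/test split.
    X_train, X_val, y_train, y_val = train_test_split(
        X_trainval, y_trainval, test_size=0.25, random_state=iteration
    )
    return X_train, X_val, X_test, y_train, y_val, y_test

# Example usage with dummy data.
X = np.random.rand(100, 5)
y = np.random.randint(0, 2, size=100)
X_train, X_val, X_test, y_train, y_val, y_test = fresh_split(X, y, iteration=3)
```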
