Monday, June 1, 2015

Exact Models and Cross Validation from Warm Starts

One of the most common tasks in model building is Cross Validation. Computationally speaking, it can be very expensive: if you want to do 10-fold cross validation, you're basically going to be waiting ~10x as long for your model to build.

However, there is a potential speed-up available for Cross Validation, but only when using models that converge to an exact solution. First you train one model on all the data, and then train the 10 CV models using that first model as a warm start. This way we are training 11 models instead of 10, but each of the 10 CV models should converge significantly faster - ideally fast enough that the savings more than offset the cost of training the extra model.
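Here's a minimal sketch of the idea in Python with scikit-learn (my illustrative choice - the post isn't tied to any particular library). It uses LogisticRegression, whose objective is convex and so has a single exact solution, together with its warm_start option, which makes fit() start from the coefficients left by the previous fit:

```python
# A minimal sketch of warm-started CV (illustrative; not from the post).
# LogisticRegression is convex, so every fit reaches the same solution
# no matter where it starts - only the number of iterations changes.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# Step 1: train one model on all of the data.
model = LogisticRegression(warm_start=True, max_iter=1000)
model.fit(X, y)
full_coef, full_intercept = model.coef_.copy(), model.intercept_.copy()

# Step 2: train the 10 CV models, each warm started from the full-data fit.
scores = []
for train_idx, test_idx in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    # Reset to the full-data solution; with warm_start=True, fit() uses
    # the current coef_/intercept_ as its starting point.
    model.coef_, model.intercept_ = full_coef.copy(), full_intercept.copy()
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))

print("10-fold CV accuracy: %.4f" % np.mean(scores))
```

Whether this actually wins in practice depends on the solver: the warm start only pays off if starting near the optimum meaningfully cuts the iterations each fold needs.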

The use of models that converge to a single exact solution is critical for this to be a sensible idea. The point of CV is to keep data separate, and evaluate on data you haven't seen/trained on. By warm starting from a model trained on everything, the CV models have kinda seen all the data before. But because the model converges to an exact solution, it will reach the same solution regardless of the warm start. The model basically forgets about the earlier data.
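You can sanity-check that claim empirically (again a sketch, using the same illustrative scikit-learn setup as above): fit the same training fold once from scratch and once warm started from a full-data model, and compare the coefficients. For a convex model they should agree to within solver tolerance:

```python
# Sanity check (illustrative): for a convex model, warm-started and
# cold-started fits on the same fold should converge to the same solution.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
fold = slice(0, 4500)  # pretend the first 90% is one CV training fold

# Cold start: fit on the fold only.
cold = LogisticRegression(max_iter=1000).fit(X[fold], y[fold])

# Warm start: fit on everything first, then refit on just the fold.
warm = LogisticRegression(warm_start=True, max_iter=1000)
warm.fit(X, y)              # "sees" all the data...
warm.fit(X[fold], y[fold])  # ...then refits on the fold alone

# Should be tiny - the model forgot the warm start.
print("max |coef difference|: %.2e" % np.abs(cold.coef_ - warm.coef_).max())
```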

If we had used a model that does not converge to an exact solution, or that has multiple local minima, this approach would bias the CV error. In the former case, the optimizer may not move all the way from the initial solution; staying closer to that starting point means the model may retain enough bias to "remember" some of what it saw in the held-out data. The latter case is even worse - the full-data solution may drop you into a local minimum you wouldn't have reached without the warm start, which basically guarantees you won't be able to get too far away from it.
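One way to poke at this empirically (a hedged sketch, same illustrative library; no particular outcome is guaranteed): warm start a non-convex model like a small neural network on a fold and measure how far the weights move from the full-data solution. Unlike the convex case, there is no single exact solution to return to, so the fit can stay anchored near where it started:

```python
# Illustrative probe: a non-convex model need not forget its warm start.
# Measures how far the fold fit drifts from the full-data weights.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
fold = slice(0, 1800)

warm = MLPClassifier(hidden_layer_sizes=(16,), warm_start=True,
                     max_iter=500, random_state=0)
warm.fit(X, y)
full_weights = [w.copy() for w in warm.coefs_]
warm.fit(X[fold], y[fold])  # warm started from the full-data weights

drift = sum(np.abs(w - f).sum() for w, f in zip(warm.coefs_, full_weights))
print("L1 distance moved from the full-data solution: %.3f" % drift)
```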

Hopefully I'll get some time to experiment with actually implementing this idea.