When building models, we often go through many stages of analysis: studying the relationships between candidate predictors and the response, selecting variables, and fitting many different models to see which individual model or ensemble performs best. In the process, we often need to keep around many versions of data, code, and model objects, along with custom results and evaluation metrics.

I've looked at some of the earlier posts, and jobman from deeplearning.net seems to go in this direction. But has anyone tried organizing their versions of (model, data, result) in a NoSQL database (for example, a key-value store)? I mostly use R for analysis, and also Python, so I would prefer tools that are generic enough to accommodate both.

The idea of organizing everything in some sort of key-value store is to make future look-ups easy. Say you want to compare graphs of a particular evaluation metric across all your models.

It seems like some combination of git and NoSQL would do the job. Remember, I'd like to keep versions of the (model, data, result) tuple. My dataset might change as well (e.g. it could have errors I didn't catch before), so I'd like to record the full history.

Does anyone have a system for organizing these that they swear by?

asked Mar 13 '13 at 12:25

Yipster


One Answer:

I don't know of any system that matches your requirements completely, but I find that running the experiments through a MongoDB-based job queue, i.e. using hyperopt and/or MDBQ, is quite convenient. When starting an experiment, you insert a description that includes the current git commit hashes of your code and your data (which you may want to store in git-annex if it's bigger). The job queue then keeps track of the settings you tried and the results.
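As a rough illustration of the bookkeeping described above, here is a minimal sketch in Python. It does not use hyperopt, MDBQ, or MongoDB: an SQLite table holding JSON documents stands in for the document store so that the example runs with the standard library alone. The function names (`git_commit`, `open_store`, `record_run`, `metric_by_run`) and the document fields are hypothetical choices for this sketch, not part of any of those tools.

```python
import hashlib
import json
import sqlite3
import subprocess
import time


def git_commit(repo_dir="."):
    """Return the current commit hash of repo_dir, or None if it cannot be read."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None


def open_store(path=":memory:"):
    """Open a tiny key -> JSON document store (a stand-in for MongoDB)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs (key TEXT PRIMARY KEY, doc TEXT NOT NULL)"
    )
    return conn


def record_run(conn, code_commit, data_commit, params, metrics):
    """Store one experiment and return its key.

    The key is a hash of (code version, data version, settings), so
    rerunning the same configuration overwrites the earlier entry.
    """
    identity = json.dumps([code_commit, data_commit, params], sort_keys=True)
    key = hashlib.sha1(identity.encode("utf-8")).hexdigest()
    doc = {
        "code_commit": code_commit,
        "data_commit": data_commit,
        "params": params,
        "metrics": metrics,
        "created": time.time(),
    }
    conn.execute("INSERT OR REPLACE INTO runs VALUES (?, ?)", (key, json.dumps(doc)))
    conn.commit()
    return key


def metric_by_run(conn, name):
    """Map run key -> (params, value) for every run that recorded metric `name`."""
    found = {}
    for key, raw in conn.execute("SELECT key, doc FROM runs"):
        doc = json.loads(raw)
        if name in doc["metrics"]:
            found[key] = (doc["params"], doc["metrics"][name])
    return found


if __name__ == "__main__":
    store = open_store()
    # Assumes code and data live in git repositories at these (hypothetical) paths.
    code_version = git_commit(".")
    data_version = git_commit("data")
    record_run(store, code_version, data_version,
               {"model": "rf", "n_trees": 100}, {"auc": 0.81})
    for run_key, (params, value) in metric_by_run(store, "auc").items():
        print(run_key[:8], params, value)
```

Because each document carries the settings and metrics together with both commit hashes, comparing one metric across all models is a single query, and any run can be traced back to the exact code and data that produced it. The documents are plain JSON, so the same records could also be read from R.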

answered Mar 14 '13 at 18:38

Hannes S

This sounds quite intriguing. Do you know of any tutorials, example code, or other people who are going about it the same way? I really appreciate your help.

(Mar 15 '13 at 13:14) Yipster

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.