|
I have trained item-item similarities with several different models. For live sessions, I want to weight and blend the models based on their actual performance in the session. I have around 10 models to blend, and the number of items to blend against can vary from zero up to several thousand in a single session. I am looking for a blending model that: a) is very fast for live blending, and b) outputs data that allows an easy interpretation of which models perform well in a specific session, and in general. (I think that if the model outputs are normalized, the weights should serve as such a measure, but I'm not sure.) A link to a citable paper would be welcome; papers free of charge preferred.
|
Fit a linear regression using the outputs of the learners as features. For really fast performance, use an online algorithm such as Winnow or the stochastic gradient descent implemented in Vowpal Wabbit. For linear online algorithms it makes sense to use the learned weights as a measure of each model's importance.
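To make this concrete, here is a minimal sketch of online blending via plain SGD on squared error (a simple stand-in for Winnow's multiplicative updates or Vowpal Wabbit's solver). The ten-model setup, the learning rate, and the click/no-click label convention are illustrative assumptions, not anything from the question:

```python
# Minimal sketch: blend 10 base-model scores with an online linear model.
# Assumes each base model emits a normalized score for a candidate item.
import numpy as np

n_models = 10          # one weight per base model
weights = np.zeros(n_models)
lr = 0.1               # learning rate; tune for your traffic (assumption)

def blend(scores):
    """Blended score for one item; `scores` has one entry per model."""
    return float(np.dot(weights, scores))

def update(scores, label):
    """One SGD step on squared error after observing session feedback.

    `label` is the realized relevance, e.g. 1.0 = clicked, 0.0 = skipped.
    """
    global weights
    error = blend(scores) - label
    weights -= lr * error * scores

# Example: three of the ten models score an item, the user clicks it.
scores = np.array([0.8, 0.1, 0.3] + [0.0] * 7)
update(scores, label=1.0)
print(weights)  # larger weights = models whose scores predicted well
```

Because each update is just a dot product plus a vector step over 10 weights, it is cheap enough to run inside a live session, and the weight vector doubles as the per-session importance readout asked about in the question.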
|
For such a low-dimensional data set, a decision tree (à la C4.5) may be a good option, applied in a similar way to that described by Alexandre. While not as fast as gradient-based linear models, in this case the difference may not be noticeable, and decision trees reduce nicely to a set of rules, which is a great thing to present to one's boss when describing how a model works. What's more, it is easy to back out the cause of every classification: why instance x got predicted label y' and not label y''. This isn't as easy for linear models.
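For illustration, here is a hedged sketch using scikit-learn (whose trees are CART rather than C4.5, a close stand-in): fit a shallow tree over the per-model scores, print the resulting rule set, and walk the decision path of one instance. The synthetic data, tree depth, and feature names are assumptions:

```python
# Sketch: a small decision tree over per-model scores, plus the
# decision path explaining one instance. The data here is synthetic.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.random((200, 10))                              # 10 base-model scores per item
y = (0.7 * X[:, 0] + 0.3 * X[:, 3] > 0.5).astype(int)  # toy relevance labels

clf = DecisionTreeClassifier(max_depth=3).fit(X, y)

# The whole model as a human-readable rule set.
names = [f"model_{i}" for i in range(10)]
print(export_text(clf, feature_names=names))

# Why one specific instance got its predicted label: its decision path.
x = X[:1]
feat, thresh = clf.tree_.feature, clf.tree_.threshold
for node in clf.decision_path(x).indices:
    if feat[node] >= 0:  # internal node (leaves are marked with -2)
        op = "<=" if x[0, feat[node]] <= thresh[node] else ">"
        print(f"{names[feat[node]]} {op} {thresh[node]:.2f}")
```

The printed path is exactly the kind of per-classification explanation described above, and the rule dump gives the general, session-independent picture.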