I asked this over at CrossValidated and have six up-votes so far, but no answers, so thought I'd try over here...

Say I've got a predictive classification model based on a random forest (using the randomForest package in R). I'd like to set it up so that end-users can specify an item to generate a prediction for, and it'll output a classification likelihood. So far, no problem.

But it would be useful/cool to be able to output something like a variable importance graph, but for the specific item being predicted, not for the training set as a whole. Something like:

Item X is predicted to be a Dog (73% likely). Because:
Legs=4
Breath=bad
Fur=short
Food=nasty

You get the point. Is there a standard, or at least justifiable, way of extracting this information from a trained random forest?

Flipping each feature and dropping the m perturbed items through the n trees sounds expensive, and hard to generalize to continuous features without inspecting the trees. Counting the attributes tested at each decision node might be the right track, but should one count only the trees that vote with the majority, or all of them? Something else?
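As a concrete illustration of the "flip each feature and re-predict" idea above, here is a rough sketch of a per-item sensitivity measure. Note this is not from the thread and uses scikit-learn's `RandomForestClassifier` rather than R's randomForest; the `local_sensitivity` helper and the choice of the background mean as the "neutral" replacement value are my own assumptions, for illustration only.

```python
# Hypothetical sketch: per-item feature "importance" by replacing one
# feature at a time with a neutral value and watching the change in the
# predicted probability of the item's predicted class.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

def local_sensitivity(clf, x, X_background):
    """For each feature j, replace x[j] with the background mean and
    record the drop in the predicted probability of x's predicted class."""
    x = np.asarray(x, dtype=float)
    cls = int(clf.predict(x.reshape(1, -1))[0])
    base = clf.predict_proba(x.reshape(1, -1))[0, cls]
    deltas = []
    for j in range(len(x)):
        x_mod = x.copy()
        x_mod[j] = X_background[:, j].mean()  # "neutral" value for feature j
        p = clf.predict_proba(x_mod.reshape(1, -1))[0, cls]
        deltas.append(base - p)  # positive => feature pushed toward the class
    return cls, base, deltas

cls, base, deltas = local_sensitivity(clf, X[0], X)
```

The features with the largest positive deltas are the ones whose actual values most pushed this particular item toward its predicted class, which is roughly the "Because: Legs=4 ..." output sketched above. For a continuous feature, this sidesteps the "flipping" problem by substituting a reference value instead of enumerating splits.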

asked Apr 08 '11 at 10:52


Harlan Harris

I know this question is pretty old now but I need to solve the exact same problem. What did you end up doing here? Any pointers on your approach and its success would be greatly appreciated.

(Nov 27 '12 at 19:32) beejay

3 Answers:

This is a very interesting problem, and there are some classifier-independent approaches to it. I really like Baehrens et al How to explain individual classification decisions, but Štrumbelj et al Explaining individual classifications using game theory is also interesting. Unfortunately I don't know of any library that prepackages these techniques.

answered Apr 08 '11 at 11:12


Alexandre Passos ♦

The Baehrens et al. paper is indeed quite interesting. I'm not sure if it's the best approach for a DF, at least in my case (with hundreds of variables), but I definitely like their definitions. Thanks!

(Apr 08 '11 at 15:35) Harlan Harris

Here is a technique I use to perform EDA (exploratory data analysis) on the outputs of an ensemble of decision trees:

The total score is summed over individual trees. The tree score is the score at the leaf node that the example percolates down to.

So, sort the tree scores from largest to smallest. Then, in decreasing order of tree score, output the feature path from root to leaf.

This technique will show you which compound features contributed the highest weight to the total score.
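The technique above can be sketched as follows. This is my own reading of it, using scikit-learn (the answer names no library): score each tree by the value at the leaf this example lands in, sort the trees by that score, and print the root-to-leaf feature path for the top trees. The walk over `tree_.children_left` / `children_right` is scikit-learn-specific.

```python
# Sketch: for one example, rank the forest's trees by their leaf score
# and print the root-to-leaf split path of the highest-scoring trees.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
x = X[0].reshape(1, -1)
target = int(clf.predict(x)[0])

paths = []
for est in clf.estimators_:
    tree = est.tree_
    node, splits = 0, []
    while tree.children_left[node] != -1:  # -1 marks a leaf
        f, thr = tree.feature[node], tree.threshold[node]
        if x[0, f] <= thr:
            splits.append(f"x[{f}] <= {thr:.2f}")
            node = tree.children_left[node]
        else:
            splits.append(f"x[{f}] > {thr:.2f}")
            node = tree.children_right[node]
    counts = tree.value[node][0]
    score = counts[target] / counts.sum()  # leaf's vote for the predicted class
    paths.append((score, splits))

paths.sort(key=lambda p: p[0], reverse=True)  # strongest trees first
for score, splits in paths[:3]:
    print(f"{score:.2f}: " + " AND ".join(splits))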

answered Apr 08 '11 at 19:50


Joseph Turian ♦♦

@JosephTurian, I am not sure I understand your comments. I would greatly appreciate if you can explain with an example.

(Nov 27 '12 at 19:28) beejay

For model exploration of random forests, I recommend the original paper by Breiman and Cutler. Specifically, look at the prototypes.

answered Dec 05 '12 at 17:21


Taylor Brown


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.