I had an idea this morning, and I'm interested in exploring how tractable and useful it might be. Random subspace methods are ensemble classifiers that train each individual classifier on a random subset of d features (selected without replacement) out of the D features in the data set. This kind of feature subsampling works extremely well in random forests.
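To make the feature-subset step concrete, here is a minimal sketch using NumPy. The function name `sample_feature_subsets` and the sizes (D = 20, d = 5, three ensemble members) are made up for illustration; each member would then be trained only on the columns in its subset.

```python
import numpy as np

def sample_feature_subsets(n_features, subset_size, n_members, seed=0):
    """Draw one feature-index subset per ensemble member.

    Within a subset, indices are drawn without replacement, so no
    feature appears twice for the same member. Different members may
    still share features.
    """
    rng = np.random.default_rng(seed)
    return [
        rng.choice(n_features, size=subset_size, replace=False)
        for _ in range(n_members)
    ]

# Hypothetical sizes: D = 20 features in total, d = 5 per classifier.
subsets = sample_feature_subsets(n_features=20, subset_size=5, n_members=3)
for idx in subsets:
    print(sorted(idx.tolist()))
```

Given a data matrix `X`, the training data for a member with subset `idx` would be `X[:, idx]`.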

Do any of you think it would make sense to run PCA as a pre-processing step to project the high-dimensional input into a low-dimensional space, and then select the random feature subsets from that projection?
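A rough sketch of what I have in mind, using only NumPy. Everything here is an assumption for illustration: the synthetic two-class data, the helper names, the sizes, and the nearest-centroid base learner, which is only a small stand-in for the decision trees a forest would use.

```python
import numpy as np

def pca_project(X, n_components):
    """Center X and project it onto its top principal directions (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:n_components]
    return (X - mean) @ components.T, mean, components

class NearestCentroid:
    """Tiny stand-in base learner (assumption: a tree would be used in practice)."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Squared distance from every sample to every class centroid.
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]

def fit_pca_subspace_ensemble(X, y, n_components, subset_size, n_members, seed=0):
    """PCA once, then train each member on a random subset of components."""
    rng = np.random.default_rng(seed)
    Z, mean, components = pca_project(X, n_components)
    members = []
    for _ in range(n_members):
        idx = rng.choice(n_components, size=subset_size, replace=False)
        members.append((idx, NearestCentroid().fit(Z[:, idx], y)))
    return mean, components, members

def predict_ensemble(model, X):
    """Majority vote over members; assumes non-negative integer class labels."""
    mean, components, members = model
    Z = (X - mean) @ components.T
    votes = np.stack([clf.predict(Z[:, idx]) for idx, clf in members])
    return np.array([np.bincount(col).argmax() for col in votes.T])

# Synthetic example: two Gaussian classes in 50 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(100, 50)), rng.normal(size=(100, 50)) + 1.0])
y = np.repeat([0, 1], 100)

model = fit_pca_subspace_ensemble(X, y, n_components=10, subset_size=5, n_members=15)
pred = predict_ensemble(model, X)
print("training accuracy:", (pred == y).mean())
```

One thing this makes visible: PCA is unsupervised, so a member whose random subset misses the few components that separate the classes contributes little, which may matter when choosing the subset size.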

This would make the most sense for high-dimensional inputs, where it might not be efficient to construct even a medium-sized (~100 trees) random forest. It might work in image classification or motion tracking applications.

I'm just curious whether anyone has additional information about combining these techniques, or knows of any papers that discuss it.

asked Nov 16 '11 at 10:21

kmore


One Answer:

It's been done before. They are generally called "Rotation Forests." Here is one paper about them: http://pisuerga.inf.ubu.es/juanjo/bib2html/e-documents/publs/mcs07rotation.pdf

Supposedly they perform well, but every time I read a paper about a new ML algorithm, the paper claims it's better than all of the other methods, so I have some healthy skepticism.

answered Nov 16 '11 at 17:34

Rob Renaud

edited Nov 16 '11 at 17:35

It's even better when they cherry-pick the UCI data sets that reveal statistically significant improvements at the p < 0.05 level. Thanks for the paper; I'll give it a read tomorrow. It sounds like a pretty cool algorithm nevertheless.

(Nov 18 '11 at 04:59) kmore

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.