|
Where can I find an open-source implementation of ensembles of decision trees, preferably boosted decision trees? I have seen code for single decision trees, and I have seen code for random forests, but where can I find an off-the-shelf implementation of (boosted) ensembles of decision trees? [Note: My thesis parser minimizes the l1-regularized logistic loss by boosting ensembles of decision trees, and learns over the regularization path. The trees are quite sparse, given the l1 regularization. However, that code is coupled with the parser and does not work out of the box.]
|
I think OpenCV's implementation is pretty fast and usable (there are bindings for many languages), although not as intuitive to understand as I would like.
|
The scikit-learn project just received two pull requests: one on random forests and the other on boosted decision trees. They are currently under review, and those contributions will probably get merged soon. Anyone knowledgeable about the algorithms can help review and comment on those contributions (even if not yet a scikit-learn contributor). Can you link to the specific pull requests?
(Apr 08 '11 at 21:04)
Joseph Turian ♦♦
I rephrased my answer to make it explicit that the links point to the pull requests (and not Wikipedia articles, as you might have thought).
(Apr 08 '11 at 21:08)
ogrisel
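For readers landing here later: both contributions did eventually ship in `sklearn.ensemble` as `RandomForestClassifier` and `GradientBoostingClassifier`. A minimal usage sketch against a released scikit-learn (the API under review at the time may differ):

```python
# Sketch of the scikit-learn ensemble API as released in sklearn.ensemble;
# the pull requests discussed above may have used different names/defaults.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem for illustration.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Random forest: bagged trees with random feature subsets at each split.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

# Gradient boosting: shallow trees fit sequentially to the loss gradient.
gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                max_depth=3, random_state=0)
gb.fit(X_train, y_train)

print(rf.score(X_test, y_test), gb.score(X_test, y_test))
```

Both estimators follow the standard `fit`/`predict`/`score` interface, so they drop into the same pipelines as any other scikit-learn model.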
|
|
To state the obvious, WEKA provides a Java implementation. Also, if it ever becomes public, the Avatar project implements them in C/C++; you may be able to ask one of the maintainers for the code. For a quick test of the effectiveness of Weka's random forest implementation, I suggest RapidMiner, which includes it.
(Apr 09 '11 at 16:07)
Lucian Sasu
|
|
If you are still looking, there are a number of tree ensemble methods available in R, particularly the randomForest, gbm, and mboost packages. gbm and mboost implement stochastic gradient boosting, and gbm is the most scalable of these. They are all available on CRAN; you can search for more on crantastic.org.
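The stochastic gradient boosting that gbm implements (Friedman's variant, where each tree is fit on a random subsample of the training rows) is also available outside R. A hedged Python sketch using scikit-learn's `GradientBoostingRegressor`, whose `subsample` parameter enables the same idea — this illustrates the technique, not the gbm API itself:

```python
# Stochastic gradient boosting: with subsample < 1.0, each boosting
# iteration fits its tree on a random fraction of the training rows,
# analogous to gbm's bag.fraction in R.
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor

# Standard synthetic regression benchmark for illustration.
X, y = make_friedman1(n_samples=400, random_state=0)

model = GradientBoostingRegressor(
    n_estimators=200,
    learning_rate=0.05,
    subsample=0.5,   # stochastic: each tree sees half the rows
    max_depth=3,
    random_state=0,
)
model.fit(X, y)

print(model.score(X, y))  # R^2 on the training data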