Say I have a dataset with a numerical feature that can only take integer values, for example the number of links in a document. My goal is to classify the dataset, and I'm using a Random Forest for that. Does it make sense to bin that feature before feeding the dataset to the Random Forest, to avoid overfitting?

asked Apr 28 '13 at 04:56

r0u1i

edited Apr 28 '13 at 04:56


One Answer:

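The RF effectively does the binning internally: when it builds each tree, every split picks a threshold on a feature according to a split criterion (such as impurity reduction), so the cut points are chosen from the data rather than fixed in advance. Why not just let the RF handle that?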
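
To make the "binning internally" point concrete, here is a minimal sketch of the threshold search that a single tree node performs on one integer feature. It is a simplified illustration, not how any particular library implements it: the helper names `gini` and `best_threshold`, the toy link counts, and the "ham"/"spam" labels are all made up for this example, and a real forest additionally randomizes the samples and features each tree sees.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(values, labels):
    """Return (threshold, weighted impurity) for the best split `value <= threshold`."""
    n = len(values)
    distinct = sorted(set(values))
    best = (None, gini(labels))  # start from the impurity of the unsplit node
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: number of links per document and an invented class label.
links = [0, 1, 1, 2, 3, 7, 8, 12, 15, 20]
labels = ["ham"] * 5 + ["spam"] * 5

threshold, impurity = best_threshold(links, labels)
print(threshold, impurity)  # prints: 5.0 0.0
```

The node ends up with two "bins" (at most 5 links, more than 5 links) without anyone choosing the edge by hand. Binning the feature beforehand would only restrict which thresholds the tree is allowed to consider.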

answered May 01 '13 at 01:33

digdug


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.