|
Is it always better to build a random forest as an ensemble of trees, each on a subset of the data and a subset of the attributes, than to build a single decision tree with all the data and all the attributes? Do random forests always produce better results than a single decision tree?
|
The premise of the question is not quite correct. Usually a random forest is NOT constructed "on a subset of data and a subset of attributes": the forest as a whole is constructed using ALL the data and ALL the attributes. Any ONE particular tree in the forest is built from a subset of the data (a bootstrap sample) and considers only a random subset of the attributes at each split, but those subsets differ from tree to tree. In my limited experience so far, it is always better to use a random forest (trained on all the data and all the attributes) than a single tree (again trained on all the data and attributes). The case where a single tree is better is when you need to understand the classifier rather than treat it as a black box.
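Here is a minimal sketch (assuming scikit-learn is available; the dataset and hyperparameters are illustrative, not from the question) showing that both models are fit on the same full training set, with the per-tree row/attribute subsampling happening inside the forest:

```python
# Compare a single decision tree against a random forest fit on the SAME
# full training data. The per-tree subsampling is internal to the forest.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Single tree: sees all the training rows and all the attributes at once.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Random forest: ALSO fit on all the rows and all the attributes, but each
# of its trees internally uses a bootstrap sample of the rows and a random
# subset of the attributes at every split.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(
    X_train, y_train
)

print("single tree accuracy  :", tree.score(X_test, y_test))
print("random forest accuracy:", forest.score(X_test, y_test))
```

On most runs of a setup like this the forest's test accuracy matches or beats the single tree's, which is the typical pattern the answer above describes, not a guarantee.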