I have a question about how AdaBoost combines the weak classifiers from each iteration into a strong classifier. I use the C4.5 algorithm as the weak learner, and each iteration produces a different decision tree and an alpha value. How can I combine those models into one strong classifier? The algorithm says that AdaBoost combines them using the formula alpha*hypothesis. How do I combine them with that formula?

asked May 22 '12 at 00:36

tiopramayudi

edited May 29 '12 at 15:14

Joseph Turian ♦♦

You might want to take a look at this: http://users.rowan.edu/~polikar/RESEARCH/index_files/Ensemble.html

(May 22 '12 at 00:56) Dov

I have the same problem. Can someone answer that question? I don't understand the explanation in that link.

(May 28 '12 at 01:47) Yogi Kurnia

Try these slides on AdaBoost: cseweb.ucsd.edu/classes/fa01/cse291/AdaBoost.pdf

(May 29 '12 at 13:58) Pardis

First of all, how are you reweighting the examples at each step? Are you doing that correctly?

(May 29 '12 at 15:14) Joseph Turian ♦♦

2 Answers:

You should have a set of alphas from each round of Boosting: {a1,a2,a3,...,an}. You also should have a set of hypotheses from the rounds: {h1,h2,h3,...,hn}.

Your prediction should be:

H(x) = a1*h1(x) + a2*h2(x) + a3*h3(x) + ... + an*hn(x)
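
For example, a rough Python sketch of that combination (assuming alphas and hypotheses are the lists you collected over the boosting rounds, and each hypothesis returns -1 or +1; the function names are just placeholders):

    def boosted_score(x, hypotheses, alphas):
        # H(x) = a1*h1(x) + a2*h2(x) + ... + an*hn(x)
        return sum(a * h(x) for a, h in zip(alphas, hypotheses))

    def boosted_label(x, hypotheses, alphas):
        # For classification, the predicted class is the sign of the weighted vote.
        return 1 if boosted_score(x, hypotheses, alphas) >= 0 else -1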

answered May 29 '12 at 15:15

image_doctor

edited May 29 '12 at 15:16

Suppose you're doing binary classification. Let fi(x) be the raw output (for input x) of each decision tree you've trained (this is typically a real-valued output). You need to convert fi(x) into a binary predictor hi(x) that outputs either -1 or +1.

Afterwards, the combined strong predictor is
h(x) = sign( alpha1*h1(x) + alpha2*h2(x) + ... + alphaT*hT(x) )

You might be asking how to generate the binary predictors hi(x) corresponding to each individual tree fi(x). Some decision trees output values between 0 and 1 corresponding to the estimated probability that x belongs to the positive class. In that case, all you need to do is:
hi(x) = sign( fi(x) - 1/2 )
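
In rough Python (the names positive_probability, trees, and alphas are placeholders for whatever your C4.5 implementation exposes):

    def weak_label(tree, x):
        # hi(x) = sign( fi(x) - 1/2 ): map the tree's estimated probability of
        # the positive class (a value in [0, 1]) to a -1/+1 label.
        return 1 if positive_probability(tree, x) >= 0.5 else -1

    def strong_predict(x, trees, alphas):
        # h(x) = sign( alpha1*h1(x) + ... + alphaT*hT(x) )
        score = sum(a * weak_label(t, x) for a, t in zip(alphas, trees))
        return 1 if score >= 0 else -1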

answered May 29 '12 at 15:25

Yisong Yue

So h(x) is the strong classifier, isn't it? How can it produce a new decision tree that represents the strong classifier? Does h(x) change the dataset, especially the class labels, to build a new strong decision tree?

(May 30 '12 at 00:02) tiopramayudi

That's right, h(x) is the strong classifier. The strong classifier is not a single decision tree, but a combination of the outputs of multiple decision trees. This strong classifier can, in principle, be represented as a single decision tree, but it's not clear to me how to efficiently compute that single decision tree. You can do it with exhaustive search over the input space, but that's hardly efficient.

I'm not sure what you mean by "change the dataset to make a new strong decision tree". The dataset change in AdaBoost is a reweighting of the datapoints when learning each individual decision tree. I don't know how to efficiently convert h(x) into a single decision tree, so maybe this part of your question is moot?

(May 30 '12 at 00:34) Yisong Yue

You're right, my question is actually how to build a decision tree from h(x). You said I can do an exhaustive search; with that method I can convert h(x) into a decision tree, can't I? Thank you.

(May 30 '12 at 01:20) tiopramayudi

More generally, I think what you're interested in is model compression:
http://www.niculescu-mizil.org/papers/rtpp364-bucila.rev2.pdf

(May 30 '12 at 22:04) Yisong Yue

Thank you. I have one more question about AdaBoost. Is the prediction from AdaBoost always better than that of a single classifier? If it isn't, what factors make it worse?

(Jun 01 '12 at 00:02) tiopramayudi

AdaBoost can sometimes overfit the training set. Usually, people halt the AdaBoost procedure using a validation set: when the performance of the strong classifier stops improving on a held-out validation set not used for training, you stop adding rounds.
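
A rough sketch of that stopping rule (the helpers fit_weak_learner, update_weights, and validation_error are hypothetical placeholders for your own code):

    import math

    def boost_with_early_stopping(train, val, max_rounds=200, patience=10):
        weights = [1.0 / len(train)] * len(train)   # start with uniform example weights
        trees, alphas = [], []
        best_err, since_best = float('inf'), 0
        for _ in range(max_rounds):
            tree, weighted_err = fit_weak_learner(train, weights)   # placeholder
            alpha = 0.5 * math.log((1.0 - weighted_err) / weighted_err)
            trees.append(tree)
            alphas.append(alpha)
            weights = update_weights(weights, train, tree, alpha)   # placeholder
            val_err = validation_error(trees, alphas, val)          # placeholder
            if val_err < best_err:
                best_err, since_best = val_err, 0
            else:
                since_best += 1
                if since_best >= patience:
                    break   # validation error has stopped improving
        return trees, alphas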

(Jun 01 '12 at 16:23) Yisong Yue

There are, however, other boosting algorithms that are known to overfit less than AdaBoost.

(Jun 02 '12 at 18:50) Alexandre Passos ♦

I've implemented the AdaBoost algorithm, but the performance isn't improving.

I just realized that you wrote "fi(x) be the raw output (for input x) of each decision tree you've trained (this is typically a real-valued output)". For a decision tree, I think the raw output is typically binomial, not real-valued, so I just transformed that binomial output into numeric values (-1, 1). What do you think about that?

(Jun 04 '12 at 13:00) tiopramayudi

If the raw output is 0 or 1, then yes, just convert it to -1 and 1. I'm not sure why AdaBoost isn't working for you. Maybe the individual weak classifiers are too strong (that encourages overfitting)?
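
For instance (to_pm1 and build_c45_tree are hypothetical names; capping the depth is just one way to keep the weak learners weak, and how you do it depends on your C4.5 implementation):

    def to_pm1(label01):
        # Map a 0/1 tree output to the -1/+1 labels AdaBoost expects.
        return 1 if label01 == 1 else -1

    # One way to weaken each tree is to cap its depth, e.g. depth-1 stumps:
    # tree = build_c45_tree(train, weights, max_depth=1)   # hypothetical call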

You could also try some alternatives, like random forests: http://www.cs.cornell.edu/~nk/fest/

(Jun 05 '12 at 17:46) Yisong Yue