Is classification with an infinite number of class labels really equivalent to regression?

At least for one of the split criteria used to build a regression tree, when splitting a nominal attribute with many categorical values (please correct me if I am wrong; this is the understanding I got from examining an implementation online), Breiman sorts the categorical values (which have no inherent ordering) by the sum of the binned target values. Once the ordering is established this way, the search for the split point is the same as the search used for continuous attributes.
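To make the procedure described above concrete, here is a minimal sketch of how I understand it. Everything in it is my own assumption rather than code from any particular implementation: the function name `best_categorical_split` is hypothetical, the split criterion is squared error, and the categories are ordered by their per-category mean target (not the sum mentioned above), since the mean is the ordering I know to be valid for squared error.

```python
from collections import defaultdict


def best_categorical_split(categories, targets):
    """Order categories by mean target, then scan cut points as for a continuous attribute."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for c, y in zip(categories, targets):
        sums[c] += y
        counts[c] += 1

    # Impose an ordering on the (unordered) category labels: ascending mean target.
    ordered = sorted(sums, key=lambda c: sums[c] / counts[c])

    n = len(targets)
    total_sum = sum(targets)
    best_left, best_score = None, float("-inf")
    left_sum, left_n = 0.0, 0

    # With k categories there are only k - 1 cut points along the ordering.
    for i, c in enumerate(ordered[:-1]):
        left_sum += sums[c]
        left_n += counts[c]
        right_sum = total_sum - left_sum
        right_n = n - left_n
        # Maximizing this quantity is equivalent to minimizing the summed
        # squared error of the two child nodes.
        score = left_sum ** 2 / left_n + right_sum ** 2 / right_n
        if score > best_score:
            best_score = score
            best_left = set(ordered[: i + 1])

    return ordered, best_left


# Toy data (made up for illustration): category "b" has much larger targets.
cats = ["a", "b", "c", "a", "b", "c"]
ys = [1.0, 10.0, 2.0, 1.2, 11.0, 2.2]
ordered, left = best_categorical_split(cats, ys)
print(ordered)  # ['a', 'c', 'b']
print(left == {"a", "c"})  # True
```

The point of the ordering is that only k - 1 cut points need to be checked for k categories, instead of every possible subset of categories.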

I am wondering whether the same sort of behavior occurs for the class labels themselves as their number goes to infinity. Maybe there is some threshold for using regression vs. classification.

asked Oct 08 '13 at 15:54

mlguru


2 Answers:

That's a good question.

For the particular case of multiclass logistic regression versus linear regression, the bottom line is whether, for a sufficiently large number of classes, the two optimization objectives take the same value.

I do not know of a proof of this, and I'm not sure one exists. It would basically mean taking the number of classes in the softmax function to infinity and showing that the resulting equations converge to the normal equations of linear regression.
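For reference, the two objects being compared can be written in their standard textbook forms. The notation is my own assumption and not taken from any specific source: $x_i$ are the inputs, $y_i$ the targets (class indices in the first line, real values in the second), $w_k$ the weight vector of class $k$ out of $K$ classes, and $X$ the design matrix whose rows are the $x_i$. This only states the two sides; it is not a proof that one converges to the other.

```latex
% Multiclass (softmax) logistic regression: negative log-likelihood to minimize
L(w_1, \dots, w_K) = -\sum_{i=1}^{n} \log
  \frac{\exp(w_{y_i}^{\top} x_i)}{\sum_{k=1}^{K} \exp(w_k^{\top} x_i)}

% Linear regression: the least-squares solution w satisfies the normal equations
X^{\top} X \, w = X^{\top} y
```

The question then becomes what happens to the minimizer of the first expression as $K \to \infty$.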

answered Oct 09 '13 at 12:07

Leon Palafox ♦

edited Oct 09 '13 at 12:07

Actually there is a relationship between class size and classifier performance. This has been studied a lot for the SVM learner. The gist goes like this:

SVM is good on small classes, but is slower than some other learners that perform as well.

SVM is not good on medium [1000:10000] class sizes.

SVM is good at large class sizes [10000+] and has higher precision than many other learners.

So while this is not logistic regression per se, the kernel function approaches the problem very similarly, and both use the gradient descent optimization heuristic, so I would imagine that you would find similar results.

(Also this might be publishable? See what's out there already)

answered Oct 12 '13 at 13:34

rakirk

