In the examples I've seen, the data points always look somewhat like http://abel.ee.ucla.edu/cvxopt/examples/book/figures/fig-7-1.png. What if I have a binary classification problem where the data points are more scattered? Will the logistic curve change shape to account for that, or does it always have this S shape? Also, what if what I need ends up looking like an inverted S?

asked Jan 27 '11 at 12:38

Viktor Simjanoski


2 Answers:

The data points don't look like that figure at all: that figure is a plot of the (pre-bias) score assigned by the classifier (x axis) versus the actual class of each point (y axis). The S curve is drawn only to show the predicted probability of the positive class for each pre-bias score. A logistic regression classifier is essentially linear: each example is represented by D features f_1, ..., f_D, and each feature f_i is assigned a positive or negative weight w_i. The score of an example is then s(x) = sum_i f_i w_i, and to compute the probability of the positive class you learn a bias term b and predict P(class=1) = 1/(1+exp(b-s(x))). Hence the S curve is computed from all the features together, and it can be flipped and stretched at will by changing the weights of specific features. Since the features don't have to be normally distributed, even looking at a single dimension will probably not give you an S curve like the one in the picture.

In other words, logistic regression defines a hyperplane classifier, and the S curve is involved only in estimating the membership probabilities from how far each point is from the hyperplane. Most datasets don't look like that figure unless you plot this "distance from the plane" value against the class of the points.
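To make the formula above concrete, here is a minimal sketch in plain Python. The weights, bias, and feature values are made-up numbers chosen for illustration, not learned from any data, and the function names are hypothetical. It uses the same sign convention as above, P(class=1) = 1/(1+exp(b-s(x))).

```python
import math

def score(features, weights):
    """Linear score s(x) = sum_i f_i * w_i."""
    return sum(f * w for f, w in zip(features, weights))

def prob_positive(features, weights, bias):
    """P(class=1) = 1 / (1 + exp(b - s(x))), the convention used in the answer."""
    return 1.0 / (1.0 + math.exp(bias - score(features, weights)))

# Hypothetical model with two features (illustrative values only).
weights = [2.0, -1.0]
bias = 0.5

# A point whose score equals the bias lies on the hyperplane: probability 0.5.
on_plane = [0.5, 0.5]          # s = 2*0.5 - 1*0.5 = 0.5
p_on_plane = prob_positive(on_plane, weights, bias)

# Points on either side of the hyperplane move toward 1 or 0.
p_far_positive = prob_positive([2.0, 0.0], weights, bias)
p_far_negative = prob_positive([0.0, 2.0], weights, bias)

# Negating every weight and the bias flips the curve: p becomes 1 - p.
flipped_weights = [-w for w in weights]
p_flipped = prob_positive([2.0, 0.0], flipped_weights, -bias)

print(p_on_plane, p_far_positive, p_far_negative, p_flipped)
```

The only place the S shape appears is in the mapping from the scalar score to a probability. The input points themselves can be scattered in any way in feature space.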

answered Jan 27 '11 at 16:35

Alexandre Passos ♦

Thanks, I got it.

(Jan 28 '11 at 03:06) Viktor Simjanoski

To some extent, the logistic curve can be stretched (by adjusting the coefficients) or shifted (by adjusting the constant). Still, some situations call for other functional forms, and alternatives such as extreme value regression exist.

If the graph runs the other way (descending instead of ascending), then the coefficient for that dimension has the opposite sign.
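As a rough illustration of stretching, shifting, and flipping, the sketch below evaluates a one-dimensional logistic curve for a few hand-picked coefficient and constant values. It assumes the common parameterization p = 1/(1+exp(-(coef*x + const))), and the function name and numbers are hypothetical.

```python
import math

def logistic_curve(xs, coef, const):
    """Evaluate p = 1 / (1 + exp(-(coef * x + const))) at each x."""
    return [1.0 / (1.0 + math.exp(-(coef * x + const))) for x in xs]

xs = [-4.0, -2.0, 0.0, 2.0, 4.0]

ascending = logistic_curve(xs, coef=1.0, const=0.0)    # ordinary S shape
descending = logistic_curve(xs, coef=-1.0, const=0.0)  # negative sign: inverted S
steeper = logistic_curve(xs, coef=3.0, const=0.0)      # larger coefficient: sharper transition
shifted = logistic_curve(xs, coef=1.0, const=-2.0)     # constant moves the midpoint to x = 2

for name, curve in [("ascending", ascending), ("descending", descending),
                    ("steeper", steeper), ("shifted", shifted)]:
    print(name, [round(p, 3) for p in curve])
```

Whatever the values, the curve remains monotone in x, which is why a relationship that rises and then falls needs a different model or transformed features.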

answered Jan 27 '11 at 15:58

Will Dwinnell


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.