A simple question about terminology. In many types of classifiers (logistic regression, SVMs, neural networks) one classifies by first computing a "soft" real-valued function f(x). In a binary setting, f(x) would be very high if there is a high probability that y=1 and very low if there is a high probability that y=-1. My question is: is there a name for f(x)? In the SVM setting, it would be reasonable to call it a "margin", but that doesn't seem to fit other types of classifiers.

asked Nov 01 '10 at 14:37

John Southland

I'd just like to say thanks for asking for the term, rather than using your own. Too much of machine learning suffers from people re-inventing terms for common things, which makes it a difficult area to search in.

(Nov 01 '10 at 20:58) Robert Layton

3 Answers:

In SVMs this is called the margin. In logistic regression this is called the log-odds of the positive class. Usually, if you say "score" or "margin" people will understand, I think.

answered Nov 01 '10 at 19:03

Alexandre Passos ♦

"Score" may be ambiguous with the output of the classifier.

(Nov 01 '10 at 19:49) rm999

@Ravi Moody: how exactly? Isn't he talking precisely about the output of the classifier before thresholding? In structured learning this is usually referred to as a score (or probability, or energy).

(Nov 01 '10 at 19:50) Alexandre Passos ♦

In my field (predictive analytics), "scores" are the output of a classifier; e.g., credit scores attempt to predict credit default. Sounds like it has become a fairly overloaded term, sadly.

(Nov 01 '10 at 21:52) rm999

Alright, "margin" is the least-offensive option, I guess!

(Nov 03 '10 at 13:30) John Southland

I think the word "score" is the most immediately understandable as the output of any continuous valued predictor. You want positives to score high and negatives to score low.

Common credit scores like FICO are very close to being linear in log-odds. The use of "score" for the equivalent quantity in logistic regression (original poster's f(x)) is justifiable as the same, not just analogous, usage.

In practice credit scores are calibrated to a user-friendly scale but that's unimportant here.

If a score is calibrated to true log-odds, I call it "the logodds". Why make life difficult?

I prefer "score" to "margin", as I think it's more generic. If I heard "SVM score", I would know what it means, but if I heard "Naive Bayes or regression margin" it wouldn't be immediately obvious what that meant.

answered Nov 04 '10 at 18:44

Matt

edited Nov 04 '10 at 18:48
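
As a rough illustration of the answer above, the following sketch converts a probability to log-odds and then applies an affine rescaling to a "user-friendly" scale. The function names and the `offset` and `points_per_unit` values are hypothetical, chosen only for illustration; they are not the scale of FICO or any real credit score. The point is only that an affine map with a positive slope preserves the ordering of the scores.

```python
import math


def log_odds(p: float) -> float:
    """Log-odds (logit) of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def rescale(score: float, offset: float = 600.0, points_per_unit: float = 30.0) -> float:
    """Affine map from log-odds to a friendlier scale.

    offset and points_per_unit are made-up illustration values.
    """
    return offset + points_per_unit * score


probabilities = [0.1, 0.5, 0.9]
raw_scores = [log_odds(p) for p in probabilities]
friendly_scores = [rescale(s) for s in raw_scores]

for p, raw, friendly in zip(probabilities, raw_scores, friendly_scores):
    print(f"p={p:.1f}  log-odds={raw:+.3f}  rescaled={friendly:.1f}")
```

Because the rescaling is monotone, ranking examples by the rescaled value gives the same order as ranking them by log-odds.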

I don't see anything wrong with calling it a "classifier". Classifiers do not have to be binary; they can be real-valued, where the value gives a probability or confidence of the binary prediction.

answered Nov 01 '10 at 15:03

rm999

Maybe I should clarify. I am trying to find a term to refer to the number. For example, in a logistic regression, if we have f(x)=0, I could say there is a 50% probability of y=1 and a 50% probability of y=-1. However, that is a statement not about the quantity f(x), but about the quantity 1/(1+exp(-f(x))). I would like to somehow directly discuss f(x), so I needn't commit to this probabilistic interpretation.

(Nov 01 '10 at 15:43) John Southland
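
The distinction drawn in the comment above can be sketched in a few lines, assuming the logistic model with labels in {-1, +1}. The function names are illustrative, not taken from any library. The raw number f(x) is one quantity; the probability 1/(1+exp(-f(x))) and the thresholded label are separate quantities derived from it.

```python
import math


def probability_from_score(score: float) -> float:
    """Map a raw score f(x) to P(y=1 | x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-score))


def label_from_score(score: float) -> int:
    """Threshold the raw score at 0, using the y in {-1, +1} convention."""
    return 1 if score >= 0 else -1


for score in (-2.0, 0.0, 2.0):
    p = probability_from_score(score)
    print(f"f(x)={score:+.1f}  P(y=1)={p:.3f}  label={label_from_score(score):+d}")
```

One can talk about, compare, or rank values of f(x) using only the thresholding function, without ever calling the probability function, which is the sense in which f(x) does not commit to the probabilistic interpretation.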

OK I see what you mean. You are trying to separate out the intermediate step, e.g. in logistic regression you want the raw value of the linear function before passing it through the logistic activation function.

I've never heard of a general term for this. I don't think there should be one, because that value has a different meaning and purpose in different classifiers. In general I consider the final step applied to the intermediate value an integral part of the classifier, i.e. not something that should be treated as a modular attachment to it. In generalized linear models I believe it is called the "linear predictor".

(Nov 01 '10 at 16:19) rm999

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.