
I am working on a large multiclass SVM classification system using the one-against-all approach. It seems to be working fine according to 5-fold cross-validation and independent-set testing. However, in these tests I have simply been taking the class with the highest decision value for each input as the prediction. I worry there could be cases where the second-highest value is close to the highest. For example, if the highest value is 1.6515952 and the next highest is -0.99935411, the classification looks reliable. But what if the first and second highest values are 0.59976528 and -0.09927958, or even 0.59976528 and 0.09927958? There could be a point where the predictions are too close to be trusted. So far I am planning to normalize all the decision values for an input and only accept the prediction if the difference of the normalized values is greater than some threshold. But I have no idea what that threshold should be, or whether this is even a reasonable problem to worry about.
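The normalized-gap plan described above could be sketched roughly like this (the `min_gap` value of 0.5 is a placeholder, not a recommendation; it would have to be tuned on held-out data):

```python
import numpy as np

def confident_prediction(decision_values, min_gap=0.5):
    """Return the index of the top one-vs-all class, or None if the
    top two normalized decision values are too close to trust.
    min_gap is a hypothetical threshold that would need tuning."""
    vals = np.asarray(decision_values, dtype=float)
    span = vals.max() - vals.min()
    if span == 0:
        return None  # all classes tied; nothing to choose
    # Rescale to [0, 1] so the gap is comparable across inputs.
    norm = (vals - vals.min()) / span
    top2 = np.argsort(norm)[-2:]            # indices of the two largest
    second, first = norm[top2[0]], norm[top2[1]]
    return int(top2[1]) if (first - second) >= min_gap else None

# The examples from the question: a clear case and a borderline one.
print(confident_prediction([1.6515952, -0.99935411, -1.2]))   # -> 0 (accepted)
print(confident_prediction([0.59976528, 0.09927958, -1.0]))   # -> None (rejected)
```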

asked Jan 17 '13 at 16:20

Tyler


One Answer:

For multiclass SVM I don't know of any such probabilistic guarantees, but if you had something like multiclass logistic regression (that is, a model that predicts calibrated probabilities), then you could set a threshold and accept only answers that are correct at least X% of the time, simply by thresholding the probabilities. There are techniques for producing calibrated probabilities from support vector machines (libsvm implements one), which would then allow you to do the same.

answered Jan 23 '13 at 22:03

Alexandre Passos ♦


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.