|
I was thinking about minimizing false positives (FP) in binary classification algorithms. In most methods, the only way to reduce FP is to set a high classification threshold. For example, in KNN you can find the nearest instance to X labeled 0 and the nearest instance labeled 1, and output 1 only when X is five times closer to the instance labeled 1 than to the instance labeled 0. Let's take another example: SVM. Here you can also set a threshold. A traditional SVM aims to minimize generalization error by maximizing the margin, i.e. the minimal distance between the training points and the separating hyperplane. But what if I want to minimize false positives instead of maximizing accuracy? I could, for example, use a genetic algorithm to place the hyperplane so that the number of FP is minimized (the number of FP on the training set would be the function that the genetic algorithm minimizes). What are the advantages and disadvantages of such a method? Is there another way to minimize FP in KNN and SVM? |
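To make the KNN rule above concrete, here is a minimal sketch of it. The function names and the `ratio` parameter are made up for illustration, and "five times closer" is read as `d_pos * 5 <= d_neg`, which is an assumption about what is meant.

```python
import math


def nearest_distance(x, points):
    """Euclidean distance from x to the closest point in `points`."""
    return min(math.dist(x, p) for p in points)


def ratio_rule_predict(x, negatives, positives, ratio=5.0):
    """Output 1 only if x is at least `ratio` times closer to its nearest
    positive example than to its nearest negative example (hypothetical rule)."""
    d_neg = nearest_distance(x, negatives)
    d_pos = nearest_distance(x, positives)
    return 1 if d_pos * ratio <= d_neg else 0


# Toy data: one negative at the origin, one positive at (10, 0).
negatives = [(0.0, 0.0)]
positives = [(10.0, 0.0)]

# (9, 0): d_pos = 1, d_neg = 9, so the positive is more than 5x closer.
print(ratio_rule_predict((9.0, 0.0), negatives, positives))
# (7, 0): plain 1-NN would say 1, but d_pos = 3 and d_neg = 7 fail the ratio test.
print(ratio_rule_predict((7.0, 0.0), negatives, positives))
```

With `ratio=1.0` this reduces to ordinary 1-NN; raising `ratio` trades recall for fewer false positives.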
|
Often it is possible to cope with a situation where you care more about FP than about recall by using a modified loss function. AFAIK this is possible in nearly all classification approaches. In the case of SVMs you can just set a class-specific weight. All popular implementations support this. While I feel that it is a more principled approach to put what you care about into your loss function, I don't really see what the problem with setting a higher threshold is. This is done very often to produce ROC curves and precision-recall curves - which are meant to show how good a classifier is at a given precision level. One last comment: I find it a little weird when you say you want to minimize false positives. That is trivially possible by always outputting the negative class. What you usually want is to say that a false positive has a much higher cost than a false negative, and you somehow specify this cost in your loss function - or look at your precision-recall curve to see how to set your parameters to get to the regime where you want to be. Cheers, Andy
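To illustrate the point about costs, here is a small sketch that picks a decision threshold by minimizing `cost_fp * FP + cost_fn * FN` over classifier scores. The helper names, the toy scores, and the cost values are all made up for illustration. This is a post-hoc threshold sweep, not the class-weighted SVM training itself.

```python
def confusion_at(scores, labels, threshold):
    """Return (tp, fp, fn, tn) when predicting 1 for score >= threshold."""
    tp = fp = fn = tn = 0
    for s, y in zip(scores, labels):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def pick_threshold(scores, labels, cost_fp, cost_fn):
    """Return the lowest candidate threshold with minimal total cost.

    Candidates are the observed scores plus +inf ("always predict negative").
    """
    candidates = sorted(set(scores)) + [float("inf")]
    best_cost, best_t = None, None
    for t in candidates:
        _, fp, fn, _ = confusion_at(scores, labels, t)
        cost = cost_fp * fp + cost_fn * fn
        if best_cost is None or cost < best_cost:
            best_cost, best_t = cost, t
    return best_t


# Toy scores (e.g. decision values from some classifier) and true labels.
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 1, 0, 1, 1]

print(pick_threshold(scores, labels, cost_fp=1, cost_fn=1))   # equal costs
print(pick_threshold(scores, labels, cost_fp=10, cost_fn=1))  # FP is 10x worse
# Always predicting negative gives zero FP but also zero true positives.
print(confusion_at(scores, labels, float("inf")))
```

Raising `cost_fp` pushes the chosen threshold up, which is the same trade-off you would read off a precision-recall curve.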
|
What are the pos:neg ratios in your train and test data?