Using any supervised classifier, we can usually get the probability that a data point x belongs to each class yi, i.e. P(yi|x). But in the case where x may belong to none of the known classes, how can we get the probability P(?|x), i.e. the probability that x belongs to an unknown class? Cross-posted at http://stats.stackexchange.com/questions/52580/how-can-we-get-the-confidence-or-ptobability-that-the-data-point-belongs-to-an
This question has remained open for a while, probably because it's tricky: there is no general-purpose way of doing this without making strong assumptions about the data.

As @Will Kurt said in the comments, one option is to represent each class you have with a one-class SVM, and say a point belongs to a new class if no one-class SVM triggers. The problem with one-class SVMs is that they are hard to train in an online fashion.

Another approach is to treat this as multi-label classification instead of multi-class classification: for each class, train a predictor for whether a point belongs to that class or not. If more than one predictor returns "true" you need to break the tie somehow, and unfortunately there is no easy way of doing so with good guarantees without making assumptions about how your classifiers work. If no predictor returns true, you probably have a point from a new class. You can train these predictors online with SGD, using the positive examples of every other class as negative examples for each class.

Finally, you can use a nonparametric Bayesian model with a generative model per class, updated as more points from that class are seen. The posterior on a new point may then prefer to assign it a new class. An example of this last approach is "Unified analysis of streaming news" by Ahmed et al.

Just a point of clarification: a "one-class" SVM is actually an unsupervised learning technique used for novelty detection (you essentially try to put all of the data into one class). You wouldn't train one for each class; you would train one on the entire set of known data (without labels) and use it to detect novel information.
(Mar 26 '13 at 13:37)
Will Kurt
Yes, I know the one-class SVM is a support-estimation technique. For his case, in which he also wants classification, you could train one one-class model per class. It does not reduce to training a multiclass model, though, and is probably worse.
(Mar 26 '13 at 14:47)
Alexandre Passos ♦
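The per-class one-class SVM idea discussed above can be sketched as follows. This is a toy sketch with scikit-learn; the synthetic data, kernel settings, and class names are all illustrative, not from the original post:

```python
# One one-class SVM per known class; a point that no model accepts is
# treated as coming from an unknown class ("?").
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
# Two known classes as tight clusters around (0, 0) and (5, 5)
X_a = rng.randn(100, 2)
X_b = rng.randn(100, 2) + 5.0

# Fit one one-class SVM per known class (parameters are illustrative)
models = {
    "a": OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05).fit(X_a),
    "b": OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05).fit(X_b),
}

def classify(x):
    """Return the label of a model that accepts x, or '?' if none does."""
    x = np.asarray(x).reshape(1, -1)
    for label, model in models.items():
        if model.predict(x)[0] == 1:  # +1 means inlier for this class
            return label
    return "?"  # no model triggered: probably a new class

print(classify([0.1, -0.2]))    # likely "a"
print(classify([5.2, 4.8]))     # likely "b"
print(classify([20.0, -20.0]))  # "?": far from both known classes
```

Note this gives a hard decision, not the probability P(?|x) the question asks for; `decision_function` scores could be calibrated into something probability-like, but that requires extra assumptions.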
@AlexandrePassos I tested the first method (i.e. "represent each class you have as a one-class SVM, and say a point belongs to a new class if no one-class SVM triggers"): ~95% of points from new classes are correctly detected as new, which is good, but more than ~50% of points from already existing classes (inliers) are incorrectly detected as new (which is embarrassing). Is there any way to fix this latter problem?
(Mar 28 '13 at 18:39)
shn
Use a better classifier / feature representation?
(Mar 29 '13 at 07:30)
Alexandre Passos ♦
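The online multi-label approach from the answer (one binary predictor per class, trained with SGD, with the positive examples of other classes as negatives) can be sketched like this. Class names and the data stream are illustrative:

```python
# One binary SGD predictor per class, updated online with partial_fit.
# An empty prediction set is the "probably a new class" signal.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(1)
classes = ["a", "b"]
predictors = {c: SGDClassifier(random_state=0) for c in classes}

def update(x, label):
    """One online step: x is positive for its class, negative for the rest."""
    x = np.asarray(x).reshape(1, -1)
    for c, clf in predictors.items():
        y = np.array([1 if c == label else 0])
        clf.partial_fit(x, y, classes=[0, 1])

def predict(x):
    """Labels of all predictors that fire; an empty set suggests a new class."""
    x = np.asarray(x).reshape(1, -1)
    return {c for c, clf in predictors.items() if clf.predict(x)[0] == 1}

# Stream some labelled points from the two known classes
for _ in range(200):
    update(rng.randn(2), "a")
    update(rng.randn(2) + 5.0, "b")

print(predict([0.0, 0.0]))  # probably {"a"}
print(predict([5.0, 5.0]))  # probably {"b"}
```

One caveat worth flagging: with linear predictors like these, a far-away point often still falls on the positive side of some hyperplane, so the empty-set signal is unreliable without calibrated scores — which matches the answer's warning that there are no easy guarantees here.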
This doesn't exactly answer your question but maybe it could point you in the right direction: have you looked into novelty detectors such as one-class SVMs? Rather than trying to do all of your work in the multi-class, supervised classifier, it might be worth exploring the use of an unsupervised novelty detector as a layer in your classification process.
@WillKurt (1) Can novelty detectors (one-class SVMs, etc.) be trained incrementally?
(2) Is a one-class SVM an unsupervised novelty detector?
(1) I haven't done any work with online SVMs, but I do believe there's research in that area.
(2) Yes, it is unsupervised; essentially it tries to put all of your data into a single class. Parameter tuning is really important, since it's easy to create a boundary around the existing data that is either too forgiving of novel data or too unforgiving of data that belongs.
To address the online issue, even if it's not a one-class SVM, I think the direction you want to explore is online, unsupervised outlier/novelty detection.
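The single-detector setup described in this answer (one unsupervised model over all known data, used as a novelty filter in front of the supervised classifier) can be sketched as follows. The `nu` parameter bounds the fraction of training points treated as outliers, so it is the main knob against the too-many-false-novelties problem raised in the comments; the data and parameter values here are illustrative:

```python
# One unsupervised novelty detector trained on all known data, labels ignored.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(2)
# Pool the training points of all known classes, without labels
X_known = np.vstack([rng.randn(200, 2), rng.randn(200, 2) + 5.0])

# Small nu = permissive boundary, fewer inliers flagged as novel
detector = OneClassSVM(kernel="rbf", gamma=0.2, nu=0.02).fit(X_known)

def is_novel(x):
    """True if x falls outside the estimated support of the known data."""
    # predict returns -1 for outliers, +1 for inliers
    return detector.predict(np.asarray(x).reshape(1, -1))[0] == -1

print(is_novel([0.0, 0.0]))     # probably False: inside known support
print(is_novel([30.0, -30.0]))  # True: far outside known support
```

Points the detector accepts would then be passed on to the ordinary multi-class classifier; points it rejects are the candidates for an unknown class.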