I have a classification problem with two classes (positive and negative). Usually, in such classification problems, every sample is labelled either 'positive' or 'negative'. In my dataset, however, some of the samples possess a combination of both positive and negative characteristics. Formally, if the dataset is $x$, then,

$x = x_1 \cup x_2 \cup x_3$

where $x_1$ is the set of all positive samples, $x_2$ is the set of all negative samples, and $x_3$ is the set of samples that exhibit characteristics of both classes. As far as I can tell, this situation could be handled in two ways:

  1. Ignore $x_3$ (the samples that contain characteristics of both classes) and treat the problem as traditional binary classification.
  2. Label the samples in $x_3$ with both labels (positive and negative) and treat this as a multi-label classification problem (a short encoding sketch follows this list).
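
For concreteness, here is a minimal sketch of what the second option's labelling could look like as a binary indicator matrix, using scikit-learn's MultiLabelBinarizer; the label sets below are hypothetical placeholders, not my actual data:

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical label sets: x1 samples get {"positive"}, x2 samples get
# {"negative"}, and x3 samples carry both labels at once.
label_sets = [
    {"positive"},              # a sample from x1
    {"negative"},              # a sample from x2
    {"positive", "negative"},  # a sample from x3
]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(label_sets)

print(mlb.classes_)  # ['negative' 'positive']
print(Y)
# [[0 1]
#  [1 0]
#  [1 1]]
```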

I wish to follow the second option, as it is the more natural choice. The reason is that ignoring some samples from the dataset feels like manipulating the dataset artificially, which may hurt the classifier's performance in a real-world scenario. In this context, I have the following questions:

  1. Is it correct to treat this as a multi-label classification problem? If so, is the intuition explained in the previous paragraph correct?
  2. Is there any other learning paradigm that can handle this scenario? If so, please provide references to the relevant literature.

asked May 18 '14 at 12:37

Annamalai Narayanan

edited May 18 '14 at 12:40


One Answer:

As is usually the case in ML, there are multiple ways to handle this.

Your intuition that it is not good to simply omit the $x_3$ examples is correct.

Treating it as a three-class problem, with $x_3$ as its own class, is one way to approach it; a sketch of that framing follows.
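
A minimal sketch under stated assumptions: X1, X2, and X3 are hypothetical stand-ins for the real feature arrays, and LogisticRegression is just a placeholder classifier:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X1 = rng.normal(-1.0, 1.0, (50, 5))  # stand-in for x1 (positive-only)
X2 = rng.normal(1.0, 1.0, (50, 5))   # stand-in for x2 (negative-only)
X3 = rng.normal(0.0, 1.0, (20, 5))   # stand-in for x3 (both characteristics)

# One multiclass problem: 0 = positive, 1 = negative, 2 = both.
X = np.vstack([X1, X2, X3])
y = np.concatenate([np.zeros(len(X1)), np.ones(len(X2)), np.full(len(X3), 2)])

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:2]))
```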

My first inclination, however, would be to treat it as two two-class problems: ($x_1$ vs. not $x_1$) and ($x_2$ vs. not $x_2$). In this approach you would train the $x_1$ classifier using ($x_1 \cup x_3$) as positive examples and $x_2$ as negative examples. Then, you would train the $x_2$ classifier using ($x_2 \cup x_3$) as positive examples and $x_1$ as negative examples.
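
A minimal sketch of that two-classifier setup, with the same hypothetical stand-ins as before (X1, X2, X3 as feature arrays, LogisticRegression as a placeholder); the point is only how the two training sets are assembled:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X1 = rng.normal(-1.0, 1.0, (50, 5))  # stand-in for x1
X2 = rng.normal(1.0, 1.0, (50, 5))   # stand-in for x2
X3 = rng.normal(0.0, 1.0, (20, 5))   # stand-in for x3

# Classifier 1, "has positive characteristics": (x1 U x3) vs x2.
X_pos = np.vstack([X1, X3, X2])
y_pos = np.concatenate([np.ones(len(X1) + len(X3)), np.zeros(len(X2))])
clf_pos = LogisticRegression(max_iter=1000).fit(X_pos, y_pos)

# Classifier 2, "has negative characteristics": (x2 U x3) vs x1.
X_neg = np.vstack([X2, X3, X1])
y_neg = np.concatenate([np.ones(len(X2) + len(X3)), np.zeros(len(X1))])
clf_neg = LogisticRegression(max_iter=1000).fit(X_neg, y_neg)

# At prediction time the two classifiers fire independently, so a new
# sample can be flagged (1, 1), i.e. as showing both characteristics.
x_new = rng.normal(0.0, 1.0, (1, 5))
print(clf_pos.predict(x_new), clf_neg.predict(x_new))
```

A (1, 1) prediction recovers exactly the "both" outcome that the multi-label framing is after.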

What works best will depend on the structure of the class distributions and which classifier you're using.

answered May 19 '14 at 14:14

Byron Dom
