
Co-training with two classifiers trained on separate feature sets of the same data requires the feature sets to be conditionally independent given the labels. Why is this assumption necessary? What happens if it is violated? Can I still use co-training if the two feature sets are not conditionally independent?
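
For reference, the assumption can be written as follows. The symbols are notation introduced here for illustration: x^(1) and x^(2) are the two feature sets (views) of an example and y is its label.

```latex
% Conditional independence of the two views given the label:
% once y is known, one view carries no further information about the other.
P\bigl(x^{(1)}, x^{(2)} \mid y\bigr)
  = P\bigl(x^{(1)} \mid y\bigr)\, P\bigl(x^{(2)} \mid y\bigr)
```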

asked Jul 16 '10 at 21:35


spinxl39

edited Jul 16 '10 at 21:35


2 Answers:

This assumption is necessary for the analysis. Intuitively, if the feature sets are not independent, the decisions made by the two classifiers are not independent either, so you cannot treat them as independent evidence when deriving a confidence measure on the true labels of the points. Without that, there is no guarantee that the semi-supervised step helps rather than just reinforcing the classifiers' biases. You can still apply co-training without this independence, but it might not work as well. It then behaves more like a bootstrapping method, in the sense of the older semi-supervised technique common in NLP where you start with a baseline labeling and gradually expand it by training classifiers, not in the Efron sense. That is not as good as co-training, but it sometimes works.
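
To make the loop concrete, here is a minimal sketch of co-training on toy data, not a reference implementation. Everything in it is an assumption made for illustration: the helper names (`centroid_fit`, `centroid_predict`, `co_train`), the nearest-centroid classifier, the margin used as a confidence score, and the synthetic data, in which each view is drawn independently given the label.

```python
import random

def centroid_fit(xs, ys):
    """Return the per-class mean vector for one view (hypothetical helper)."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def centroid_predict(model, x):
    """Return (label, margin): nearest centroid and its lead over the runner-up."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)), y) for y, c in model.items()
    )
    label = dists[0][1]
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return label, margin

def co_train(view1, view2, seeds, rounds=10, per_round=2):
    """seeds maps example index -> label for the initially labeled points."""
    labeled = dict(seeds)
    for _ in range(rounds):
        unlabeled = [i for i in range(len(view1)) if i not in labeled]
        if not unlabeled:
            break
        new = {}
        # Each classifier sees only its own view, but both feed one shared pool.
        for view in (view1, view2):
            idx = list(labeled)
            model = centroid_fit([view[i] for i in idx], [labeled[i] for i in idx])
            scored = sorted(
                ((centroid_predict(model, view[i]), i) for i in unlabeled),
                key=lambda t: -t[0][1],  # most confident first
            )
            for (label, _), i in scored[:per_round]:
                new.setdefault(i, label)  # keep the first view's label on overlap
        labeled.update(new)
    return labeled

# Toy data: given the label, the noise in each view is drawn independently.
random.seed(0)
n = 40
truth = [i % 2 for i in range(n)]
view1 = [[(3.0 if y else -3.0) + random.uniform(-1, 1)] for y in truth]
view2 = [[(3.0 if y else -3.0) + random.uniform(-1, 1)] for y in truth]

seeds = {0: truth[0], 1: truth[1]}  # one labeled example per class
result = co_train(view1, view2, seeds, rounds=40, per_round=2)
accuracy = sum(result[i] == truth[i] for i in result) / len(result)
print(len(result), accuracy)
```

If the two views were copies of each other, both classifiers would pick and label the same points, and the loop would reduce to a single classifier feeding itself its own predictions.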

answered Jul 17 '10 at 02:16


Alexandre Passos ♦

The principle of co-training is to exploit the independence of a pair of classifiers, each trained on a feature set that is independent of the other given the response, so that they help each other obtain more labeled data. If the two feature sets are not independent, the approach behaves like self-training. The more independent the two feature sets are, the better the performance you can expect.

answered Jul 17 '10 at 04:06


charlie


Asked: Jul 16 '10 at 21:35

Seen: 737 times

Last updated: Jul 17 '10 at 04:06


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.