Dear Group,

I have been reading Naive Bayes classifier code and came across two implementations: one in NLTK (the standard one, at http://code.google.com/p/nltk/source/browse/trunk/nltk/nltk/classify/naivebayes.py ) and another at http://ebiquity.umbc.edu/blogger/2010/12/07/naive-bayes-classifier-in-50-lines/

Going through the NLTK code, I felt I should classify my features first and then train. The latter implementation, however, seems to train first and then classify.

I find it logical to classify first and then train, but could you advise? Am I reading the two implementations correctly?

I would be grateful if any learned member of the group could kindly help me out.

Best Regards, Subhabrata Banerjee.

asked Apr 02 '11 at 08:41

Subhabrata Banerjee

What do you mean by classifying the features?

(Apr 02 '11 at 10:39) Alexandre Passos ♦

Sir, I feel it means classifying features, i.e., deciding which label an input belongs to. Naive Bayes, I feel, calculates the posterior probability. Am I right?

Best Regards, Subhabrata.

(Apr 02 '11 at 14:11) Subhabrata Banerjee

Naive Bayes classifies documents, not features. Each class has a probability distribution over features that it uses to classify documents (the document picks the class that assigns it the highest probability).

(Apr 02 '11 at 14:13) Alexandre Passos ♦

6 Answers:

OK. Thanks for this update. Now my next question: if I start writing code for it, should I calculate the posterior first or train first? I think we would calculate the posterior first, then train, and then test?

Best Regards, Subhabrata.

answered Apr 02 '11 at 14:22

Subhabrata Banerjee

What do you mean by calculating the posterior? You should first train (and training means estimating the probability of a given word appearing in a given class) and then test (which means computing the posterior distribution over classes for each document).

(Apr 02 '11 at 14:23) Alexandre Passos ♦
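
The train-then-test order described in the comment above can be sketched roughly as follows. This is only a minimal illustration, not the NLTK code or the 50-line implementation from the question: the function names `train` and `classify`, the toy spam/ham data, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Training: count how often each word appears in each class."""
    class_counts = Counter()            # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()
    for words, label in labeled_docs:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def classify(words, model):
    """Testing: pick the class with the highest log(prior * likelihood)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior P(C)
        total_words = sum(word_counts[label].values())
        for w in words:
            if w not in vocab:
                continue  # ignore words never seen in training
            # add-one smoothed estimate of P(w | C) (an assumption of this sketch)
            p = (word_counts[label][w] + 1) / (total_words + len(vocab))
            score += math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, for illustration only.
training_data = [
    (["cheap", "pills", "buy"], "spam"),
    (["buy", "cheap", "now"], "spam"),
    (["meeting", "agenda", "today"], "ham"),
    (["lunch", "meeting", "now"], "ham"),
]

model = train(training_data)                 # step 1: train
print(classify(["cheap", "pills"], model))   # step 2: test
```

Note that `classify` cannot run until `train` has produced the counts, which is why training has to come first.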

Good quizzing. The posterior, I feel, is P(C|F), where C is the class and F is the feature. Best Regards, Subhabrata.

answered Apr 02 '11 at 14:27

Subhabrata Banerjee

In Naive Bayes you don't estimate P(C|F) directly; you estimate P(F|C) and use Bayes' theorem plus independence assumptions to derive P(C|F).

(Apr 02 '11 at 14:28) Alexandre Passos ♦
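
Written out, the derivation mentioned in the comment above looks roughly like this. The notation F_1, ..., F_n for the individual features of one document is an assumption added here; the thread itself only writes F.

```latex
% Bayes' theorem applied to a document with features F_1, ..., F_n
P(C \mid F_1, \dots, F_n)
  = \frac{P(C)\, P(F_1, \dots, F_n \mid C)}{P(F_1, \dots, F_n)}

% Naive independence assumption: features are independent given the class
P(F_1, \dots, F_n \mid C) = \prod_{i=1}^{n} P(F_i \mid C)

% The denominator does not depend on C, so for classification
P(C \mid F_1, \dots, F_n) \propto P(C) \prod_{i=1}^{n} P(F_i \mid C)
```

The prior P(C) and each P(F_i | C) on the right-hand side are the quantities estimated from the training data.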

Thanks for taking so much time to almost tutor me. The posterior is, as you suggested, calculated as prior * likelihood. I will handle the questions of why it is called Naive and so on myself; asking so many questions may be taking up your time. My final question is: (i) I calculate the posterior first, (ii) then train, (iii) then test. Is that fine? Best Regards, Subhabrata

answered Apr 02 '11 at 14:47

Subhabrata Banerjee

I don't understand how you can compute the posterior without training.

(Apr 02 '11 at 14:50) Alexandre Passos ♦

Thanks for answering again. Suppose I calculate prior * likelihood; does that not give the posterior? I thought that was the posterior. For the values that go into prior * likelihood, are you suggesting that I have to take the MLE etc. first, and only then comes the posterior? If you can kindly let me know. Best Regards, Subhabrata.

answered Apr 02 '11 at 14:56

Subhabrata Banerjee

Precisely, but the likelihood involves the probabilities learned during training.

(Apr 02 '11 at 14:57) Alexandre Passos ♦
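
To make the point above concrete, here is a small worked example of prior * likelihood. The numbers in `prior` and `likelihood` are made up; in a real classifier they would be the estimates produced by training, which is why the posterior cannot be computed before that step.

```python
# Hypothetical values standing in for what training would have estimated.
prior = {"spam": 0.5, "ham": 0.5}          # P(C)
likelihood = {                             # P(word | C)
    "spam": {"cheap": 0.3, "meeting": 0.1},
    "ham": {"cheap": 0.1, "meeting": 0.3},
}


def posterior(words):
    """Return P(C | words) for every class C via prior * likelihood."""
    unnormalized = {}
    for c in prior:
        p = prior[c]
        for w in words:
            p *= likelihood[c][w]  # naive assumption: multiply per-word terms
        unnormalized[c] = p
    evidence = sum(unnormalized.values())  # normalizing constant P(words)
    return {c: p / evidence for c, p in unnormalized.items()}


result = posterior(["cheap"])
# spam comes out at about 0.75 and ham at about 0.25
print({c: round(p, 2) for c, p in result.items()})
```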

Surely, sir. Thank you for your kind time. Bayes is solved; may I now ask you one short question on HMMs? I have found a person with such nice knowledge, and so prompt. May I ask? Best Regards, Subhabrata.

answered Apr 02 '11 at 15:02

Subhabrata Banerjee

Of course you may. Just open a new question, and I or someone else will reply eventually. Also, I'd be happy if you took some time to summarise this discussion so as to help people looking for answers in the future.

(Apr 02 '11 at 15:07) Alexandre Passos ♦

OK. Sure, I am posting it. Do you want me to summarize this discussion and post it here for future reference by other users? Best Regards, Subhabrata.

answered Apr 02 '11 at 15:10

Subhabrata Banerjee


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.