This has been asked before, but I still haven't grasped it completely. I know that generative models model the feature distribution, which means modelling P(x|y) and P(y); neither of these is required if all we want is to classify, i.e. to find P(y|x).

Question: Many textbooks say that it is easier to include features in discriminative models, but this is rarely explained. They also mention that discriminative models allow overlapping features (features that are interdependent). Could anybody explain what this means and why it is true, or point me to something to read? It seems to me that it is possible to include features in generative models as well, and I can't see why it should be easier or more efficient to do so in discriminative models.
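To make the question concrete, here is a small toy sketch of what I understand "overlapping features" to mean; the model choices and the scikit-learn usage are my own illustration, not anything taken from the books:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB

    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=500)
    x1 = y + rng.normal(size=500)            # informative feature
    x2 = rng.normal(size=500)                # uninformative feature
    X = np.column_stack([x1, x2, x1])        # x1 included twice: "overlapping"

    # Discriminative: only P(y|x) is modelled, so the duplicated column is
    # just another input whose weight gets shared with the original.
    lr = LogisticRegression().fit(X, y)

    # Generative (naive Bayes): P(x|y) is modelled with a conditional
    # independence assumption, so the copy of x1 is multiplied in as if it
    # were independent evidence.
    nb = GaussianNB().fit(X, y)

    print(lr.predict_proba(X[:1]), nb.predict_proba(X[:1]))

Nothing obviously breaks when I fit the generative model above, which is exactly why I don't see where the claimed difficulty comes in.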
OK, this is only a partial answer based on what I understand so far.
Happy to hear some thoughts on this.
Which textbooks are you talking about?
Machine Learning: A Probabilistic Perspective. For example, on page 268 it mentions that discriminative models, unlike generative models, handle feature preprocessing by replacing x with f(x) in the model, and I can't figure out why this is not possible in generative models. There are many other examples I could list. Jebara's PhD thesis (http://www.cs.columbia.edu/~jebara/papers/jebara4.pdf) is really good, but I could not answer this question after reading it. I have also taken a look at Bishop's PRML (http://research.microsoft.com/en-us/um/people/cmbishop/prml/), and it did not help me either.
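For what it's worth, this is how I currently read the "replace x with f(x)" remark; the pipeline below is just my own toy sketch with scikit-learn, not something from the book:

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] * X[:, 1] > 0).astype(int)   # needs the interaction term x1*x2

    # f(x) = degree-2 polynomial expansion; the expanded features are highly
    # correlated with each other, but the discriminative model only ever
    # conditions on f(x), it never has to write down P(f(x)|y).
    clf = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                        LogisticRegression())
    clf.fit(X, y)
    print(clf.score(X, y))

But I could also replace x with f(x) in, say, a Gaussian class-conditional model, so I still don't see why the books single out discriminative models here.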