Hi everyone, I am having some difficulty understanding one step of the EM algorithm. Actually it is just one precise part that I can't seem to figure out. I will try to explain my problem clearly: I am OK with the E-step, which maximizes the lower bound $\mathcal{L}(q,\theta)$ in the decomposition $\ln p(X\mid\theta) = \mathcal{L}(q,\theta) + \mathrm{KL}(q\,\|\,p)$ and leads to $q(Z) = p(Z\mid X,\theta^{old})$.
However, in the M-step, the expression to maximize in order to find $\theta$ is $Q(\theta,\theta^{old}) = \sum_Z p(Z\mid X,\theta^{old})\,\ln p(X,Z\mid\theta)$. If the distribution factorizes over the examples, this can be rewritten as $\sum_n \sum_{z_n} p(z_n\mid x_n,\theta^{old})\,\ln p(x_n, z_n\mid\theta)$.
But in the case of the mixture of Gaussians, for example, the term to be maximized looks more like $\sum_{n=1}^{N}\sum_{k=1}^{K} \gamma(z_{nk})\{\ln \pi_k + \ln \mathcal{N}(x_n\mid\mu_k,\Sigma_k)\}$. It seems to me that these expressions of $Q(\theta,\theta^{old})$ are ... different. My question is thus: how do I find the expression for the mixture of Gaussians, derived from the general form of $Q(\theta,\theta^{old})$? For information, I use Bishop's book and my problem is to find the missing link between eq. 9.74 and eq. 9.40. Thank you for your help!
I post here a fairly detailed solution appearing in http://nic.schraudolph.org/teach/ml03/gentleEM.pdf
It might be of interest to other people.
Here is the part of interest (the notation is different from that used in my first post), extracted from the PDF:
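In short (my own summary, in Bishop's notation rather than the PDF's): the complete-data log-likelihood is linear in the binary indicators $z_{nk}$, so taking the expectation under $p(Z\mid X,\theta^{old})$ simply replaces each $z_{nk}$ by its posterior mean $\gamma(z_{nk})$:

```latex
\ln p(X, Z \mid \theta)
  = \ln \prod_{n=1}^{N} \prod_{k=1}^{K}
      \bigl( \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \bigr)^{z_{nk}}
  = \sum_{n=1}^{N} \sum_{k=1}^{K} z_{nk}
      \bigl\{ \ln \pi_k + \ln \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \bigr\}

Q(\theta, \theta^{old})
  = \sum_{Z} p(Z \mid X, \theta^{old}) \, \ln p(X, Z \mid \theta)
  = \sum_{n=1}^{N} \sum_{k=1}^{K} \mathbb{E}[z_{nk}]
      \bigl\{ \ln \pi_k + \ln \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \bigr\}
  = \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma(z_{nk})
      \bigl\{ \ln \pi_k + \ln \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \bigr\}
```

The sum over all assignments $Z$ collapses because expectation is linear and each $z_{nk}$ is a 0/1 indicator, so $\mathbb{E}[z_{nk}] = p(z_{nk}=1\mid x_n,\theta^{old}) = \gamma(z_{nk})$. That is the missing link between the general form and the GMM form.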
I've personally never cared much for Bishop's explanation of the EM algorithm; in my experience, Andrew Ng explains it better.
Check Wikipedia for an EM example on a GMM: http://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm#Example:_Gaussian_mixture
Thanks rm9, I read the Wikipedia article again, but it seems to me that they assume the Gaussian form of Q(theta, theta_old) from the start and do not derive it from the general version. My problem is that the two forms have different shapes (summation vs. product), and I find it non-trivial to go from one to the other. PS: I edited the Gaussian form in my post above, which contained an error I think.
Seems to me like you are just missing a log (i.e., using the likelihood where you should use the log-likelihood), which converts the product into a sum.
Hum, I don't think so, the log is already in both expressions. Did you try to transform the general expression into the one for the Gaussian case? I think there is a little more work than just turning the log of a product into a sum of logs. However, if you did it, I would very much like to see your method; maybe there is just something I didn't see.
just search "em for gmm" in google, there are tons of examples of the formulation.
Well, yes, that is an answer: I can just figure it out myself. I actually followed your advice and tried to find the answer myself in the literature. Surprisingly few documents do the derivation; most of them, like Wikipedia, assume the form, and I find it actually a little bit tricky to derive properly. http://nic.schraudolph.org/teach/ml03/gentleEM.pdf gives the derivation; slides from Bengio also do it, though differently.
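For future readers, here is a minimal numeric sketch (my own toy code, not from any of the references above, and all names are made up) of EM for a 1-D two-component GMM. The `q_value` function evaluates exactly the double-sum form of Q(theta, theta_old) discussed in this thread:

```python
import math

def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian N(x | mu, var)."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def e_step(data, pis, mus, vars_):
    """Responsibilities gamma[n][k] = p(z_nk = 1 | x_n, theta_old)."""
    gammas = []
    for x in data:
        w = [pis[k] * normal_pdf(x, mus[k], vars_[k]) for k in range(len(pis))]
        s = sum(w)
        gammas.append([wk / s for wk in w])
    return gammas

def m_step(data, gammas, K):
    """Closed-form maximizers of Q(theta, theta_old) for a 1-D GMM."""
    N = len(data)
    Nk = [sum(g[k] for g in gammas) for k in range(K)]
    mus = [sum(g[k] * x for g, x in zip(gammas, data)) / Nk[k] for k in range(K)]
    vars_ = [sum(g[k] * (x - mus[k]) ** 2 for g, x in zip(gammas, data)) / Nk[k]
             for k in range(K)]
    pis = [Nk[k] / N for k in range(K)]
    return pis, mus, vars_

def q_value(data, gammas, pis, mus, vars_):
    """Q(theta, theta_old) = sum_n sum_k gamma_nk {ln pi_k + ln N(x_n | mu_k, var_k)}."""
    return sum(g[k] * (math.log(pis[k]) + math.log(normal_pdf(x, mus[k], vars_[k])))
               for g, x in zip(gammas, data) for k in range(len(pis)))

# Toy data: two clusters around 0 and 5.
data = [-0.5, 0.1, 0.3, 4.6, 5.0, 5.4]
pis, mus, vars_ = [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]
for _ in range(20):
    gammas = e_step(data, pis, mus, vars_)
    pis, mus, vars_ = m_step(data, gammas, 2)

print(mus)  # the means should move toward roughly 0 and 5
print(q_value(data, gammas, pis, mus, vars_))
```

Each iteration fixes the responsibilities with the old parameters (E-step), then maximizes the resulting sum form of Q in closed form (M-step), which is why the product over examples never has to be manipulated directly.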