Hi everyone,

I am having some difficulty understanding one step of the EM algorithm; it is just one precise part that I can't seem to figure out. I will try to state my problem clearly. I am fine with the E-step, which maximizes the second line of

$$\ln p(X \mid \theta) = \mathcal{L}(q, \theta) + \mathrm{KL}(q \,\|\, p)$$

$$\mathcal{L}(q, \theta) = \sum_Z q(Z) \ln \left\{ \frac{p(X, Z \mid \theta)}{q(Z)} \right\}$$

and leads to $q(Z) = p(Z \mid X, \theta^{\text{old}})$.

However, in the M-step, the expression to maximize in order to find $\theta$ is

$$Q(\theta, \theta^{\text{old}}) = \sum_Z p(Z \mid X, \theta^{\text{old}}) \ln p(X, Z \mid \theta).$$

If the distribution factorizes over the examples, this can be rewritten as

$$Q(\theta, \theta^{\text{old}}) = \sum_Z \left[ \prod_{m=1}^N p(z_m \mid x_m, \theta^{\text{old}}) \right] \sum_{n=1}^N \ln p(x_n, z_n \mid \theta).$$

But in the case of the mixture of Gaussians, for example, the term to be maximized looks more like

$$\mathbb{E}_Z[\ln p(X, Z \mid \mu, \Sigma, \pi)] = \sum_{n=1}^N \sum_{k=1}^K \gamma(z_{nk}) \left\{ \ln \pi_k + \ln \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right\}.$$

It seems to me that these two expressions of $Q(\theta, \theta^{\text{old}})$ have rather different shapes. My question is thus: how do I derive the expression for the mixture of Gaussians from the general form of $Q(\theta, \theta^{\text{old}})$?

For reference, I am using Bishop's book, and my problem is to find the missing link between eqs. 9.74 and 9.40. Thank you for your help!

asked May 16 '12 at 06:10

Arnaud

edited May 16 '12 at 22:04

Check Wikipedia for the EM example on a GMM: http://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm#Example:_Gaussian_mixture

(May 16 '12 at 13:50) rm9

Thanks rm9, I read the Wikipedia article again, but it seems to me that they assume the Gaussian form of Q(theta, theta_old) from the start and do not derive it from the general version. My problem is that the two have different shapes (summation and product), and I find it not trivial to go from one to the other. PS: I edited the Gaussian form in my post above, which I think contained an error.

(May 16 '12 at 22:10) Arnaud

It seems to me that you are just missing a log function (using the log-likelihood instead of the likelihood, as you did), which converts the multiplication into a sum.

(May 17 '12 at 02:33) rm9

Hmm, I don't think so; the log is already in both expressions. Did you try to transform the general expression into the one for the Gaussian case? I think there is a little more work than just transforming the log of a product into a sum of logs. However, if you did it, I would very much like to see your method; maybe there is just something I didn't see.

(May 17 '12 at 04:12) Arnaud

Just search "EM for GMM" on Google; there are tons of examples of the formulation.

(May 17 '12 at 04:53) rm9

Well, yes, that is an answer: I can just figure it out myself. I actually followed your advice and tried to find the answer in the literature. Surprisingly, few documents do the derivation; most of them, like Wikipedia, assume the form, and I actually find it a little bit tricky to derive properly. http://nic.schraudolph.org/teach/ml03/gentleEM.pdf gives the derivation; slides from Bengio also do it, though differently.

(May 17 '12 at 06:18) Arnaud

2 Answers:

I post here the fairly detailed solution appearing in http://nic.schraudolph.org/teach/ml03/gentleEM.pdf since it might be of interest to other people. Here is the relevant part (the notation differs from that of my first post), extracted from the PDF: [derivation excerpt from the PDF]
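In outline (using the notation of my original post rather than the PDF's), the missing link is marginalizing the posterior. The complete-data likelihood factorizes, so $\ln p(X, Z \mid \theta) = \sum_n \ln p(x_n, z_n \mid \theta)$, and the posterior factorizes as $p(Z \mid X, \theta^{\text{old}}) = \prod_m p(z_m \mid x_m, \theta^{\text{old}})$. Each term of the inner sum depends only on $z_n$, so all the other $z_m$ sum out to one:

$$Q(\theta, \theta^{\text{old}}) = \sum_Z \left[ \prod_{m=1}^N p(z_m \mid x_m, \theta^{\text{old}}) \right] \sum_{n=1}^N \ln p(x_n, z_n \mid \theta) = \sum_{n=1}^N \sum_{z_n} p(z_n \mid x_n, \theta^{\text{old}}) \ln p(x_n, z_n \mid \theta).$$

For the Gaussian mixture, $z_n \in \{1, \dots, K\}$, $p(z_n = k \mid x_n, \theta^{\text{old}}) = \gamma(z_{nk})$, and $p(x_n, z_n = k \mid \theta) = \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)$, which gives

$$Q(\theta, \theta^{\text{old}}) = \sum_{n=1}^N \sum_{k=1}^K \gamma(z_{nk}) \left\{ \ln \pi_k + \ln \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right\},$$

i.e. exactly Bishop's eq. 9.40.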

answered May 17 '12 at 06:21

Arnaud

edited May 17 '12 at 09:01

I've personally never cared much for Bishop's explanation of the EM algorithm; in my own experience, Andrew Ng explains it better.

Here and here you have a pretty good derivation.
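For completeness, here is a minimal numpy sketch of what the derivation buys you in practice (this is my own illustration, not code from Ng's notes; the function name em_gmm, the initialization scheme, and the 1e-6 covariance ridge are arbitrary choices). The E-step computes the responsibilities gamma(z_nk), and the M-step applies the closed-form maximizers of eq. 9.40:

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=100, seed=0):
    """EM for a K-component Gaussian mixture; X has shape (N, D)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pi = np.full(K, 1.0 / K)                       # mixing coefficients pi_k
    mu = X[rng.choice(N, size=K, replace=False)]   # means mu_k, init at random data points
    Sigma = np.array([np.cov(X, rowvar=False) + 1e-6 * np.eye(D)
                      for _ in range(K)])          # covariances Sigma_k
    for _ in range(n_iter):
        # E-step: gamma[n, k] = p(z_n = k | x_n, theta_old)
        gamma = np.column_stack([pi[k] * multivariate_normal.pdf(X, mu[k], Sigma[k])
                                 for k in range(K)])
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: closed-form maximizers of Q(theta, theta_old), i.e. of eq. 9.40
        Nk = gamma.sum(axis=0)                     # effective number of points per component
        pi = Nk / N
        mu = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            d = X - mu[k]
            Sigma[k] = (gamma[:, k, None] * d).T @ d / Nk[k] + 1e-6 * np.eye(D)
    return pi, mu, Sigma
```

Usage would be, e.g., pi, mu, Sigma = em_gmm(X, K=3) on an (N, D) data array.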

answered May 19 '12 at 21:30

Leon Palafox ♦
