|
My question is about parameter updating in log-linear models. I'm trying to understand how to calculate the gradient of the log-likelihood function in Zettlemoyer and Collins, 2005. The gradient is (according to Zettlemoyer and Collins, 2005): EDIT: I don't know how to upload pictures, the equation is here. In the paper, they say that "Expectation of this type can again be calculated using dynamic programming, using a variant of the inside-outside algorithm". I would like to see more details about this calculation. It would be best if you could provide some implementation code for it. Thanks! |
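
To make clear what kind of calculation I mean, here is a minimal sketch of how I understand inside-outside to produce expected feature values. It is not the paper's implementation. It assumes a toy weighted grammar in Chomsky normal form instead of the CCG used in the paper, and one feature per rule, so that the expected feature values are simply expected rule counts under p(T | S). The function name `expected_rule_counts` and the example grammar are made up for illustration.

```python
from collections import defaultdict


def expected_rule_counts(words, binary, lexical, root="S"):
    """Return (Z, counts) for a weighted CNF grammar (toy assumption).

    binary:  {(A, B, C): weight} for rules A -> B C
    lexical: {(A, word): weight} for rules A -> word
    Each weight plays the role of exp(theta_r), so the score of a tree
    is the product of the weights of the rules it uses.
    """
    n = len(words)
    inside = defaultdict(float)   # inside[i, j, A]: total weight of A over words[i:j]
    outside = defaultdict(float)  # outside[i, j, A]: total weight of everything around it

    # Inside pass: single words first, then wider spans bottom-up.
    for i, w in enumerate(words):
        for (A, word), wt in lexical.items():
            if word == w:
                inside[i, i + 1, A] += wt
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (A, B, C), wt in binary.items():
                    inside[i, j, A] += wt * inside[i, k, B] * inside[k, j, C]

    Z = inside[0, n, root]  # sum of the scores of all parses of the sentence
    if Z == 0.0:
        raise ValueError("sentence has no parse under this grammar")

    # Outside pass: widest spans first, so outside[i, j, A] is complete
    # before it is used. Expected counts are accumulated in the same loop.
    counts = defaultdict(float)
    outside[0, n, root] = 1.0
    for width in range(n, 1, -1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (A, B, C), wt in binary.items():
                    out = outside[i, j, A]
                    left, right = inside[i, k, B], inside[k, j, C]
                    outside[i, k, B] += out * wt * right
                    outside[k, j, C] += out * wt * left
                    counts[A, B, C] += out * wt * left * right / Z
    for i, w in enumerate(words):
        for (A, word), wt in lexical.items():
            if word == w:
                counts[A, word] += outside[i, i + 1, A] * wt / Z
    return Z, dict(counts)


# Toy ambiguous grammar: "x y z" has two parses, with scores 1 and 2.
binary = {("S", "X", "T"): 1.0, ("S", "U", "Z"): 2.0,
          ("T", "Y", "Z"): 1.0, ("U", "X", "Y"): 1.0}
lexical = {("X", "x"): 1.0, ("Y", "y"): 1.0, ("Z", "z"): 1.0}
Z, counts = expected_rule_counts(["x", "y", "z"], binary, lexical)
```

In this toy case `Z` is 3.0 and the expected count of `S -> U Z` is 2/3, which matches enumerating the two parses by hand. If I understand correctly, the gradient would then be the difference between two such expectations, one restricted to parses that yield the correct logical form and one over all parses, but I don't see how to carry this over to the CCG setting in the paper.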