How can I find the gradient of X^TAY w.r.t. A?

asked Aug 31 '13 at 18:32


Behrang Mehrparvar


One Answer:

I think a good rule of thumb is to compute the partial derivative w.r.t. each entry A_ij.

First, if X and Y are matrices then this is a matrix function and each partial derivative is a matrix, and your gradient is a 4-tensor. It can be worked out coordinate-wise but I'll assume this is not what you want.

If X and Y are vectors, this is a scalar function and your gradient is a matrix. Write x_i and y_j for the entries of X and Y. Then X^T A is a row vector with entries (X^T A)_j = sum_i x_i A_ij, and hence X^T A Y = sum_j y_j sum_i x_i A_ij. Each A_ij therefore multiplies x_i y_j, so the partial derivative w.r.t. A_ij is x_i y_j and the gradient is X Y^T.
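A quick way to convince yourself of the vector case is to compare X Y^T against a finite-difference estimate. The sketch below uses only plain Python lists; the helper names (bilinear, outer, numerical_gradient) and the sample values are made up for illustration and are not part of the original answer.

```python
def bilinear(x, A, y):
    """Return x^T A y for lists x (length m), y (length n) and an m-by-n nested list A."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def outer(x, y):
    """Return the outer product x y^T as a nested list."""
    return [[xi * yj for yj in y] for xi in x]


def numerical_gradient(x, A, y, eps=1e-6):
    """Estimate d(x^T A y)/dA_ij entry by entry with central differences."""
    grad = [[0.0] * len(y) for _ in x]
    for i in range(len(x)):
        for j in range(len(y)):
            A_plus = [row[:] for row in A]
            A_minus = [row[:] for row in A]
            A_plus[i][j] += eps
            A_minus[i][j] -= eps
            grad[i][j] = (bilinear(x, A_plus, y) - bilinear(x, A_minus, y)) / (2 * eps)
    return grad


# Arbitrary sample values, chosen only for the check.
x = [1.0, 2.0]
y = [3.0, 4.0, 5.0]
A = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]

analytic = outer(x, y)                 # the claimed gradient X Y^T
numeric = numerical_gradient(x, A, y)  # independent estimate
max_error = max(abs(analytic[i][j] - numeric[i][j])
                for i in range(len(x)) for j in range(len(y)))
```

Because the function is linear in A, the two should agree up to floating-point rounding, so max_error should be tiny. Note that the gradient does not depend on A at all.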

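If what you want is the case where X and Y are matrices, note that entry (k, l) of X^T A Y is x_k^T A y_l, where x_k is column k of X and y_l is column l of Y. Repeating the argument above for each such pair gives x_k y_l^T as the gradient of that entry, and stacking these over all (k, l) gives the 4-tensor.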
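For the matrix case, here is a rough sketch of the 4-tensor, again in plain Python and again with made-up helper names and sample values. It assumes X is m-by-p, A is m-by-n and Y is n-by-q, and stores the derivative of entry (k, l) of X^T A Y w.r.t. A_ij at G[k][l][i][j], then checks it against central differences.

```python
def transpose(P):
    return [list(col) for col in zip(*P)]


def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def f(X, A, Y):
    """Return the p-by-q matrix X^T A Y."""
    return matmul(matmul(transpose(X), A), Y)


def gradient_tensor(X, Y):
    """G[k][l][i][j] = X[i][k] * Y[j][l], i.e. G[k][l] is (column k of X)(column l of Y)^T."""
    m, p = len(X), len(X[0])
    n, q = len(Y), len(Y[0])
    return [[[[X[i][k] * Y[j][l] for j in range(n)] for i in range(m)]
             for l in range(q)] for k in range(p)]


# Arbitrary sample values: X is 2x2, A is 2x3, Y is 3x2.
X = [[1.0, 2.0], [3.0, 4.0]]
A = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]
Y = [[1.0, 0.0], [2.0, -1.0], [0.5, 3.0]]

G = gradient_tensor(X, Y)

# Central-difference check: perturb one A_ij at a time and compare every output entry.
eps = 1e-6
max_error = 0.0
for i in range(len(A)):
    for j in range(len(A[0])):
        A_plus = [row[:] for row in A]
        A_minus = [row[:] for row in A]
        A_plus[i][j] += eps
        A_minus[i][j] -= eps
        F_plus, F_minus = f(X, A_plus, Y), f(X, A_minus, Y)
        for k in range(len(G)):
            for l in range(len(G[0])):
                estimate = (F_plus[k][l] - F_minus[k][l]) / (2 * eps)
                max_error = max(max_error, abs(estimate - G[k][l][i][j]))
```

Each slice G[k][l] is just the vector-case answer applied to one column of X and one column of Y, which is why the tensor never needs to be derived from scratch.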

answered Aug 31 '13 at 23:03


Alexandre Passos ♦


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.