Given the screenshot, I don't understand how "the sums in the first two equations of (10) can be calculated as matrix multiplication of (Ŷ − Y) of a shape (1×m) and Xᵀ of a shape (m×2), resulting in the (1×2) array."

How is the sum in each of those equations equivalent to a matrix multiplication?

Thank you!

Think about what a matrix multiplication does: each element in the result is the dot product of one row of the first operand with one column of the second operand. A dot product is two steps: first you multiply the corresponding elements of each vector and then you add up the resulting products. That maps to what is happening in the mathematical sums in the formulas: each one is the sum of the products of corresponding elements.
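If it helps to see this numerically, here is a minimal NumPy sketch. It assumes X has shape (2, m), so that Xᵀ is (m, 2), and that Ŷ and Y both have shape (1, m). The variable names and the random data are made up for illustration, and any constant factor in front of the sums in (10), such as 1/m, is left out. It computes the two sums with explicit loops and then with a single matrix multiplication, and checks that the results agree.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5                               # number of examples (arbitrary choice)
X = rng.normal(size=(2, m))         # assumed layout: one column per example
Y = rng.normal(size=(1, m))         # made-up targets
Y_hat = rng.normal(size=(1, m))     # made-up predictions

diff = Y_hat - Y                    # shape (1, m)

# The sums written out: one sum of elementwise products per row of X.
sum_w1 = sum(diff[0, i] * X[0, i] for i in range(m))
sum_w2 = sum(diff[0, i] * X[1, i] for i in range(m))
by_sums = np.array([[sum_w1, sum_w2]])

# The same thing as one matrix multiplication: (1, m) @ (m, 2) -> (1, 2).
by_matmul = diff @ X.T

print(by_matmul.shape)                  # (1, 2)
print(np.allclose(by_sums, by_matmul))  # True
```

Each entry of `by_matmul` is the dot product of the single row of `diff` with one column of `X.T`, which is exactly one of the two sums.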


Thank you. So we're adding up the per-example terms to get the derivative of the loss function (dL/dW), and that sum translates to the matrix multiplication (Ŷ − Y) Xᵀ.