Matrix Calculus

paulinpaloalto · May 24, 2022, 7:54pm

That is beyond the scope of this course. Here’s a thread which has some links that are relevant.

What you are probably asking about is a shortcut that Prof Ng takes. Because he doesn’t want to cover the derivations and the matrix calculus behind them, he adopts a convention that simplifies translating Gradient Descent into code: the gradient of a vector or matrix has the same shape and orientation as the base object. That makes the “update parameters” logic simple to write, but (as I think you are pointing out) it is not how things work in the full mathematical treatment. In the “pure math” version, under the usual Jacobian (numerator layout) convention, the gradient ends up with the transposed shape of the base object.
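To make the shape convention concrete, here is a minimal NumPy sketch. It is not taken from the course materials: the toy cost J(W) = 0.5 * ||W x - y||^2, the helper names (`cost`, `grad_same_shape`, `numerical_grad`), and the learning rate are all assumptions chosen for illustration. It shows that the gradient stored in the shape of `W` plugs directly into the update rule, while the Jacobian-style layout is simply its transpose.

```python
import numpy as np

def cost(W, x, y):
    # Toy scalar cost (an assumption for this sketch): J(W) = 0.5 * ||W x - y||^2
    r = W @ x - y
    return 0.5 * float(np.sum(r ** 2))

def grad_same_shape(W, x, y):
    # Gradient arranged in the same shape as W, so entry (i, j) is dJ/dW[i, j]
    return (W @ x - y) @ x.T

def numerical_grad(f, W, eps=1e-6):
    # Central-difference check, one entry of W at a time
    g = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        Wp, Wm = W.copy(), W.copy()
        Wp[idx] += eps
        Wm[idx] -= eps
        g[idx] = (f(Wp) - f(Wm)) / (2 * eps)
    return g

W = np.array([[0.0, 0.1, 0.2],
              [0.3, 0.4, 0.5]])        # shape (2, 3)
x = np.array([[1.0], [2.0], [3.0]])    # shape (3, 1)
y = np.array([[1.0], [-1.0]])          # shape (2, 1)

dW = grad_same_shape(W, x, y)          # shape (2, 3), same as W
jacobian_layout = dW.T                 # shape (3, 2), the transposed "pure math" layout

learning_rate = 0.05                   # arbitrary value for the sketch
W_new = W - learning_rate * dW         # shapes match, so no transpose is needed
```

With the same-shape convention the update is a plain elementwise subtraction. With the transposed layout you would have to write `W - learning_rate * jacobian_layout.T` to get the same result.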
