Matrix Calculus

1004271927 · May 24, 2022, 3:59pm

I have some questions about matrix calculus:
1. Is the Jacobian matrix defined only in this form, or can it also be expressed as its transpose?

Is it correct the way it is written here? (Please see this picture.)

2. If it can be expressed as a transpose, what would the answer be in a case like this one? Please see this picture.

Thanks very much

paulinpaloalto · May 24, 2022, 7:54pm

That is beyond the scope of this course. Here’s a thread which has some links that are relevant.

What you are probably asking about is the fact that because he doesn’t want to cover the derivation and all this matrix calculus material, Prof Ng takes a convenient shortcut that helps simplify translating Gradient Descent into code: he uses the convention that the gradient of a vector or matrix has the same shape and orientation as the base object. That makes writing the “update parameters” logic simple, but (as I think you are pointing out) that’s not really how things work if you really do the full mathematical version of all this. The “pure math” expression of all this is that the gradient of the object ends up being transposed from the shape of the base object.
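As a minimal sketch of the convention described above (this is not taken from the course notebooks; the names `W`, `dW`, `update_parameters`, and all the numbers are made up for illustration), keeping the gradient in the same shape as the parameter makes the update a plain elementwise step, while the transposed layout would have to be transposed back first:

```python
def shape(M):
    """Return (rows, cols) of a matrix stored as a list of lists."""
    return (len(M), len(M[0]))

def transpose(M):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*M)]

def update_parameters(W, dW, learning_rate):
    """Elementwise gradient descent step; assumes dW has the same shape as W."""
    assert shape(W) == shape(dW), "gradient must match the parameter's shape"
    return [[w - learning_rate * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(W, dW)]

# Hypothetical parameter matrix and gradient, both of shape (2, 3).
W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
dW = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

W_new = update_parameters(W, dW, learning_rate=0.5)

# The transposed layout holds the same numbers in a (3, 2) arrangement,
# so it cannot be subtracted from W without transposing it back.
dW_transposed_layout = transpose(dW)

print(shape(W_new))                 # same shape as W
print(shape(dW_transposed_layout))  # rows and columns swapped
```

Both layouts carry the same partial derivatives; the only difference is where each one sits in the matrix.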

anon57530071 · May 25, 2022, 5:53am

I suppose the confusion comes from how we define nabla (also called del, or the gradient), \nabla, which is sometimes represented as a row vector and sometimes as a column vector. Let’s set nabla aside at first and focus on the Jacobian. Consider a vector-valued function with m components, each a function of n variables:

\textbf{y} = \begin{bmatrix} f_1(\textbf{x})\\ f_2(\textbf{x})\\ f_3(\textbf{x})\\ \vdots \\ f_m(\textbf{x}) \end{bmatrix} = \begin{bmatrix} f_1(x_1, x_2, x_3, \dots, x_n) \\ f_2(x_1, x_2, x_3, \dots, x_n) \\ f_3(x_1, x_2, x_3, \dots, x_n) \\ \vdots \\ f_m(x_1, x_2, x_3, \dots, x_n) \end{bmatrix}

Then the Jacobian is, as you wrote,

\textbf{J} = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \frac{\partial f_1}{\partial x_3} & \dots & \frac{\partial f_1}{\partial x_n} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \frac{\partial f_2}{\partial x_3} & \dots & \frac{\partial f_2}{\partial x_n} \\ \frac{\partial f_3}{\partial x_1} & \frac{\partial f_3}{\partial x_2} & \frac{\partial f_3}{\partial x_3} & \dots & \frac{\partial f_3}{\partial x_n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \frac{\partial f_m}{\partial x_2} & \frac{\partial f_m}{\partial x_3} & \dots & \frac{\partial f_m}{\partial x_n} \end{bmatrix}

To understand what \nabla is, let’s simplify this by taking m=1. In this case, the Jacobian can be written as follows; it is an n-dimensional row vector.

f^{'}(\textbf{x}) = \begin{bmatrix} \frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2} & \frac{\partial f}{\partial x_3} & \dots & \frac{\partial f}{\partial x_n} \end{bmatrix}

Then the gradient, written \nabla f(\textbf{x}), is the transpose of this row vector:

\nabla f(\textbf{x}) = f^{'}(\textbf{x})^{T} = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \frac{\partial f}{\partial x_2} \\ \frac{\partial f}{\partial x_3} \\ \vdots \\ \frac{\partial f}{\partial x_n} \end{bmatrix}

Yes, this is a column vector, not a row vector. However, many people write it as a row vector, or say that the orientation does not matter. In most cases that may be fine, but I suspect this is where the confusion comes from.
From a mathematical point of view, if we want to express the Jacobian in terms of \nabla, I believe it would be:

\textbf{J} = \begin{bmatrix} \nabla f_1(\textbf{x})^{T} \\ \nabla f_2(\textbf{x})^{T} \\ \nabla f_3(\textbf{x})^{T} \\ \vdots \\ \nabla f_m(\textbf{x})^{T} \end{bmatrix} = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \frac{\partial f_1}{\partial x_3} & \dots & \frac{\partial f_1}{\partial x_n} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \frac{\partial f_2}{\partial x_3} & \dots & \frac{\partial f_2}{\partial x_n} \\ \frac{\partial f_3}{\partial x_1} & \frac{\partial f_3}{\partial x_2} & \frac{\partial f_3}{\partial x_3} & \dots & \frac{\partial f_3}{\partial x_n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \frac{\partial f_m}{\partial x_2} & \frac{\partial f_m}{\partial x_3} & \dots & \frac{\partial f_m}{\partial x_n} \end{bmatrix}
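A quick numerical check of this row-by-row relationship, as a sketch only: the example function `f` (with m=3, n=2), the helper names `jacobian` and `gradient`, and the central-difference step `eps` are all assumptions chosen for illustration, not something from this thread.

```python
def f(x):
    """Hypothetical example: maps (x1, x2) to three component functions."""
    x1, x2 = x
    return [x1 * x2, x1 + x2, x1 ** 2]

def jacobian(func, x, eps=1e-6):
    """Approximate the m x n Jacobian with central differences.

    Entry J[i][j] approximates d f_i / d x_j, so row i belongs to f_i.
    """
    m, n = len(func(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = func(x_plus), func(x_minus)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return J

def gradient(scalar_func, x, eps=1e-6):
    """Approximate the n components of the gradient of a scalar function."""
    grad = []
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        grad.append((scalar_func(x_plus) - scalar_func(x_minus)) / (2 * eps))
    return grad

x0 = [2.0, 3.0]
J = jacobian(f, x0)

# Gradient of each component f_i, taken one component at a time.
grads = [gradient(lambda x, i=i: f(x)[i], x0) for i in range(len(J))]

# Row i of J should hold the same numbers as the gradient of f_i.
rows_match = all(abs(J[i][j] - grads[i][j]) < 1e-6
                 for i in range(len(J)) for j in range(len(x0)))

print(J)           # approximately [[3, 2], [1, 1], [4, 0]]
print(rows_match)
```

A plain Python list has no row or column orientation, so the check only confirms that the entries agree; whether each gradient is drawn as a column and then transposed into a row of the Jacobian is purely a matter of notation.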

1004271927 · May 25, 2022, 11:46am

Yes, your opinion is great.
