“Note that for any variable, foo and dfoo always have the same dimensions. That’s why, w and dw always have the same dimension. Similarly, for b and db, and z and dz, and so on.”
In the assignment, you can confirm this statement by printing the shapes of W and dW, b and db, Z and dZ, A and dA (of the same layer). You will see that each pair has the same shape: one holds the values and the other holds the corresponding derivatives.
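For readers without the assignment at hand, here is a minimal sketch of that check. It is not the assignment's code: the layer sizes, the random data, the tanh hidden activation and the sigmoid output with cross-entropy loss are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_h, n_y, m = 3, 4, 1, 5  # assumed sizes: features, hidden units, outputs, examples

X = rng.standard_normal((n_x, m))
Y = rng.integers(0, 2, size=(n_y, m))

W1 = rng.standard_normal((n_h, n_x)) * 0.01
b1 = np.zeros((n_h, 1))
W2 = rng.standard_normal((n_y, n_h)) * 0.01
b2 = np.zeros((n_y, 1))

# Forward pass (tanh hidden layer and sigmoid output are assumptions)
Z1 = W1 @ X + b1
A1 = np.tanh(Z1)
Z2 = W2 @ A1 + b2
A2 = 1 / (1 + np.exp(-Z2))

# Backward pass for the cross-entropy loss
dZ2 = A2 - Y
dW2 = dZ2 @ A1.T / m
db2 = dZ2.sum(axis=1, keepdims=True) / m
dA1 = W2.T @ dZ2
dZ1 = dA1 * (1 - A1 ** 2)  # element-wise product with tanh'(Z1)
dW1 = dZ1 @ X.T / m
db1 = dZ1.sum(axis=1, keepdims=True) / m

pairs = {
    "W1": (W1, dW1), "b1": (b1, db1), "Z1": (Z1, dZ1), "A1": (A1, dA1),
    "W2": (W2, dW2), "b2": (b2, db2), "Z2": (Z2, dZ2),
}
for name, (value, grad) in pairs.items():
    print(f"{name}: {value.shape}   d{name}: {grad.shape}")
```

Every printed pair should show matching shapes, because each gradient stores one partial derivative of the scalar cost per entry of the variable it belongs to.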
There is no explanation of why dZ[1] has dimension (n[1],1), the same as Z[1], other than this statement:
“Note that for any variable, foo and dfoo always have the same dimensions. That’s why, w and dw always have the same dimension. Similarly, for b and db, and z and dz, and so on.”
Based on this ambiguous assumption, dZ[1] is defined to have dimension (n[1],1). So we have to use an element-wise multiplication between (W[2].T) (dZ[2]) and g[1]'(Z[1]) just to satisfy the requirement of having dimension (n[1],1).
(W[2].T) (dZ[2]) is computed using a dot product. In the same way, it seems that the element-wise multiplication between (W[2].T) (dZ[2]) and g[1]'(Z[1]) could be changed to a dot product as well.
Could anyone give me an idea of how to understand the above statement?
Yes, I noticed that this is true when I tried it in the assignment. But it seems like quite a coincidence, so I am looking for a way to understand the theory behind it.
The dimension of Z depends on the dimensions of W and X, right? And the dimension of X depends on the number of features and the number of examples. Moreover, the dimension of W depends on the number of neurons and the number of features. Prof. Andrew explains the dimensions in depth in one of his videos.
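As a small illustration of that shape rule, the sketch below builds W, X and b with assumed sizes (3 features, 5 examples, 4 neurons, all hypothetical) and checks the shape of Z = W X + b.

```python
import numpy as np

n_features, n_examples, n_neurons = 3, 5, 4  # assumed sizes for illustration

X = np.ones((n_features, n_examples))  # one column per example
W = np.ones((n_neurons, n_features))   # one row per neuron
b = np.zeros((n_neurons, 1))           # broadcast across the examples

Z = W @ X + b
print(Z.shape)  # (4, 5), i.e. (n_neurons, n_examples)
```

The inner dimension (the number of features) cancels in the matrix product, which is why Z ends up with one row per neuron and one column per example.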
Yes, every term is clear except this one: dZ[1] = (W[2].T) (dZ[2]) * g[1]'(Z[1]).
According to the lecture, there is an assumption that dZ[1] has the same dimension as Z[1]. So we have to use an element-wise multiplication (instead of a dot product) between (W[2].T) (dZ[2]) and g[1]'(Z[1]) just to satisfy the requirement of having dimension (n[1],1).
The dimension of dZ[1] depends on the dimensions of W[2].T, dZ[2], and g[1]'(Z[1]).
This is the derivative calculation. To understand this equation, you have to be familiar with Calculus. There are many posts related to this topic. You can read this and this and yet this one.
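One way to see where the element-wise product comes from: the activation g[1] is applied to each entry of Z[1] separately, so the Jacobian of A[1] with respect to Z[1] is a diagonal matrix, and multiplying by a diagonal matrix is the same as multiplying element-wise by its diagonal. The sketch below checks this numerically for a single example. The tanh activation, the layer sizes and the random values are assumptions, and `g_prime` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2 = 4, 1  # assumed layer sizes

Z1 = rng.standard_normal((n1, 1))
W2 = rng.standard_normal((n2, n1))
dZ2 = rng.standard_normal((n2, 1))

def g_prime(z):
    """Derivative of tanh, standing in for g[1]' (an assumption)."""
    return 1 - np.tanh(z) ** 2

dA1 = W2.T @ dZ2  # shape (n1, 1)

# The form used in the lecture: element-wise product
dZ1_elementwise = dA1 * g_prime(Z1)

# The same result via the chain rule with the diagonal Jacobian dA1/dZ1
jacobian = np.diag(g_prime(Z1).ravel())  # shape (n1, n1)
dZ1_jacobian = jacobian @ dA1

print(np.allclose(dZ1_elementwise, dZ1_jacobian))  # True

# A plain dot product of the two (n1, 1) vectors collapses to a single number
collapsed = dA1.T @ g_prime(Z1)
print(dZ1_elementwise.shape, collapsed.shape)  # (4, 1) (1, 1)
```

Under these assumptions, the element-wise product is the chain rule written compactly rather than a device for forcing the shape to (n[1],1). Replacing it with a dot product would sum over the hidden units and lose the per-unit derivatives.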