In the introduction to deep learning course, in the vectorization section, isn't it supposed to be dZ.T * X and not X dZ.T?
The tutorial gives dw = X dZ.T,
which means dw = x^{(1)}dz^{(1)} + x^{(2)}dz^{(2)} + \dots + x^{(m)}dz^{(m)}.
Is that wrong, or am I wrong?
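For reference, here is the per-example sum I mean, written out as a small NumPy loop (the sizes and variable names are just illustrative, not taken from the course notebook):

```python
import numpy as np

# Illustrative sizes only: n features, m training examples
n, m = 3, 5
X = np.random.randn(n, m)    # column i is the example x^{(i)}
dZ = np.random.randn(1, m)   # dz^{(i)} for each example

# dw = (1/m) * sum_i x^{(i)} * dz^{(i)}, written as an explicit loop
dw_loop = np.zeros((n, 1))
for i in range(m):
    dw_loop += X[:, i:i+1] * dZ[0, i]
dw_loop /= m
print(dw_loop.shape)         # (n, 1)
```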
Welcome to the community!
Let's recap the dimensions of each variable (m: number of samples, n: number of features):
w: (n,1), X: (n,m), Z: (1,m), dZ: (1,m)
Ignoring the 1/m factor for the dimension calculation,
dw = X\cdot dZ^{T} = (n,m)\cdot (m,1) = (n,1)
which is exactly the same as the dimension of w.
So, I suppose
dw = \frac{1}{m}X\cdot dZ^{T} is correct. Note that dZ^{T}\cdot X would be (m,1)\cdot (n,m), whose inner dimensions do not match, so that ordering isn't even a valid matrix product.
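Here is a minimal sketch of that dimension argument (sizes and names are just for illustration):

```python
import numpy as np

n, m = 3, 5                      # n features, m examples
X = np.random.randn(n, m)        # (n, m)
dZ = np.random.randn(1, m)       # (1, m)

dw = (1 / m) * X @ dZ.T          # (n, m) @ (m, 1) -> (n, 1), same shape as w

# The explicit per-example sum from the question gives the same result
dw_loop = sum(X[:, i:i+1] * dZ[0, i] for i in range(m)) / m
print(dw.shape)                  # (n, 1)
print(np.allclose(dw, dw_loop))  # True

# dZ.T @ X would be (m, 1) @ (n, m): the inner dimensions don't match,
# so that ordering can't be the right one.
```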