Hi, I have a question about the lecture “Matrix Multiplication Rules”.
Viewing the topic with respect to neural networks, the matrix “A” is transposed. Why is this transpose required?
We are viewing matrix A as the activation values from the previous layer. A has shape 2 by 3, which means the previous layer (say, Layer 1), from which A is derived, has 3 units (neurons), and there are of course 2 data points.
Meanwhile, the weight matrix W has shape 2 by 4. Shouldn’t this W matrix instead be of size 3 by 4?
Wouldn’t the 3 rows signify the 3 weights, one per activation value from the previous layer (Layer 1), while the 4 columns refer to the 4 units in the current layer (say, Layer 2)?
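To make my confusion concrete, here is the lecture’s version as I understand it, in NumPy (made-up numbers; only the shapes matter):

```python
import numpy as np

# The lecture's version as I understand it (made-up values; only the shapes matter)
A = np.array([[1.0, -1.0, 0.1],
              [2.0, -2.0, 0.2]])      # 2 by 3
W = np.array([[3.0, 5.0, 7.0, 9.0],
              [4.0, 6.0, 8.0, 0.0]])  # 2 by 4

Z = np.matmul(A.T, W)  # A.T is 3 by 2, so (3, 2) @ (2, 4) -> (3, 4)
print(Z.shape)         # prints (3, 4)
```

As pure matrix math this runs fine and gives Z of shape 3 by 4.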
So, under my reading, the Layer 2 weighted sum would be

Z = np.dot(A, W) + B

Here the bias matrix B would be of shape 1 by 4 (one bias per Layer 2 unit, broadcast across the 2 data points), and the resulting matrix Z would be of size 2 by 4.
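In NumPy, what I have in mind would be something like this (again made-up values; the 1 by 4 bias is my assumption, one bias per Layer 2 unit):

```python
import numpy as np

# My proposed version: no transpose, W of shape 3 by 4
A = np.array([[1.0, -1.0, 0.1],
              [2.0, -2.0, 0.2]])  # 2 by 3: 2 data points, 3 Layer 1 units
W = np.ones((3, 4))               # 3 by 4: one row of weights per Layer 1 unit
B = np.zeros((1, 4))              # 1 by 4: one bias per Layer 2 unit (my assumption)

Z = np.dot(A, W) + B              # (2, 3) @ (3, 4) -> (2, 4); B broadcasts over the rows
print(Z.shape)                    # prints (2, 4)
```

Here each row of Z would be one data point and each column one Layer 2 unit, which is the mapping that makes sense to me.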
The matrix multiplication itself is fine, but when it is mapped to activation values and weights, I don’t understand why A needs to be transposed and why W is not of size 3 by 4.
Could you please clarify?