Hi, I have a question on the lecture “Matrix Multiplication Rules”.

Viewing the topic with respect to neural networks, the matrix “A” is transposed. Why is this transpose required?

We are viewing matrix **A** as the activation values from the previous layer, and **A** has shape 2 by 3. That means the previous layer (maybe Layer 1), from which A is derived, has 3 units (neurons), and there are 2 data points.

Meanwhile, the weight matrix **W** has shape 2 by 4. In this case, shouldn’t **W** be of size 3 by 4?

Wouldn’t the 3 rows signify the 3 weights applied to each activation vector from the previous layer (Layer 1)?

Here 4 refers to the 4 units in the current layer (maybe Layer 2).

So the Layer 2 weighted sum would be:

`Z = np.dot(A, W) + B`

Here matrix B would be of shape 1 by 4 (one bias per unit, broadcast across the 2 data points).

The resulting matrix Z would be of size 2 by 4.
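
To make the layout I have in mind concrete, here is a minimal NumPy sketch. The values are placeholders (only the shapes matter), and the bias shape of 1 by 4, broadcast over the rows, is my own assumption rather than something taken from the lecture.

```python
import numpy as np

# Placeholder values; only the shapes matter here.
A = np.ones((2, 3))   # 2 data points, 3 activations each from the previous layer
W = np.ones((3, 4))   # 3 incoming weights for each of the 4 units in this layer
B = np.zeros((1, 4))  # assumed: one bias per unit, broadcast over the 2 data points

Z = np.dot(A, W) + B  # (2, 3) times (3, 4), plus a broadcast (1, 4) bias
print(Z.shape)        # (2, 4)
```

With this layout each row of Z corresponds to one data point and each column to one unit of the current layer, and no transpose of A is needed.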

The matrix multiplication itself is fine, but when it is mapped to activation values and weights, I do not understand why A needs to be transposed and why W is not of size 3 by 4.
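
For comparison, this is a small sketch of the shapes as I understood them from the lecture (A is 2 by 3, W is 2 by 4). Again the values are placeholders, and the variable names `plain_product_ok` and `Z_lecture` are just mine for illustration.

```python
import numpy as np

A = np.ones((2, 3))  # shape of A as given in the lecture, placeholder values
W = np.ones((2, 4))  # shape of W as given in the lecture, placeholder values

# Without the transpose, the inner dimensions (3 and 2) do not match,
# so NumPy refuses to compute the product.
try:
    np.dot(A, W)
    plain_product_ok = True
except ValueError:
    plain_product_ok = False

# With the transpose: (3, 2) times (2, 4) gives a 3 by 4 result.
Z_lecture = np.dot(A.T, W)

print(plain_product_ok)  # False
print(Z_lecture.shape)   # (3, 4)
```

So with the lecture's shapes the transpose is what makes the product defined at all, but the result is 3 by 4 rather than the 2 by 4 I would expect for 2 data points and 4 units, which is the source of my confusion.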

Please kindly clarify.