Always confusion with the transpose

Hello @Matthias_Kleine,

Let's check the DLS notation first. If you look at the Standard notations for Deep Learning.pdf downloadable in this post, you will find the definitions of X and W:

X \in \mathbb{R}^{n_x \times m} where n_x is the input size and m is the number of examples.
W^{[l]} \in \mathbb{R}^{n^{[l]} \times n^{[l-1]}} where n^{[l]} is the number of units in layer l.

So when we multiply them together, we only need W^{[1]}X without any transpose: the inner dimensions already match, since n^{[0]} = n_x.
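You can verify the shapes line up with a quick NumPy sketch (the sizes n_x = 3, m = 5, n^{[1]} = 4 here are just hypothetical values for illustration):

```python
import numpy as np

# Hypothetical sizes: 3 input features, 5 examples, 4 units in layer 1
n_x, m, n_1 = 3, 5, 4

X = np.random.randn(n_x, m)     # X has shape (n_x, m)
W1 = np.random.randn(n_1, n_x)  # W^{[1]} has shape (n^{[1]}, n^{[0]}), with n^{[0]} = n_x
b1 = np.random.randn(n_1, 1)    # bias, broadcast across the m example columns

# No transpose needed: (n_1, n_x) @ (n_x, m) -> (n_1, m)
Z1 = W1 @ X + b1
print(Z1.shape)  # (4, 5)
```

Each column of Z1 is the pre-activation for one example, which is exactly why the course stacks examples as columns of X.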

You can also see this in the video below.

Raymond