What is the reason for the W matrix being written in an "inverted" way?

In the “General implementation of forward propagation” lesson we are shown the W matrix defined in what seems to me an “inverted” way. What I mean by this:

For the first layer the w params are [1, 2], for the second - [-3, 4], and for the third - [5, -6]. I was very surprised to see the W matrix be defined as:

[[1, -3, 5],
[2, 4, -6]]

Isn’t it much more intuitive, both logically and code-wise, to be defined as:

[[1, 2],
[-3, 4],
[5, -6]]

Then you have the first row for the first layer, the second row for the second layer, and so on.

Is there a significant reason the data is represented the other way, with the first column for the first layer and so on, or is it just personal preference?

There is no universal standard for the orientation of the W matrix. It depends on the preference of whoever created the model.
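To make the point concrete, here is a small sketch in plain Python showing that the two layouts hold the same numbers and give the same result, as long as the multiplication order matches the layout. The three weight vectors are the ones from the question; the input values in `x`, the `matmul` helper, and the variable names are made up for illustration, and the bias term is left out.

```python
# The three weight vectors from the question, one [w1, w2] pair each.
W_rows = [[1, 2], [-3, 4], [5, -6]]   # row layout proposed in the question (3 x 2)

# Column layout shown in the lesson (2 x 3): simply the transpose.
W_cols = [list(col) for col in zip(*W_rows)]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows (illustrative helper)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

x = [[10, 20]]  # hypothetical input: 1 example with 2 features (1 x 2)

# Column layout: input on the left, (1 x 2) times (2 x 3) gives (1 x 3).
z_cols = matmul(x, W_cols)

# Row layout: weights on the left, input as a column,
# (3 x 2) times (2 x 1) gives (3 x 1).
x_col = [[v] for v in x[0]]
z_rows = matmul(W_rows, x_col)

print(W_cols)   # [[1, -3, 5], [2, 4, -6]]
print(z_cols)   # [[50, 50, -70]]
print(z_rows)   # [[50], [50], [-70]]
```

With NumPy arrays the same comparison would be `x @ W_cols` against `W_rows @ x.T`. One practical effect of the column layout is that a batch of inputs stored one example per row can be multiplied directly, without transposing anything.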

@Let-ee, I agree that the W matrix implementation seems inverted compared to the vertical drawing of layer units. I’m glad you mentioned it.

If we accept this, I think the matrix math works out like this.

Do these matrix sizes look right? (I’m not worried about the sub-scripts and superscripts.)