Implementation of forward prop in numpy (https://www.coursera.org/learn/advanced-learning-algorithms/lecture/fZYiN/general-implementation-of-forward-propagation)

I am confused by the way W is represented in the video above:


Here, W is represented as a 2 by 3 matrix, with each neuron's weight vector stored as the corresponding column. Why can we not use a 3 by 2 matrix instead, where each weight vector is a row of the matrix? The code inside the loop would then become w = W[j] instead of w = W[:, j].
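For concreteness, here is a minimal sketch of the two conventions side by side. It assumes a sigmoid activation and a loop in the style of the lab's dense function; the names dense_cols and dense_rows are mine, not from the course:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Course convention: W has shape (n_in, n_units); neuron j's weights are column W[:, j].
def dense_cols(a_in, W, b):
    units = W.shape[1]
    a_out = np.zeros(units)
    for j in range(units):
        w = W[:, j]                    # weight vector of neuron j (a column)
        a_out[j] = sigmoid(np.dot(w, a_in) + b[j])
    return a_out

# Transposed convention: W has shape (n_units, n_in); neuron j's weights are row W[j].
def dense_rows(a_in, W, b):
    units = W.shape[0]
    a_out = np.zeros(units)
    for j in range(units):
        w = W[j]                       # weight vector of neuron j (a row)
        a_out[j] = sigmoid(np.dot(w, a_in) + b[j])
    return a_out

a_in = np.array([-2.0, 4.0])
W = np.array([[1.0, -3.0, 5.0],
              [2.0, 4.0, -6.0]])       # 2 by 3, columns are neurons
b = np.array([-1.0, 1.0, 2.0])

print(dense_cols(a_in, W, b))          # column convention
print(dense_rows(a_in, W.T, b))        # row convention on the transpose: same output
```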

I have just finished the lab that immediately follows this lecture, and when I made the above change it seemed to work fine. Is there a particular reason we follow the column convention?

There is no universal standard for how the weight matrix is organized. It depends on the author's preference and on the orientation of the training data in the 'X' matrix.
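To illustrate the point about the orientation of X, here is a sketch under assumed shapes (m examples, n_in inputs, n_units neurons; the variable names are made up). Once you vectorize the layer as a single matrix product, the orientation of W has to match the orientation of X:

```python
import numpy as np

m, n_in, n_units = 4, 2, 3
rng = np.random.default_rng(0)
X_rows = rng.normal(size=(m, n_in))    # examples stored as rows (the course's convention)
W = rng.normal(size=(n_in, n_units))   # columns = neurons
b = rng.normal(size=(1, n_units))

# Examples as rows, neurons as columns: one matmul computes the whole layer.
Z_rows = X_rows @ W + b                # shape (m, n_units)

# Transpose W so neurons are rows, and examples must be stored as columns instead.
X_cols = X_rows.T                      # shape (n_in, m)
Z_cols = W.T @ X_cols + b.T            # shape (n_units, m)

print(np.allclose(Z_rows, Z_cols.T))   # True: same numbers, different layout
```

Both layouts compute the same values, so either works as long as the indexing (or the matrix product) is kept consistent with it.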
