Confusion about the structure of matrices X and W

In the video ‘Neural Network Representation’ of week 3, W[1] is shown as a (4, 3) matrix and b[1] as a (4, 1) vector. From my understanding, this means that each row equals one of the parameter vectors of the first layer, and that w[2]_3 is the row vector of parameters of the third neuron in the second layer.

Apparently (see screenshot) this is not the case. So can anyone help with the definitions of W, and perhaps X as well? If I remember correctly, one of the videos mentions that features are stored as columns (which would imply that each example is stored as a row), but in practice it seems that each example is stored as a column instead.

Any help or guidance would be appreciated, thanks!



Hi @DeMann,

Perhaps going over the Standard notations for Deep Learning, found here, will help you understand better.

Best,
Mubsi


Hello DeMann,

Prof Ng uses W to denote the weights per neuron with respect to the layers. For instance, w[1]_1 is the column vector of weights for the first neuron in the first layer, and the rows of W[1] are the transposes of these per-neuron vectors. X, meanwhile, is the full sample matrix, with each column being one input vector (one training example).
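To make this concrete, here is the layout implied by the lecture’s (4, 3) example (a sketch of the standard notation, assuming 3 input features and 4 hidden units): each row of W[1] is the transpose of one neuron’s weight vector, while X stacks the examples as columns:

$$
W^{[1]} =
\begin{bmatrix}
w_1^{[1]T} \\ w_2^{[1]T} \\ w_3^{[1]T} \\ w_4^{[1]T}
\end{bmatrix}
\in \mathbb{R}^{4 \times 3},
\qquad
b^{[1]} \in \mathbb{R}^{4 \times 1},
\qquad
X =
\begin{bmatrix}
x^{(1)} & x^{(2)} & \cdots & x^{(m)}
\end{bmatrix}
\in \mathbb{R}^{3 \times m},
$$

so that $Z^{[1]} = W^{[1]} X + b^{[1]}$ has shape $(4, m)$, one column per example.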

Besides, Mubsi has pointed you to one of the resources that will help you understand the notation further. A small shape check in NumPy is sketched below as well.
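A minimal sketch (hypothetical variable names, assuming 3 input features, 4 hidden units, and 5 examples) that verifies these shapes:

```python
import numpy as np

n_x, n_h, m = 3, 4, 5           # input features, hidden units, training examples

X = np.random.randn(n_x, m)     # each COLUMN is one training example: (3, 5)
W1 = np.random.randn(n_h, n_x)  # each ROW is one neuron's weight vector, transposed: (4, 3)
b1 = np.zeros((n_h, 1))         # one bias per neuron, broadcast across examples: (4, 1)

Z1 = W1 @ X + b1                # (4, 3) @ (3, 5) + (4, 1) -> (4, 5)
A1 = np.tanh(Z1)                # activations, one column per example

print(Z1.shape)                 # (4, 5)
```

Under this layout, the weight vector of the third neuron in layer 1 is the row W1[2, :] of the matrix, i.e. w[1]_3 stored transposed.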



Thank you Mubsi and Rashmi, all clear.