In the first layer with 3 units, we expect W to have a size of (2,3) … .
Do I understand correctly that W is a 2-D matrix of shape (2, 3), i.e. 2 rows x 3 columns, because we have 3 units/neurons in the first layer, each of which is “fed” a training sample that has 2 features (temp and duration)? If so, why is it not of shape (3, 2)?
In the second layer with 1 unit, we expect W to have a size of (3,1) … .
Is the W in the second layer a 2-D matrix of shape (3, 1), and is that because the output of layer 1 (the input to layer 2) comes from 3 units? Why do we then have 1 column (and not 2, for the 2 features)?
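For reference, here is a quick way to inspect the shapes TensorFlow actually creates (a minimal sketch mirroring the lab's 2 -> 3 -> 1 network; the layer names are my own labels, not from the lab):

```python
import tensorflow as tf

# Minimal model mirroring the lab: 2 features -> 3 units -> 1 unit
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2,)),                          # temp, duration
    tf.keras.layers.Dense(3, activation="sigmoid", name="layer1"),
    tf.keras.layers.Dense(1, activation="sigmoid", name="layer2"),
])

W1, b1 = model.get_layer("layer1").get_weights()
W2, b2 = model.get_layer("layer2").get_weights()
print(W1.shape, b1.shape)   # (2, 3) (3,)  -> one column of weights per unit
print(W2.shape, b2.shape)   # (3, 1) (1,)  -> one column for the single unit
```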
Thanks, TMosh! One more question: why is the first weight matrix of shape (2, 3) and not (3, 2), which would keep it consistent with the shape of the second weight matrix?
The first reply matches how it is in the lab.
If I apply the 2nd comment, that the shape is typically (outputs x inputs), to the neural network in this lab:
Layer 1's W should be (3,2), because the output is 3 activation values and the input has 2 features.
Layer 2's W should be (1,3), because you get 1 activation output value from 3 inputs.
So intuitively I see the W matrix for Layer 1 as (3,2) and the W matrix for Layer 2 as (1,3), as in the sketch below.
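Here is a minimal NumPy sketch of how I picture that (random placeholder weights, one sample); the last line shows how I understand it maps onto the lab's transposed layout:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([200.0, 17.0])      # one sample: temp, duration

# (outputs x inputs) convention: row j holds the weights of unit j
W1 = np.random.randn(3, 2); b1 = np.zeros(3)   # layer 1: 3 units, 2 inputs
W2 = np.random.randn(1, 3); b2 = np.zeros(1)   # layer 2: 1 unit, 3 inputs

a1 = sigmoid(W1 @ x + b1)        # shape (3,)
a2 = sigmoid(W2 @ a1 + b2)       # shape (1,)

# The lab's (inputs x outputs) layout is just the transpose of each W:
a1_alt = sigmoid(x @ W1.T + b1)  # same values as a1
```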
Please advise. I would really like to sort this confusion out.
Thank you
It’s clear now. Prof. Andrew explains the parameter W shape in Section: Neural Network Implementation in Python, Video: General implementation of forward propagation.
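For anyone who finds this thread later, a short sketch in the spirit of that video's column-per-unit layout (my own helper names, not the course's exact code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense(a_in, W, b):
    """Forward prop through one layer; W has shape (n_inputs, n_units),
    so column j holds the weights of unit j."""
    units = W.shape[1]
    a_out = np.zeros(units)
    for j in range(units):
        w = W[:, j]                                # weights of unit j
        a_out[j] = sigmoid(np.dot(w, a_in) + b[j])
    return a_out

x  = np.array([200.0, 17.0])                       # 2 features
W1 = np.random.randn(2, 3); b1 = np.zeros(3)       # layer 1 W: (2, 3)
W2 = np.random.randn(3, 1); b2 = np.zeros(1)       # layer 2 W: (3, 1)
a2 = dense(dense(x, W1, b1), W2, b2)               # output, shape (1,)
```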
TMosh, thank you.