Size of W in [9] in C2_W1_Lab02_CoffeeRoasting_TF?

in [9]:

  • In the first layer with 3 units, we expect W to have a size of (2,3) … .

Do I understand it correctly that W is a 2-D matrix of size (2, 3) (2 rows × 3 columns), because the first layer has 3 units/neurons, each of which is “fed” a training sample with 2 features (temp and duration)? If so, why is it not of shape (3, 2)?

  • In the second layer with 1 unit, we expect W to have a size of (3,1) … .

Is W in the second layer a 2-D matrix of size (3, 1) because its input is the output of layer 1, which has 3 units? Why do we then have 1 column (not 2, for the 2 features)?

Thanks in advance!


Each weight matrix connects two adjacent layers.

  • So the first weight matrix has a shape of (number of input features, number of hidden layer units).

  • The second weight matrix has a shape of (number of hidden layer units, number of outputs).
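
For concreteness, here is a minimal sketch (the layer names and activations are assumed, mirroring the lab's 2-feature, 3-unit, 1-unit architecture) that prints the kernel shapes Keras actually stores:

```python
import tensorflow as tf

# A small model mirroring the lab's architecture (assumed):
# 2 input features -> 3-unit hidden layer -> 1-unit output layer.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),  # 2 features: temperature, duration
    tf.keras.layers.Dense(3, activation="sigmoid", name="layer1"),
    tf.keras.layers.Dense(1, activation="sigmoid", name="layer2"),
])

for layer in model.layers:
    W, b = layer.get_weights()
    print(layer.name, "W shape:", W.shape, "b shape:", b.shape)
# layer1 W shape: (2, 3) b shape: (3,)
# layer2 W shape: (3, 1) b shape: (1,)
```

Keras stores a Dense layer's kernel as (inputs, units), so the forward pass computes a = g(a_in @ W + b).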

Thanks, TMosh! One more question: why is the first weight matrix of shape (2, 3) and not (3, 2), which would keep it consistent with the shape of the second weight matrix?

If I recall correctly, the shape is typically (outputs x inputs). I haven’t checked the details on this assignment though.

Thank you for raising this question. I too am confused about it.
TMosh, I am still confused despite reading your responses.

These 2 responses seem contradictory:

  1. Size of W in [9] in C2_W1_Lab02_CoffeeRoasting_TF? - #2 by TMosh
  2. Size of W in [9] in C2_W1_Lab02_CoffeeRoasting_TF? - #5 by TMosh

The first reply matches with how it is in the lab.

If I apply the 2nd comment to the neural network in this lab,

shape is typically (outputs x inputs).

Layer 1's W should be of shape (3, 2), because the output is 3 activation values and the input has 2 features.
Layer 2's W should likewise be of shape (1, 3), because you get 1 activation output value from 3 inputs.

I see it intuitively as W matrix for Layer 1 as (3,2) and W matrix for Layer 2 as (1,3).
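
To make that intuition concrete, here is a minimal sketch (not the lab's code; biases omitted and zero weights used just to show shapes) of a forward pass under the (outputs × inputs) convention:

```python
import numpy as np

# Hypothetical (outputs x inputs) convention -- NOT what this TensorFlow lab uses.
W1 = np.zeros((3, 2))   # layer 1: 3 units, 2 input features
W2 = np.zeros((1, 3))   # layer 2: 1 unit, 3 inputs from layer 1

x = np.array([200.0, 17.0])        # (temp, duration)
a1 = 1 / (1 + np.exp(-(W1 @ x)))   # shape (3,)
a2 = 1 / (1 + np.exp(-(W2 @ a1)))  # shape (1,)
print(a1.shape, a2.shape)          # (3,) (1,)
```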

Please advise. I would really like to sort this confusion out.
Thank you :pray:

It’s clear now. Prof. Andrew explains the parameter W shape in Section: Neural Network implementation in Python. Video: General implementation of forward propagation.
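
For anyone landing here later, a minimal sketch of that video's dense() function (assuming, as in the lab, that W stores one column of weights per unit, i.e. shape (inputs, units)):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def dense(a_in, W, b):
    """One layer of forward prop; W has shape (n_inputs, n_units),
    so column j holds the weights of unit j."""
    units = W.shape[1]
    a_out = np.zeros(units)
    for j in range(units):
        a_out[j] = sigmoid(np.dot(a_in, W[:, j]) + b[j])
    return a_out

W1 = np.zeros((2, 3)); b1 = np.zeros(3)   # layer 1: (2 features, 3 units)
W2 = np.zeros((3, 1)); b2 = np.zeros(1)   # layer 2: (3 inputs, 1 unit)
a1 = dense(np.array([200.0, 17.0]), W1, b1)
a2 = dense(a1, W2, b2)
print(a1.shape, a2.shape)   # (3,) (1,)
```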
TMosh, thank you.
