Dimension of Weight Matrix

Hi,

This question has been asked twice in different ways, but I didn’t see a clear answer.

Question: Say there is an NN with L layers. There are 4 input features, and the first hidden layer has 5 neurons. In this case, what would be the dimensions of W[1], i.e. the weight matrix for the weights between the inputs and the first layer?

My answer: I think the dimensions should be 4x5. This is because Z = W[1]T.X + b. Since X has 4 features and there are m examples, the dimensions of X would be 4xm. Thus, if W[1] has dimensions 4x5, then W[1]T will have dimensions 5x4, which is what the matrix product with X requires.

However, in previous versions of this question, as well as in the Week 3 quiz, the dimensions of W are given as (number of neurons, number of input features), i.e. 5x4. This doesn't add up for me, and I was wondering if someone could explain?
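To make my confusion concrete, here is a tiny numpy sketch of the shapes I described above (the values are random placeholders and m = 10 is arbitrary; only the shapes matter):

```python
import numpy as np

m = 10                      # any number of training examples
X = np.random.randn(4, m)   # 4 input features, m examples -> (4, m)

# my assumption: W[1] is (4, 5), so W[1].T is (5, 4)
W1 = np.random.randn(4, 5)
b1 = np.random.randn(5, 1)

Z1 = np.dot(W1.T, X) + b1   # (5, 4) . (4, m) -> (5, m), b1 broadcasts over columns
print(Z1.shape)             # (5, 10)
```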

Hi @AAG,

Perhaps going over the Standard notations for Deep Learning, found here, will help you understand better.

Best,
Mubsi


Hello AAG,

From Week 3 onwards, you don't have to worry about transposes: the weight matrices are already defined with the shapes needed for forward propagation.


Hi Mubsi,

Thank you for your response. I checked out the notations document. It states the following:

  1. X \in \mathbb{R}^{n_x \times m} is the input matrix
  2. W^{[l]} \in \mathbb{R}^{\text{number of units in next layer} \times \text{number of units in the previous layer}} is the weight matrix; the superscript [l] indicates the layer

Based on the example I gave above, where the input has 4 features and the first hidden layer has 5 neurons, the dimensions will be:

  1. X - (4,m)
  2. W[1] - (5,4)

Thus, W[1]T will have dimensions (4,5). Since the lecture formula for Z (W[1]T.X + b) requires a matrix product between W[1]T and X, the dimensions then don't match: the number of columns in W[1]T should equal the number of rows in X. Am I missing something here?
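Here is the same kind of shape check, but with W[1] shaped (5, 4) as in the notation document, while keeping the transpose from the lecture formula (again, random values and an arbitrary m; only the shapes matter):

```python
import numpy as np

m = 10
X = np.random.randn(4, m)    # (4, m)
W1 = np.random.randn(5, 4)   # (number of units in layer 1, number of input features)
b1 = np.random.randn(5, 1)

# keeping the transpose: W1.T is (4, 5), which cannot be multiplied with a (4, m) matrix
try:
    Z1 = np.dot(W1.T, X) + b1
except ValueError as err:
    print(err)               # shapes (4,5) and (4,10) not aligned
```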

Hi Rashmi,

Thank you for your response.

I understand that it won't be used, but I was wondering if you could still clarify my confusion. That would help me work my way through backprop a little better. Please also see my reply to Mubsi's comment.

Right, the transpose won’t be used. The correct formula is:

Z^{[1]} = W^{[1]} \cdot X + b^{[1]}

No transpose in sight, right? Since we now agree that W^{[1]} is 5 x 4 and X is 4 x m, there is no problem with that dot product.
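If it helps, here is a quick numpy sanity check of those shapes (random values and an arbitrary m = 10, purely to illustrate the dimensions):

```python
import numpy as np

m = 10                       # any number of examples
X = np.random.randn(4, m)    # (n_x, m) = (4, m)
W1 = np.random.randn(5, 4)   # (n_h, n_x) = (5, 4)
b1 = np.random.randn(5, 1)   # broadcasts across the m columns

Z1 = np.dot(W1, X) + b1      # (5, 4) . (4, m) -> (5, m)
print(Z1.shape)              # (5, 10)
```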

Hey Paulin,

Thank you for your response.

Yes, I understand this now. Also, from completing the Week 3 programming assignment, I saw that it is implemented the way you wrote it.

However, as you may be aware, the formula in the lectures looks different, and that discrepancy is what has been throwing me off.

All clear now.

Thank you.

Yes, there was a transpose in Week 2, but that is a different case. If you think you are seeing a transpose in the forward propagation in Week 3 or Week 4, then I think you are just misinterpreting what you are seeing. I'll bet it is the same slide that is discussed in this thread from a while back. Please have a look and see if that clears things up further.
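For completeness, here is a small sketch of the two conventions side by side, so you can see why the Week 2 transpose is a different case (random values and an arbitrary m; only the shapes matter):

```python
import numpy as np

m = 10
X = np.random.randn(4, m)        # (n_x, m)

# Week 2 (logistic regression): w is a single column vector (n_x, 1),
# so the transpose is what makes it a (1, n_x) row for the product
w = np.random.randn(4, 1)
b = 0.0
z = np.dot(w.T, X) + b           # (1, 4) . (4, m) -> (1, m)

# Week 3 (one hidden layer): W1 is defined as (n_h, n_x) from the start,
# so no transpose is needed in forward propagation
W1 = np.random.randn(5, 4)
b1 = np.zeros((5, 1))
Z1 = np.dot(W1, X) + b1          # (5, 4) . (4, m) -> (5, m)

print(z.shape, Z1.shape)         # (1, 10) (5, 10)
```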