This question has been asked twice in different ways, but I didn’t see a clear answer.

Question: Say there is an NN with L layers. There are 4 input features, and the first hidden layer has 5 neurons. In this case, what would be the dimensions of W^[1], i.e., the weight matrix for the weights between the inputs and the first hidden layer?

My answer: I think the dimensions should be 4×5. This is because Z = (W^[1])^T · X + b. Since X has 4 features and there are m examples, the dimensions of X would be 4×m. Thus, if W^[1] has dimensions 4×5, then (W^[1])^T has dimensions 5×4, which is what the matrix product requires.

However, in previous versions of this question, as well as in the Week 3 quiz, the dimensions of W are calculated as (number of neurons in the layer, number of input features), i.e., 5×4. This doesn't add up, and I was wondering if someone could explain?
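To show my reasoning concretely, here is a quick numpy shape check under my convention (layer sizes from the example above; m is an arbitrary illustrative value):

```python
import numpy as np

m = 10
X = np.random.randn(4, m)    # 4 input features, m examples -> (4, m)
W1 = np.random.randn(4, 5)   # my answer: (input features, neurons) = (4, 5)
b1 = np.zeros((5, 1))

Z1 = W1.T @ X + b1           # (5, 4) @ (4, m) -> (5, m): the transpose makes it work
print(Z1.shape)              # (5, 10)
```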

Thank you for your response. I checked out the notations document. It states the following:

X ∈ R^(n_x × m) is the input matrix

W^[l] ∈ R^(number of units in next layer × number of units in previous layer) is the weight matrix; the superscript [l] indicates the layer

Based on the example I gave above, where the input has 4 features and the first hidden layer has 5 neurons, the dimensions will be:

X: (4, m)

W^[1]: (5, 4)

Thus, (W^[1])^T will have dimensions (4, 5). Since calculating Z (as (W^[1])^T · X + b) requires a matrix product between (W^[1])^T and X, the dimensions would then not match: the number of columns in (W^[1])^T (5) should be the same as the number of rows in X (4). Am I missing something here?
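As a concrete check, here is a small numpy sketch using the notation document's shapes (illustrative values only):

```python
import numpy as np

m = 8
X = np.random.randn(4, m)    # (n_x, m) = (4, m)
W1 = np.random.randn(5, 4)   # (5, 4), per the notation document

try:
    Z1 = W1.T @ X            # (4, 5) @ (4, m): inner dimensions 5 and 4 don't match
except ValueError as err:
    print("shape mismatch:", err)
```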

I understand that it won't be used, but I was wondering if you could still clarify my confusion? That would help me work my way through backprop a little better. Please also see my reply to Mubsi's comment.

Yes, there was a transpose in Week 2, but that is a different case: in logistic regression, w is defined as a column vector of shape (n_x, 1), so z = w^T x + b needs the transpose to produce a (1, m) output. From Week 3 onward, W^[l] is defined directly with shape (n^[l], n^[l-1]), so forward propagation is Z^[l] = W^[l] A^[l-1] + b^[l], with no transpose. If you think you are seeing a transpose in the forward propagation in Week 3 or Week 4, then I think you are just misinterpreting what you are seeing. I'll bet it is the same slide that is discussed in this thread from a while back. Please have a look and see if that clears up things further.
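Here is a minimal numpy sketch of the two cases, with illustrative shapes (n_x = 4, 5 hidden units, m examples):

```python
import numpy as np

n_x, m = 4, 8
X = np.random.randn(n_x, m)

# Week 2 (logistic regression): w is a column vector (n_x, 1),
# so the transpose is needed to get a (1, m) row of pre-activations.
w = np.random.randn(n_x, 1)
b = 0.0
z = w.T @ X + b              # (1, n_x) @ (n_x, m) -> (1, m)

# Weeks 3/4: W1 is defined as (n^[1], n^[0]) = (5, 4) from the start,
# so forward propagation needs no transpose.
W1 = np.random.randn(5, n_x)
b1 = np.zeros((5, 1))
Z1 = W1 @ X + b1             # (5, 4) @ (4, m) -> (5, m)
print(z.shape, Z1.shape)     # (1, 8) (5, 8)
```

The two conventions store the same numbers; the only difference is whether the transpose happens in the definition of the matrix or in the formula.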