Problem understanding the shape of w

In the second week, we learn that in simple logistic regression, the weight matrix w has shape (n_x, 1). This means w has n_x rows and 1 column, so each feature weight is stored in its own row.
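For concreteness, here is a minimal NumPy sketch of the Week 2 shapes (the sizes and variable names are made up for illustration, with m examples stacked as columns of X):

```python
import numpy as np

n_x, m = 3, 5                      # 3 features, 5 training examples (made-up sizes)
X = np.random.randn(n_x, m)        # inputs stacked as columns, shape (n_x, m)
w = np.zeros((n_x, 1))             # Week 2 convention: one weight per feature, in its own row
b = 0.0

Z = np.dot(w.T, X) + b             # w is transposed inside the formula: z = w^T x + b
print(Z.shape)                     # (1, m): one prediction per example
```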

However, in the third week, second lecture, when we move on to hidden layers, we learn that the new weight matrix w has shape (n, n_x), where n is the number of neurons in that layer and n_x is the number of “features” (i.e., the number of neurons in the previous layer).
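Again just as a sketch (sizes made up), the Week 3 convention for a layer of n neurons fed by n_x inputs looks like this:

```python
import numpy as np

n_x, n, m = 3, 4, 5                # previous-layer size, layer size, batch size (made-up)
A_prev = np.random.randn(n_x, m)   # activations from the previous layer (or X itself)
W = np.random.randn(n, n_x) * 0.01 # Week 3 convention: (units in this layer, units feeding it)
b = np.zeros((n, 1))

Z = np.dot(W, A_prev) + b          # no transpose needed: Z = W A_prev + b
print(Z.shape)                     # (n, m): one row per neuron in this layer
```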

Towards the end of the video, the professor draws a network with 3 inputs, one hidden layer of 4 neurons, and one output neuron. If we focus only on the hidden layer and the output layer, it looks like logistic regression, with the hidden layer activations serving as inputs. From the second week’s perspective, I would expect the weight matrix to be (n_x, 1), which in this case would be (4, 1). However, the professor says it’s (1, 4). Why does this difference occur? It’s confusing me a lot.

Thanks!

It is a very loose standard, but generally a weight matrix's shape is outputs x inputs: the number of neurons in the current layer by the number of inputs feeding it. In Week 2, w is defined as (n_x, 1) and the transpose happens inside the formula, z = w^T x + b; in Week 3, W is defined with that transpose already built in, so Z = WX + b with no transpose. That is why the output layer's matrix in your example is (1, 4) rather than (4, 1), as the sketch below shows.
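Here is a hedged sketch of that rule applied to the 3 -> 4 -> 1 network from the question (the batch size and initialization are arbitrary, just to show the shapes):

```python
import numpy as np

# 3 inputs -> hidden layer of 4 units -> 1 output unit (the network from the question)
W1 = np.random.randn(4, 3) * 0.01  # hidden layer: 4 outputs x 3 inputs
b1 = np.zeros((4, 1))
W2 = np.random.randn(1, 4) * 0.01  # output layer: 1 output x 4 inputs -> (1, 4), not (4, 1)
b2 = np.zeros((1, 1))

X = np.random.randn(3, 7)          # 7 examples as columns (made-up batch size)
A1 = np.tanh(np.dot(W1, X) + b1)                # shape (4, 7)
A2 = 1 / (1 + np.exp(-(np.dot(W2, A1) + b2)))   # shape (1, 7), sigmoid output
print(W1.shape, W2.shape, A2.shape)
```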


Here’s a historical thread with some explanation of what is happening when Prof Ng defines the weight matrices in DLS C1 Week 3.


Well, here is the course convention for “deep neural network classification” (as opposed to “single-layer network doing logistic regression”). I’m continually updating this diagram with observations gleaned from later parts of the course. Hopefully it’s informative.
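In code form, the general convention the diagram summarizes is that W[l] has shape (n[l], n[l-1]) and b[l] has shape (n[l], 1). A minimal sketch, assuming the layer sizes are given as a list (the helper name and initialization scale are my own, for illustration):

```python
import numpy as np

def init_params(layer_dims, seed=1):
    """Parameter shapes under the course convention:
    W[l] is (n[l], n[l-1]) and b[l] is (n[l], 1)."""
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layer_dims)):
        params[f"W{l}"] = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * 0.01
        params[f"b{l}"] = np.zeros((layer_dims[l], 1))
    return params

# the 3 -> 4 -> 1 network discussed earlier in the thread
params = init_params([3, 4, 1])
print(params["W1"].shape, params["W2"].shape)  # (4, 3) (1, 4)
```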