In the concluding segment of the video on “Vectorizing Multiple Examples” in Week 3 of Course 1, shouldn’t the vertical index for Z^[1] still be indexing the number of hidden units in the first hidden layer rather than the number of input units in the input layer?

Assume there are 3 input features (x1, x2, x3), 4 hidden units in the first hidden layer, and m training examples. Z^{[1]} should then have dimensions 4 \times m, where 4 corresponds to the number of hidden units in the first hidden layer, and not 3 \times m, no?

Hi Jason, if you wish, you could paste in a screenshot of the slide that is giving you pause. But yes, Z^{[1]} should be 4 \times m, where 4 is the number of outputs from the input layer (W^{[1]} is 4 \times 3 and X is 3 \times m).

Hi Kenb, thanks for your response. My question was about what Prof. Andrew narrated during that lecture video; it was not explicitly shown on the slide. Does your “number of outputs from the input layer” refer to the number of hidden units in the first hidden layer? There are only 3 input-layer units: x1, x2, and x3. I’d like to confirm that the dimensions of Z^{[1]} are (number of hidden units in layer 1) \times (number of training examples)?

Yes. Suppose we have your three inputs (x1, x2, and x3) and the first layer has four (hidden) units:

Z^{[1]} = W^{[1]}X + b^{[1]}

We know the dimension of X, i.e. A^{[0]}, is 3 \times m. We further know that we want 4 (hidden) units in layer 1; that’s our design choice in this example. So Z^{[1]} must be 4 \times m, which implies that W^{[1]} must be 4 \times 3. The bias term b^{[1]} must have 4 rows, i.e. it is a 4 \times 1 vector that NumPy broadcasts into a 4 \times m matrix (all columns identical).
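A quick NumPy sketch may help confirm the shapes. The values here are random placeholders (m = 5 is an arbitrary choice for this example); only the dimensions matter:

```python
import numpy as np

m = 5                       # number of training examples (arbitrary for this sketch)
X = np.random.randn(3, m)   # A^{[0]}: 3 input features x m examples
W1 = np.random.randn(4, 3)  # 4 hidden units x 3 inputs
b1 = np.random.randn(4, 1)  # one bias per hidden unit; broadcast across columns

Z1 = W1 @ X + b1            # (4, 3) @ (3, m) -> (4, m), plus broadcast bias
print(Z1.shape)             # -> (4, 5), i.e. (hidden units, training examples)
```

Note that b1 is created as shape (4, 1), not (4,): keeping the explicit column dimension is what lets NumPy broadcast it cleanly across all m columns of W1 @ X.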