Week 3: Mismatch between the section 4 picture and the layer_sizes test


I’ve passed the assignment, but I find myself disagreeing with the test for the layer_sizes function.

The picture in section 4 proposes implementing an NN with 1 output neuron, which to me means that n_y is 1. Such an implementation fails layer_sizes_test. The test clearly requires n_y to be Y.shape[0], which contradicts the picture. Also, n_y > 1 doesn’t seem to work in the assignment, since the cost function implementation fits binary classification only.

Please consider changing the tests so that n_y is required to be 1.


I think you are missing a fundamental point here: all the functions we are writing here are “general” in the sense that they will work with any dimensions, not simply the specific dimensions of the actual data we are using in this exercise. Why would you want to write the layer_sizes function in a way that ignores the input and hard-codes the size of the output layer?

Of course we have to hard-code the size of the hidden layer here, because there is no way to deduce that from the shapes of the inputs and outputs. If they were going to change anything about this, I would suggest they make the size of the hidden layer a parameter to the function and then there would be no need to “hard-code” anything.
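To make the suggestion concrete, here is a minimal sketch of what a shape-driven layer_sizes could look like with the hidden layer size passed in as a parameter. This is not the assignment's actual code: the n_h parameter, its default of 4, and the assumption that X has shape (n_x, m) and Y has shape (n_y, m) are all illustrative choices on my part.

```python
import numpy as np

def layer_sizes(X, Y, n_h=4):
    """Return (n_x, n_h, n_y) for a network with one hidden layer.

    Assumes X has shape (n_x, m) and Y has shape (n_y, m), where m is
    the number of examples. n_h is supplied by the caller; the default
    of 4 is an arbitrary illustrative value, not taken from the thread.
    """
    n_x = X.shape[0]  # input layer size, read from the data
    n_y = Y.shape[0]  # output layer size, read from the labels
    return n_x, n_h, n_y

# Binary-classification data: 2 features, 1 label row, 400 examples.
X = np.zeros((2, 400))
Y = np.zeros((1, 400))

print(layer_sizes(X, Y))          # (2, 4, 1)
print(layer_sizes(X, Y, n_h=10))  # (2, 10, 1)
```

With binary labels stored as a single row, Y.shape[0] is already 1, so the general version agrees with the picture without hard-coding anything.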

Hi Paul,

I would certainly want to write a general function that is going to be used in the assignment. Would you agree that a generic n_y > 1 is not used here, since all the datasets contain only 2 classes and the only loss described in the lectures so far (as I recall) is binary cross-entropy?

Yes, you are correct that all binary classifiers have a single neuron in the output layer. But their point is to teach you how to write general code by using the “shape” attribute of numpy arrays. So what would it really buy us to hard-code the n_y value to 1? And now you’ve got a general purpose function that will still work fine if you decide to apply it to a multi-class classifier with a softmax output layer.
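As a rough illustration of why reading Y.shape[0] generalizes, the sketch below compares a binary label array with a one-hot encoded multi-class one. The one_hot helper and the example labels are hypothetical and only assume the (n_y, m) label layout discussed above.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical helper: turn integer labels of shape (m,) into a
    one-hot matrix of shape (num_classes, m)."""
    return np.eye(num_classes)[labels].T

# Binary labels: one row, one column per example.
Y_binary = np.array([[0, 1, 1, 0]])

# Three-class labels, one-hot encoded: one row per class.
Y_multi = one_hot(np.array([0, 2, 1, 2]), num_classes=3)

print(Y_binary.shape[0])  # 1 -> a single output neuron
print(Y_multi.shape[0])   # 3 -> one output neuron per class
```

In both cases n_y = Y.shape[0] gives the right output layer size, whereas a hard-coded 1 would only fit the binary case.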

So what would it really buy us to hard-code the n_y value to 1?

Nothing, I agree.

Also, it seems to me that the exercise runs ahead of the lectures, since multi-class classification and softmax have not been defined yet. Therefore, please consider moving the exercise to a later assignment that covers multiple classes (I believe one exists).

Thank you!