W3_A1_Ex-2_Layer_sizes_Dimension_of_output_layer

It is about Exercise 2 - layer_sizes. The instructions state that n_y is the size of the output layer. I am able to do the exercise, but I do not understand why the size of the output layer (n_y) is equal to 2. It is binary classification, so the outputs should be either 0 or 1. Why can the size of the output layer be 2? I do not understand this and cannot visualize it. Also, at the top (in Exercise 1), the instructions state that the shape of Y is (1, 400). Why is the size of the output layer not 1? Can anyone help? Thanks so much!

5 Likes

Hi, n_y is 1 for the planar dataset; however, it is 2 for the test case used after the layer_sizes function. As you can see, the layer_sizes function is applied to t_X and t_Y, not to X and Y. You can check the result when you apply it to X and Y.

t_X, t_Y = layer_sizes_test_case()
(n_x, n_h, n_y) = layer_sizes(t_X, t_Y)
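For comparison, you can also run it on the real planar dataset (assuming X and Y are already loaded in the notebook):

print(layer_sizes(X, Y))  # (2, 4, 1) for the planar dataset: n_y is 1, since Y.shape is (1, 400)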
6 Likes

This is quite confusing (why use random data that was not mentioned before?), and it does not quite work: “AssertionError: Wrong result. Expected (7, 4, 5) got (5, 4, 2)”

1 Like

If it doesn’t work, it means your code is not correct. Please share your full error.

1 Like

I was hard-coding the numbers because of how confusing (yet simple) this exercise is.
The scope of work should be better explained here.
Several people have had similar issues.

1 Like

What are t_X and t_Y? How are they related to X.shape and Y.shape?

1 Like

They aren’t related. Those are just test case values generated by calling that function. They are then passed to your function and the answers are checked.

Within the body of your layer_sizes function, you simply reference the shapes of your input parameters in order to determine the dimensions.

It’s always a mistake to reference global variables within the body of one of our functions here in these courses, if that’s what you are asking.
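For concreteness, a minimal sketch of such a shape-driven implementation (assuming the hidden layer size is fixed at 4, as in this assignment) might look like:

def layer_sizes(X, Y):
    # X has shape (n_x, number of examples); Y has shape (n_y, number of examples)
    n_x = X.shape[0]  # size of the input layer, read from the data
    n_h = 4           # size of the hidden layer, fixed at 4 for this exercise
    n_y = Y.shape[0]  # size of the output layer, read from the labels
    return (n_x, n_h, n_y)

Because the sizes are read from the shapes of the function's own parameters (not from the globals X, Y, t_X, or t_Y), the same function gives the right answer for the planar dataset and for any test case passed in.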

1 Like

I am not sure I understand. In Ex. 4, t_X, parameters = forward_propagation_test_case() just provides test cases for the forward_propagation function. In this test case, t_X.shape = (2, 3) and parameters["W1"].shape = (4, 2). So obviously I receive an error when trying to compute Z1 = np.dot(W1.T, t_X) + b1. It seems forward_propagation_test_case() is not created correctly.

1 Like

Remember this rule: To multiply two matrices, the number of columns in the first matrix must match the number of rows in the second matrix.

So, we have W1 (first matrix) and t_X (second matrix). Do you think we need to transpose any of them to match the above rule?
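To make the rule concrete with the shapes from the test case above, here is a quick sketch (the values are random; only the shapes matter):

import numpy as np

W1 = np.random.randn(4, 2)   # shape (4, 2)
t_X = np.random.randn(2, 3)  # shape (2, 3)

Z = np.dot(W1, t_X)          # (4, 2) times (2, 3) gives (4, 3): inner dimensions match
# np.dot(W1.T, t_X) would raise an error: (2, 4) times (2, 3) do not align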

1 Like

OK, so I rearranged the matrices in np.dot, and there is no error in the Z1 and Z2 calculations. But then in Ex. 4 my A2 = [[0.21442387 0.21436745 0.21442857]], which does not match the expected value A2 = [[0.21292656 0.21274673 0.21295976]]. What could be my mistake?

1 Like

It is worth checking the equations again:

Z^{[1]} = W^{[1]} X + b^{[1]}

A^{[1]} = \tanh(Z^{[1]})

Z^{[2]} = W^{[2]} A^{[1]} + b^{[2]}

\hat{Y} = A^{[2]} = \sigma(Z^{[2]})

Use sigmoid for A^{[2]}.
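In code, the four equations above translate into something like this sketch (assuming parameters is a dict with keys "W1", "b1", "W2", "b2"; a minimal sigmoid is included for self-containment):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward_propagation(X, parameters):
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]
    Z1 = np.dot(W1, X) + b1   # W1 is (n_h, n_x), X is (n_x, m): no transpose needed
    A1 = np.tanh(Z1)          # hidden layer activation: tanh
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)          # output layer activation: sigmoid
    return A2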

1 Like

Thank you so much for your answer! I had defined A1 with sigmoid instead of tanh! It works well now.

1 Like

Great :+1: :+1:

1 Like

You must have already fixed this if you passed the tests, but there’s another mistake there besides the transpose. Remember what I said in my earlier reply:

In the local scope of the forward_propagation function, t_X is a global variable. You should never access it directly; instead, reference the relevant parameter of the function, which is X in this case.
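In other words (illustrative snippet):

# inside def forward_propagation(X, parameters):
Z1 = np.dot(W1, t_X) + b1  # wrong: t_X is a global that happens to exist in the notebook
Z1 = np.dot(W1, X) + b1    # right: use the function's own parameter X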

2 Likes

I agree with others reporting the same problem while solving Ex. 2: please clarify in the instructions that n_x and n_y should be dynamically determined from the shapes. I was so focused on solving the flower dataset problem itself that I hardcoded (2, 4, 1) and got totally confused by the error message.

Alternatively, add a bit of logic to create non-(2, 4, 1) test cases and an assert that informs the user that the return value seems to have been hardcoded (assert … = (2,4,1)).
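For concreteness, such a check might look something like this (a hypothetical sketch; the shapes are chosen to match the (7, 4, 5) expected value from the existing test):

import numpy as np

t_X = np.random.randn(7, 3)  # n_x deliberately not 2
t_Y = np.random.randn(5, 3)  # n_y deliberately not 1
result = layer_sizes(t_X, t_Y)
assert result != (2, 4, 1), "The return value appears to be hardcoded."
assert result == (7, 4, 5), f"Wrong result. Expected (7, 4, 5), got {result}"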

On behalf of some learners coming after me: pretty please?

There is an important lesson here that you are missing: we always want to write general-purpose code. Why even have the layer_sizes routine if you're just going to hard-code everything to (2, 4, 1)? We could just write one big function that does everything, but then for every problem we would have to write everything from scratch, instead of having modular support functions that are reusable for other problems. We'll see the best example of this when we get to Week 4 of DLS Course 1, so please stay tuned for that.

Then in later courses (C2 and C4) we will learn to use TensorFlow, which is written entirely with general-purpose functions that can handle any dimensions.