W3_A1_Input Layer_mismatched

When we load the dataset using `load_planar_dataset()`, it returns a tuple `(X, Y)`, where `X` is a numpy array of shape `(2, 400)` and `Y` is a numpy array of shape `(1, 400)`. This means that the dataset has 400 examples, and each example is represented by a vector of 2 features (the x and y coordinates of the point) along with 1 output label.

Therefore, when we compute `n_x`, we need to use `X.shape[0]`, which gives us the number of features in the input, i.e., the size of the first dimension of `X`. In this case, `X.shape[0]` is equal to 2, which is why `n_x` should also be equal to 2.
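As a quick sanity check, the shape indexing described above can be reproduced with zero-filled stand-in arrays (used here instead of the actual planar dataset, purely for illustration):

```python
import numpy as np

# Stand-ins for the planar dataset: 2 features per example,
# 400 examples, 1 label per example.
X = np.zeros((2, 400))
Y = np.zeros((1, 400))

n_x = X.shape[0]  # rows of X = number of input features
n_y = Y.shape[0]  # rows of Y = number of output units

print(n_x, n_y)  # 2 1
```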

Similarly, when we compute `n_y`, we need to use `Y.shape[0]`, which gives us the number of output labels, i.e., the size of the first dimension of `Y`. In this case, `Y.shape[0]` is equal to 1, which is why `n_y` should also be equal to 1. However, when using:

```python
n_x = X.shape[0]
n_h = 4
n_y = Y.shape[0]
```

it returns:

```
The size of the input layer is: n_x = 5
The size of the hidden layer is: n_h = 4
The size of the output layer is: n_y = 2
```
`n_h` is understandable, but why are `n_x = 5` and `n_y = 2`, when, as I discussed above, I expect `n_x = 2` and `n_y = 1`?
Any help would be appreciated.
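For reference, the numbers above are consistent with the test calling the function on separate test arrays rather than on the dataset itself. A minimal sketch (the exact test-array shapes `(5, 3)` and `(2, 3)` are assumptions chosen only to reproduce the printed sizes):

```python
import numpy as np

def layer_sizes(X, Y):
    """Return (n_x, n_h, n_y) derived from the shapes of the arguments."""
    n_x = X.shape[0]  # input layer size = number of rows of X
    n_h = 4           # hidden layer size, fixed by the assignment
    n_y = Y.shape[0]  # output layer size = number of rows of Y
    return n_x, n_h, n_y

# Hypothetical test arrays whose shapes differ from the planar dataset
t_X = np.zeros((5, 3))
t_Y = np.zeros((2, 3))

print(layer_sizes(t_X, t_Y))  # (5, 4, 2)
```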

Hello @Hamid_Reza_Hamedi! I hope you are doing well.

I agree with your intuition about the shapes of X and Y. However, the output you mentioned actually comes from the shapes of `t_X` and `t_Y`, as indicated in the attached figure. I hope this clears up any confusion you may have had.

If you have any further questions or concerns, please don't hesitate to let me know.

Best,
Saif.

Thanks, @saifkhanengr, for the response. I see your point now, but it is still a puzzle for me. As far as I know, and it comes from the neural network figure in this exercise, n_x = n[0] = number of input features = 2, and n_y = number of output-layer nodes = 1. This contradicts n_x = 5 and n_y = 2.
Essentially, I do not see any relationship between the layer sizes and `t_X` and `t_Y`, as the sizes come from:

```python
n_x = X.shape[0]
n_y = Y.shape[0]
```

I would be thankful if you could help clarify this for me.

The point is that we are supposed to be writing general code here. The `layer_sizes` function should work for any shape of the inputs, right? It should not be "hard-coded" to assume that all problems have 2 features. So they wrote a test case that makes sure you didn't do any "hard-coding" by using different dimensions than those for the actual problem here.

This will be a continuing pattern throughout all these courses: it's a mistake to hard-code anything unless there is literally no choice and they explicitly tell you to (as in the case of `n_h` here).


Hello Hamid!

I think you are mixing the X and Y of the `load_planar_dataset()` and the X and Y of the `layer_sizes(X, Y)`. Both are different.

The `X, Y = load_planar_dataset()` means that X and Y are our data with shape (2, 400) and (1, 400), respectively. We can use any notation for that, like X_train, Y_train (but to pass the assignment, you have to use X and Y).

The X and Y of `layer_sizes(X, Y)` are not the planar dataset but the parameters of the `layer_sizes` function. Once we define that function, we do not have to use those exact names (X and Y) when calling it. For example, we call `layer_sizes(t_X, t_Y)`; here, the inputs to the function are `t_X` and `t_Y`, whose shapes are different from the X and Y of the planar dataset.
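This distinction can be seen in a few lines of Python (the array shapes below are just for illustration; `X` and `Y` inside the function are local names, bound to whatever the caller passes in):

```python
import numpy as np

def layer_sizes(X, Y):
    # X and Y here are parameter names local to this function; they are
    # bound to the arrays the caller passes in, whatever those are called.
    return (X.shape[0], 4, Y.shape[0])

X = np.zeros((2, 400))    # the planar-dataset shapes
Y = np.zeros((1, 400))
t_X = np.zeros((5, 3))    # test arrays with different shapes
t_Y = np.zeros((2, 3))

print(layer_sizes(X, Y))      # (2, 4, 1) -- sizes follow the dataset
print(layer_sizes(t_X, t_Y))  # (5, 4, 2) -- sizes follow the test arrays
```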

These are basic Python concepts. If you are not familiar with them, I highly recommend taking a Python course. Note: you don't need to become an expert in Python to complete the DLS, just familiar enough.

Best,
Saif.


Thanks, @saifkhanengr and @paulinpaloalto. It is clear now; I got the point. Sorry for asking about such basic issues!

Hello Hamid,

No need to apologize for asking basic questions. We all start somewhere, and it's important to have a strong foundation in the basics to build upon. I'm glad that your doubts are clear now, and if you have any more questions or concerns in the future, feel free to ask.

Best,
Saif.


Hello Saif!

I also struggled with this! I must say your answer here is fundamental; I suggest giving this same answer whenever the question comes up again, as it is very clear.

I am more than familiar with Python but still struggled for a few minutes. It is understandable that one confuses the X matrix and Y vector with the X and Y arguments to the function, as it is unintuitive!

I don't know whether it was built this way to test our understanding; if so, it is an interesting test.

Best,

Nicola