Hi there,

For the two_layer_model, I kept getting the following error:

```
Cost after iteration 1: 0.6564026188409187
Cost after first iteration: 0.6950464961800915
Cost after iteration 1: 0.7239781229671559
Cost after iteration 1: 0.7239781229671559
Cost after iteration 1: 0.7239781229671559
Error: Wrong output for variable W1.
Error: Wrong output for variable b1.
Error: Wrong output for variable W2.
Error: Wrong output for variable b2.
Error: Wrong output for variable 0.
Cost after iteration 2: 0.6372332154479152
Error: Wrong output for variable W1.
Error: Wrong output for variable b1.
Error: Wrong output for variable W2.
Error: Wrong output for variable b2.
Error: Wrong output for variable 0.
2 Tests passed
2 Tests failed
AssertionError: Not all tests were passed for two_layer_model. Check your equations and avoid using global variables inside the function.
```

I examined my whole code and I don't think I used anything other than the input variables; specifically, I used learning_rate as `update_parameters(parameters, grads, learning_rate)`. On the other hand, for the L_layer_model, the same code passes the test without error.

Can someone help me find out what's wrong? Thanks!

Lu

Hey @Lu_Huang,

Welcome to the community. Can you please DM your code for the `two_layer_model` function to me, so that I can help you figure out the issue?

Cheers,

Elemento

Just did, thanks a lot for the quick reply!

Hey @Lu_Huang,

You have used the `initialize_parameters_deep` function instead of the `initialize_parameters` function in your `two_layer_model` function. The former is for the L-layered model, while the latter is for the 2-layered model. Also, when passing arguments to the `initialize_parameters` function, you are supposed to pass them individually, whereas you have passed them as a tuple. I hope this helps.
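To illustrate the argument-passing issue, here is a minimal sketch. The function body below is a hypothetical stand-in for the assignment's helper (the real one lives in `dnn_app_utils_v3.py`, and the layer sizes here are just examples); the point is only that `initialize_parameters` expects three separate integers, not one tuple.

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y):
    # Hypothetical stand-in: expects three separate integers, not a tuple.
    np.random.seed(1)
    return {
        "W1": np.random.randn(n_h, n_x) * 0.01,
        "b1": np.zeros((n_h, 1)),
        "W2": np.random.randn(n_y, n_h) * 0.01,
        "b2": np.zeros((n_y, 1)),
    }

layers_dims = (12288, 7, 1)  # example (n_x, n_h, n_y)

# Wrong: passing the whole tuple as one argument raises a TypeError
# initialize_parameters(layers_dims)

# Right: pass the dimensions individually (unpacking works too)
params = initialize_parameters(*layers_dims)
print(params["W1"].shape)  # (7, 12288)
```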

Cheers,

Elemento

Oh, got it, thanks a lot for the help!

Wait, for the assignment we are dealing with here, `initialize_parameters` and `initialize_parameters_deep` have exactly the same logic. Is the difference because of the random seed?

Hey @Lu_Huang,

If you check out the implementations of `initialize_parameters` and `initialize_parameters_deep` that have been imported and used in this assignment (which you can view in the `dnn_app_utils_v3.py` file), you will find that in both cases the random seed is the same; it is, in fact, the implementations that slightly differ, which gives rise to this error.

Had you been using the functions that you implemented yourself in the previous assignment, the difference would indeed have been the random seed, but in this case it is not, as I just mentioned.

Feel free to check out the imported implementations and see the difference. As to why this difference exists, perhaps the developers got better performance from the L-layered model by incorporating it. Once again, feel free to use your own implementations and see whether this difference gives any advantage. Do share your results with the community.

Cheers,

Elemento

Exactly. It turns out that if you use the "plain vanilla" version of `initialize_parameters_deep` that they had us build in the "Step by Step" exercise, you get really terrible convergence for the 4-layer model here. So they used a more sophisticated algorithm called Xavier Initialization, which we will learn about in Course 2 of this series, so stay tuned for that. I think they didn't mention this just because there is already enough new material to be learned in Course 1 and they didn't want to further muddy the waters. Well, that and I'm guessing they didn't want to reveal that they'd given you the worked answers to the "Step by Step" exercise as an "import" file in the W4 A2 assignment. Of course, if you'd copied the "deep init" code, it would have failed the tests in "Step by Step", because the algorithm is different.
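To make the difference between the two initializations concrete, here is a small sketch (not taken from the assignment; the layer sizes are illustrative) comparing the weight scales that `* 0.01` and `/ np.sqrt(fan_in)` produce:

```python
import numpy as np

np.random.seed(1)
fan_in, fan_out = 20, 7  # e.g. a narrow hidden layer

# "Plain vanilla" Step-by-Step scaling: fixed 0.01 regardless of fan-in
W_simple = np.random.randn(fan_out, fan_in) * 0.01

# Xavier-style scaling used in dnn_app_utils_v3.py: adapts to fan-in,
# keeping activation variance roughly constant from layer to layer
W_xavier = np.random.randn(fan_out, fan_in) / np.sqrt(fan_in)

print("std with *0.01      :", W_simple.std())  # ~0.01
print("std with /sqrt(n)   :", W_xavier.std())  # ~1/sqrt(20), about 0.22
```

Note that for the 12288-unit input layer the two scales happen to be close (1/sqrt(12288) is about 0.009), so the difference mostly bites in the narrower hidden layers, where `* 0.01` makes the weights roughly 20-45x too small.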


Hi there!

I am having the exact same error, but in the `L_layer_model` section. My output is:

```
Cost after iteration 0: 0.6931489045172448
Cost after first iteration: 0.6931489045172448
Cost after iteration 1: 0.6930724756996511
Error: Wrong output for variable W1.
Error: Wrong output for variable b1.
Error: Wrong output for variable W2.
Error: Wrong output for variable b2.
Error: Wrong output for variable W3.
Error: Wrong output for variable b3.
Error: Wrong output for variable 0.
Cost after iteration 1: 0.6930724756996511
Cost after iteration 1: 0.6930724756996511
Cost after iteration 2: 0.6927503922903139
Error: Wrong output for variable W1.
Error: Wrong output for variable b1.
Error: Wrong output for variable W2.
Error: Wrong output for variable b2.
Error: Wrong output for variable W3.
Error: Wrong output for variable b3.
Error: Wrong output for variable 0.
2 Tests passed
2 Tests failed
AssertionError: Not all tests were passed for L_layer_model. Check your equations and avoid using global variables inside the function.
```

I tried changing the seed in `initialize_parameters_deep`, but I haven't solved it yet. Any idea?

I just checked the function in `dnn_app_utils_v3.py`, and there is a line that differs from the one I programmed (and passed) in the 1st assignment:

```
def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                  Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                  bl -- bias vector of shape (layer_dims[l], 1)
    """
    np.random.seed(1)
    parameters = {}
    L = len(layer_dims)  # number of layers in the network

    for l in range(1, L):
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) / np.sqrt(layer_dims[l-1])  #*0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))

        assert(parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l-1]))
        assert(parameters['b' + str(l)].shape == (layer_dims[l], 1))

    return parameters
```

Why do you divide the parameters `WX` by `np.sqrt(layer_dims[l-1])`? After doing the same I pass the test, but I don't understand why.

Thanks in advance

This was explained earlier on this thread. It turns out that the simple initialization algorithm that we built in the Step by Step exercise happens to give very poor convergence with the 4-layer model and the particular dataset that we have here. So they had to use a more sophisticated algorithm called Xavier Initialization, which we will learn about in Course 2 of this series. In that section Prof Ng will explain that the choice of initialization algorithm frequently matters and there is no single "silver bullet" algorithm that works well in all cases. They didn't make a big deal about that here because this is the very first course in the series and there is just too much other material to cover. Stay tuned for DLS C2 to learn more!

Also notice that they did not tell you to "hand import" your functions from the Step by Step exercise. They also didn't make a big deal about that, but they gave you the functions. My theory is that they didn't want to reveal that they'd given you the solutions to the Step by Step exercise in this assignment. Although the tests would fail in Step by Step if you copied over the "deep init" code from this assignment, because (as explained above) the algorithm is different.


Thanks for your reply. I understand that we will see the choice of initialization later in the specialization, but in the notebook it is explicitly said that we have to use the methods we built (including `initialize_parameters_deep`; screenshot attached). However, the grader evaluates with a different criterion and rejects the code because we multiply `np.random.randn` by `0.01` instead of dividing by `np.sqrt(layer_dims[l-1])`, and if we don't dig into the `dnn_...v03.py` file, we cannot find the origin of the error. Maybe I understood wrong, and I'm sorry if that's the case.

Cheers!

Well, I guess this is a little ambiguous, but note that they just told you to "use" (meaning call) those functions. They didn't tell you to copy or import them. If you had simply called them, everything would have been fine, because they already imported them for you. Note that in no exercise up to this point has it ever been required that we create a new code cell from scratch (Insert Cell), which is obviously required if you want to "hand import" your Step by Step functions.


You are right! My bad! Thanks again