Hi there,
For the two_layer_model, I keep getting the following error:
Cost after iteration 1: 0.6564026188409187
Cost after first iteration: 0.6950464961800915
Cost after iteration 1: 0.7239781229671559
Cost after iteration 1: 0.7239781229671559
Cost after iteration 1: 0.7239781229671559
Error: Wrong output for variable W1.
Error: Wrong output for variable b1.
Error: Wrong output for variable W2.
Error: Wrong output for variable b2.
Error: Wrong output for variable 0.
Cost after iteration 2: 0.6372332154479152
Error: Wrong output for variable W1.
Error: Wrong output for variable b1.
Error: Wrong output for variable W2.
Error: Wrong output for variable b2.
Error: Wrong output for variable 0.
2 Tests passed
2 Tests failed
“AssertionError: Not all tests were passed for two_layer_model. Check your equations and avoid using global variables inside the function.”
I examined my whole code and I don't think I used anything other than the input variables; specifically, I used learning_rate in update_parameters(parameters, grads, learning_rate). On the other hand, for the L_layer_model, the same code passes the test without error.
Can someone help me figure out what's wrong? Thanks!
Lu
Hey @Lu_Huang,
Welcome to the community. Can you please DM your code for the two_layer_model function to me, so that I can help you figure out the issue.
Cheers,
Elemento
Just did, thanks a lot for the quick reply!
Hey @Lu_Huang,
You have used the initialize_parameters_deep function instead of the initialize_parameters function in your two_layer_model function. The former is for the L-layered model, while the latter is for the 2-layered model. Also, when you pass arguments to the initialize_parameters function, you are supposed to pass them individually, whereas you have passed them as a tuple. I hope this helps.
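For illustration, a minimal sketch of the two call patterns (the names n_x, n_h, n_y and the values below are just examples, and initialize_parameters is assumed to be imported as the notebook does):

    n_x, n_h, n_y = 12288, 7, 1  # example layer sizes, for illustration only

    parameters = initialize_parameters(n_x, n_h, n_y)      # correct: three separate arguments
    # parameters = initialize_parameters((n_x, n_h, n_y))  # wrong: a single tuple argument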
Cheers,
Elemento
Oh, got it, thanks a lot for the help!
Wait, for the assignment we are dealing with here, initialize_parameters and initialize_parameters_deep have exactly the same logic. Is the difference because of the random seed?
Hey @Lu_Huang,
If you check out the implementations of initialize_parameters and initialize_parameters_deep that have been imported and used in this assignment (you can view them in the dnn_app_utils_v3.py file), you will find that the random seed is the same in both cases; in fact, it is the implementations that differ slightly, and that is what gives rise to this error.
Had you been using the functions that you implemented yourself in the previous assignment, the difference would indeed have been the random seed, but in this case it is not, as I just mentioned.
Feel free to check out the imported implementations and see the difference. As to why this difference was introduced, perhaps the developers got better performance from the L-layered model by incorporating it. Once again, feel free to use your own implementations and see whether this difference gives any advantage or not. Do share your results with the community.
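In case it helps, here is a quick way to view the imported implementations right in the notebook, without opening the file (a small sketch using Python's standard inspect module, assuming you run it in the assignment's directory):

    import inspect
    from dnn_app_utils_v3 import initialize_parameters, initialize_parameters_deep

    # Print the source of both imported functions to compare them side by side
    print(inspect.getsource(initialize_parameters))
    print(inspect.getsource(initialize_parameters_deep))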
Cheers,
Elemento
Exactly. It turns out that if you use the "plain vanilla" version of initialize_parameters_deep that they had us build in the "Step by Step" exercise, you get really terrible convergence for the 4 layer model here. So they used a more sophisticated algorithm called Xavier Initialization, which we will learn about in Course 2 of this series, so stay tuned for that. I think they didn't mention this just because there is already enough new material to be learned in Course 1 and they didn't want to further muddy the waters. Well, that and I'm guessing they didn't want to reveal that they'd given you the worked answers for the "Step by Step" exercise as an "import" file in the W4 A2 assignment. Of course, if you'd copied the "deep init" code, it would have failed the tests in "Step by Step", because the algorithm is different.
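For the curious, here is a rough sketch contrasting the two weight scalings (this is not the assignment's exact code, and the layer sizes are just for illustration):

    import numpy as np

    def init_plain(layer_dims):
        # "Plain vanilla" scaling from the Step by Step exercise: multiply by 0.01
        np.random.seed(1)
        return {'W' + str(l): np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
                for l in range(1, len(layer_dims))}

    def init_scaled(layer_dims):
        # Xavier-style scaling used in dnn_app_utils_v3.py: divide by sqrt(fan-in)
        np.random.seed(1)
        return {'W' + str(l): np.random.randn(layer_dims[l], layer_dims[l-1]) / np.sqrt(layer_dims[l-1])
                for l in range(1, len(layer_dims))}

    layer_dims = [12288, 20, 7, 5, 1]  # illustrative 4-layer sizes
    print(init_plain(layer_dims)['W2'].std())   # ~0.01 regardless of fan-in
    print(init_scaled(layer_dims)['W2'].std())  # ~1/sqrt(20) ≈ 0.22, adapts to fan-in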
Hi there!
I am having the exact same error, but in the L_layer_model section. My output is:
Cost after iteration 0: 0.6931489045172448
Cost after first iteration: 0.6931489045172448
Cost after iteration 1: 0.6930724756996511
Error: Wrong output for variable W1.
Error: Wrong output for variable b1.
Error: Wrong output for variable W2.
Error: Wrong output for variable b2.
Error: Wrong output for variable W3.
Error: Wrong output for variable b3.
Error: Wrong output for variable 0.
Cost after iteration 1: 0.6930724756996511
Cost after iteration 1: 0.6930724756996511
Cost after iteration 2: 0.6927503922903139
Error: Wrong output for variable W1.
Error: Wrong output for variable b1.
Error: Wrong output for variable W2.
Error: Wrong output for variable b2.
Error: Wrong output for variable W3.
Error: Wrong output for variable b3.
Error: Wrong output for variable 0.
2 Tests passed
2 Tests failed
AssertionError: Not all tests were passed for L_layer_model. Check your equations and avoid using global variables inside the function.
I tried changing the seed in initialize_parameters_deep, but I haven't solved it yet. Any idea?
I just checked the function in dnn_app_utils_v3.py, and there is a line that differs from the one I programmed (and passed with) in the 1st assignment:
def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                  Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                  bl -- bias vector of shape (layer_dims[l], 1)
    """
    np.random.seed(1)
    parameters = {}
    L = len(layer_dims)  # number of layers in the network

    for l in range(1, L):
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) / np.sqrt(layer_dims[l-1])  # *0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))

        assert(parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l-1]))
        assert(parameters['b' + str(l)].shape == (layer_dims[l], 1))

    return parameters
Why do you divide the parameters Wl by np.sqrt(layer_dims[l-1])? After doing the same I pass the test, but I don't understand why.
Thanks in advance!
This was explained earlier on this thread. It turns out that the simple initialization algorithm that we built in the Step by Step exercise happens to give very poor convergence with the 4 layer model and the particular dataset that we have here. So they had to use a more sophisticated algorithm called Xavier Initialization, that we will learn about in Course 2 of this series. In that section Prof Ng will explain that the choice of initialization algorithm frequently matters and there is no single “silver bullet” algorithm that works well in all cases. They didn’t make a big deal about that here because this is the very first course in the series and there is just too much other material to cover. Stay tuned for DLS C2 to learn more!
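To make that concrete, here is a small self-contained sketch (made-up layer width, linear layers only) showing how the signal's scale evolves under the two schemes:

    import numpy as np

    np.random.seed(1)
    n = 100                                        # units per layer (illustrative)
    a_plain = a_scaled = np.random.randn(n, 1000)  # a batch of inputs

    for _ in range(4):  # push the signal through 4 linear layers
        a_plain  = (np.random.randn(n, n) * 0.01) @ a_plain         # *0.01 scaling
        a_scaled = (np.random.randn(n, n) / np.sqrt(n)) @ a_scaled  # /sqrt(fan-in)

    print(a_plain.std())   # ~1e-4: activations collapse toward 0
    print(a_scaled.std())  # ~1: the signal's scale is preserved layer to layer

With the *0.01 scaling, the activations (and hence the gradients flowing back through the early layers) shrink at every layer, so the 4-layer model barely moves from its starting point, which matches the terrible convergence described above.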
Also notice that they did not tell you to “hand import” your functions from the Step by Step exercise. They also didn’t make a big deal about that, but they gave you the functions. My theory is that they didn’t want to reveal that they’d given you the solutions to the Step by Step exercise in this assignment. Although the tests would fail in Step by Step if you copied over the “deep init” code from this assignment because (as explained above) the algorithm is different.
Thanks for your reply. I understand that we will see the choice of initialization later in the specialization, but in the Notebook it is explicitly said that we have to import the methods we built (including initialize_parameters_deep; screenshot attached). However, the grader evaluates with a different criterion and rejects the code, because we multiply np.random.randn by 0.01 instead of dividing by np.sqrt(layer_dims[l-1]), and unless we dive into the dnn_app_utils_v3.py file, we cannot find the origin of the error. Maybe I understood wrong, and I'm sorry if that's the case.
Cheers!
Well, I guess this is a little ambiguous, but note that they just told you to “use” (meaning call) those functions. They didn’t tell you to copy or import them. If you had simply called them, everything would have been fine, because they already imported them for you. Note that in no exercise up to this point has it ever been required that we create a new code cell from scratch (Insert Cell), which is obviously required if you want to “hand import” your Step by Step functions.
You are right! My bad! Thanks again