Clarification on the reason for a failed exercise

C1W4A1: Building your Deep Neural Network: Step by Step

In the last exercise (no. 10)

```
parameters["W" + str(l+1)] = parameters["W" + str(l+1)] - learning_rate * grads["dW" + str(l + 1)]
parameters["b" + str(l+1)] = parameters["b" + str(l+1)] - learning_rate * grads["db" + str(l + 1)]
```

passes both tests while

```
parameters["W" + str(l+1)] -= learning_rate * grads["dW" + str(l + 1)]
parameters["b" + str(l+1)] -= learning_rate * grads["db" + str(l + 1)]
```

passes the first test but fails the second. Aren't they the same? Why does this happen?

```
In [1]: a = 1

In [2]: b = 2

In [3]: a -= b

In [4]: a
Out[4]: -1
```

If that were the case, then `a` would become 1 instead of -1.

I was waiting for a reply like this

Here the basic step in a neural network is updating the parameters: gradient descent improves them, with the step size scaled by the learning rate.

My explanation of a = a - b was not about which value is higher or lower, but about why the two snippets in your images give different results.

We update each parameter with gradient descent: the gradient computation gives the derivative of the loss with respect to that parameter. You first take the parameter and then apply the learning rate, a hyperparameter that controls how much the weights of the neural network change with respect to the loss gradient.

So that is why the update needs to be defined as a = a - b, and not because of the particular values (2 - 1 versus 1 - 2).

Sorry, I don't understand.

If you follow your image of a = b - a, you will not get the expected output. In simpler terms, your parameter update can have the opposite effect of the intended result.
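To make the sign issue concrete, here is a minimal sketch; the numeric values are made up purely for illustration:

```python
learning_rate = 0.5
W, dW = 3.0, 4.0  # a parameter and its gradient (illustrative values)

# Correct update: parameter minus (learning_rate * gradient)
W_correct = W - learning_rate * dW    # 3.0 - 2.0 = 1.0

# Reversed operands: the step moves in the opposite direction entirely
W_reversed = learning_rate * dW - W   # 2.0 - 3.0 = -1.0

print(W_correct, W_reversed)
```

With the operands reversed, every update pushes the parameter the wrong way, so training diverges instead of descending the loss.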

See the image just above the grader cell you shared: it mentions that when we update a parameter, we subtract the learning rate times the partial derivative of the cost with respect to that parameter. So the update must be the current parameter minus its derivative term, not the derivative term minus the parameter.

I believe `a -= b` is equivalent to `a = a - b`.

It's the same reason that `i += 1` is equivalent to `i = i + 1`.

Please read this thread to understand why that is. You need "copy" or "deepcopy" to break the connection to the global variables.

Yes, that's correct. That is the meaning of "-=" from an arithmetic p.o.v. But what's really going on here is more complicated than that. The numeric result is the same, but if the operands in question are "objects" (meaning pointers) then the way memory is managed is very different. The "-=" operator is "in place", meaning that it directly modifies the object in memory. The plain assignment allocates a new memory object for the RHS, so the original variable is not modified in memory. If the variable in question is an object passed as a parameter to a python function and you didn't first "copy" it, then you've modified global data with the "-=" approach. The test cases for this exercise are written in such a way that modifying the global data causes subsequent tests to fail.
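A small sketch of that difference, using NumPy arrays since that is what the parameter dictionaries in the assignment hold (the function names and values here are illustrative, not from the assignment):

```python
import numpy as np

def update_inplace(w, grad, lr):
    # "-=" calls ndarray.__isub__, which writes into the SAME buffer
    # the caller's array points to: the caller's data is mutated.
    w -= lr * grad
    return w

def update_rebind(w, grad, lr):
    # "w = w - ..." allocates a NEW array and rebinds the local name;
    # the caller's array is left untouched.
    w = w - lr * grad
    return w

w_global = np.array([1.0, 2.0])
grad = np.array([10.0, 10.0])

update_rebind(w_global, grad, 0.1)
print(w_global)   # unchanged: [1. 2.]

update_inplace(w_global, grad, 0.1)
print(w_global)   # mutated:   [0. 1.]
```

For plain Python ints like the `a -= b` example above, the two forms are indistinguishable, because ints are immutable and `-=` falls back to rebinding; the divergence only shows up with mutable objects such as NumPy arrays.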


When I explained this it was more related to updating parameters and not the integers

The `copy()` had actually happened before the "for loop" in which the updating happens, and that caused the confusion.

Here is the full code:

```
def update_parameters(params, grads, learning_rate):
    """
    Arguments:
    params -- python dictionary containing your parameters

    Returns:
    parameters -- python dictionary containing your updated parameters
                  parameters["W" + str(l)] = ...
                  parameters["b" + str(l)] = ...
    """
    parameters = params.copy()
    L = len(parameters) // 2  # number of layers in the neural network
```

{moderator edit - solution code removed}

I was able to fix it by deep copying the `params` using `parameters = copy.deepcopy(params)`

Here is the full code:

```
# GRADED FUNCTION: update_parameters
import copy

def update_parameters(params, grads, learning_rate):
    """
    Arguments:
    params -- python dictionary containing your parameters

    Returns:
    parameters -- python dictionary containing your updated parameters
                  parameters["W" + str(l)] = ...
                  parameters["b" + str(l)] = ...
    """
    parameters = copy.deepcopy(params)
    L = len(parameters) // 2  # number of layers in the neural network
```

*{moderator edit - solution code removed}*

Did you pass the grader? Any changes beyond or outside the `### YOUR CODE STARTS HERE` / `### YOUR CODE ENDS HERE` markers can cause grader assessment issues.

That does not work, because `parameters` is a compound object. That creates a new copy of the whole dictionary, but the individual arrays in the dictionary are not duplicated. You need the `deepcopy`, as I see you discovered in your later post on this thread.
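A minimal sketch of why the shallow copy is not enough for a dictionary of NumPy arrays (the key names and values here are illustrative):

```python
import copy
import numpy as np

params = {"W1": np.array([1.0, 2.0])}

shallow = params.copy()       # new dict, but the VALUES are shared arrays
shallow["W1"] -= 1.0          # in-place update leaks back into params
print(params["W1"])           # [0. 1.]  -- the original was mutated

params = {"W1": np.array([1.0, 2.0])}
deep = copy.deepcopy(params)  # the nested arrays are duplicated too
deep["W1"] -= 1.0             # params is untouched this time
print(params["W1"])           # [1. 2.]
```

`dict.copy()` only duplicates the outer mapping of keys to references; `copy.deepcopy` recursively duplicates the referenced objects as well, which is what breaks the connection to the grader's global data.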

Yes, so will this change make it to the assignment notebook?

Also thanks for the clarification, I didn't know the reason why `deepcopy()` works.

That's a good point: the template code is misleading. I will file a bug about this and hope that the course staff will act on it. It should be a simple fix, other than that they also have to add the "import" for the copy package.