I'm confused about Exercise 9 - L_model_backward. I think I might understand how to get current_cache (though I'm not sure, because none of my code is working at all), but I don't understand the next line:
dA_prev_temp, dW_temp, db_temp = ...
I'm pretty sure I'm supposed to use the linear_backward function here, but it needs dZ as input, which I don't have. Am I supposed to nest a sigmoid_backward call inside this? That's the only place I remember calculating dZ before, since in Exercise 7, where we used linear_backward, we were given dZ. Or am I way off base here? If so, can someone please point me in the right direction?
Hi @parrotox, for the backprop step of the network you need to differentiate back through the non-linear step and the linear step of each layer. In the notebook there is a function, linear_activation_backward (which uses linear_backward), whose task is to "Implement the backward propagation for the LINEAR->ACTIVATION layer".
Note that the linear_backward function from Exercise 7 "Implement[s] the linear portion of backward propagation for a single layer (layer l)".
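For reference, the overall shape of that function is roughly the following (a minimal sketch, not the graded solution; it assumes the sigmoid_backward, relu_backward and linear_backward helpers from dnn_utils.py, and that each cache is the (linear_cache, activation_cache) tuple stored during forward propagation):

from dnn_utils import sigmoid_backward, relu_backward  # as imported in the notebook

def linear_activation_backward(dA, cache, activation):
    linear_cache, activation_cache = cache
    if activation == "relu":
        dZ = relu_backward(dA, activation_cache)      # undo the ReLU non-linearity
    elif activation == "sigmoid":
        dZ = sigmoid_backward(dA, activation_cache)   # undo the sigmoid non-linearity
    dA_prev, dW, db = linear_backward(dZ, linear_cache)  # then the linear step
    return dA_prev, dW, db

So yes, sigmoid_backward (or relu_backward) is what produces the dZ that linear_backward needs.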
Hope this can help you
YES! Thank you, I got it now
Wonderful @parrotox, happy I could help you out!!
Hey, I'm having trouble with current_cache; I don't know how to set it. Can someone help me, please? I got this error:
TypeError                                 Traceback (most recent call last)
in
      1 t_AL, t_Y_assess, t_caches = L_model_backward_test_case()
----> 2 grads = L_model_backward(t_AL, t_Y_assess, t_caches)
      3
      4 print("dA0 = " + str(grads['dA0']))
      5 print("dA1 = " + str(grads['dA1']))

in L_model_backward(AL, Y, caches)
     41     current_cache = caches
     42
---> 43     dA_prev_temp, dW_temp, db_temp = linear_activation_backward(dAL, current_cache, activation = "sigmoid")
     44     grads["dA" + str(L-1)] = dA_prev_temp
     45     grads["dW" + str(L)] = dW_temp

in linear_activation_backward(dA, cache, activation)
     33     # dA_prev, dW, db = ...
     34     # YOUR CODE STARTS HERE
---> 35     dZ = sigmoid_backward(dA, activation_cache)
     36     dA_prev, dW, db = linear_backward(dZ, linear_cache)
     37

~/work/release/W4A1/dnn_utils.py in sigmoid_backward(dA, cache)
     74     Z = cache
     75
---> 76     s = 1/(1+np.exp(-Z))
     77     dZ = dA * s * (1-s)
     78

TypeError: bad operand type for unary -: 'tuple'
Hi @paolaruedad, in the linear_activation_backward call where you are getting the error, you are using the sigmoid function. Think about which layers use the sigmoid, so you can decide what the right content for current_cache is at that step.
If you have a look at the for loop below it, where ReLU is used, that may help you out too.
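A quick way to see what the traceback is telling you: current_cache = caches hands the whole list of caches to linear_activation_backward, so activation_cache ends up being a tuple instead of the array Z, and np.exp(-Z) fails on it. You can confirm this in the notebook with a few prints (a debugging sketch only, not the solution):

print(type(caches), len(caches))   # a list with one cache per layer
print(type(caches[-1]))            # one cache: a (linear_cache, activation_cache) tuple
linear_cache, activation_cache = caches[-1]
print(type(activation_cache))      # should be the numpy array Z, not a tuple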
I had the same problem as @paolaruedad, and I solved it thanks to your advice, @albertovilla.
However, now I have a problem with what I believe is the cache inside the loop. Here's my error message:
IndexError                                Traceback (most recent call last)
in
      1 t_AL, t_Y_assess, t_caches = L_model_backward_test_case()
----> 2 grads = L_model_backward(t_AL, t_Y_assess, t_caches)
      3
      4 print("dA0 = " + str(grads['dA0']))
      5 print("dA1 = " + str(grads['dA1']))

in L_model_backward(AL, Y, caches)
     66     # YOUR CODE STARTS HERE
     67     current_cache = caches[l]
---> 68     dA_prev_temp, dW_temp, db_temp = linear_activation_backward(grads["dA" + str(l + 1)], current_cache, "relu")
     69     grads["dA" + str(l)] = dA_prev_temp + str(l)
     70     grads["dW" + str(l + 1)] = dW_temp + str(l + 1)

in linear_activation_backward(dA, cache, activation)
     21     # dA_prev, dW, db = ...
     22     # YOUR CODE STARTS HERE
---> 23     dZ = relu_backward(dA, activation_cache)
     24     dA_prev, dW, db = linear_backward(dZ, linear_cache)
     25

~/work/release/W4A1/dnn_utils.py in relu_backward(dA, cache)
     54
     55     # When z <= 0, you should set dz to 0 as well.
---> 56     dZ[Z <= 0] = 0
     57
     58     assert (dZ.shape == Z.shape)

IndexError: too many indices for array
I have tried every reasonable combination of cache and cache[y] but can't find the answer. I am out of ideas for solving this problem. Can you help me?
Hi, as the error happens when calling the relu_backward function, I would suggest that you temporarily edit that function (which is defined in the file dnn_utils.py) to print dZ, dZ.shape and Z.shape in that context.
And I say temporarily because you are not expected to modify this file; it is correct as it is, so you only want to do this in order to debug the issue.
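The temporary debug version could look like this (a sketch only; np.shape is used instead of .shape so the print still works even if Z is not an array, and remember to undo the changes afterwards):

import numpy as np

def relu_backward(dA, cache):
    Z = cache
    dZ = np.array(dA, copy=True)
    print("type(Z):", type(Z))                      # temporary: is Z really an array?
    print("dZ:", np.shape(dZ), "Z:", np.shape(Z))   # temporary: the shapes should match
    dZ[Z <= 0] = 0
    assert (dZ.shape == Z.shape)
    return dZ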
I'm sorry, but I'm still not getting it. I've changed a lot of parameters; it's backward propagation, and I don't know what to put in the cache or how to set it. Please, any ideas? It has been a day already and I'm just stuck.
The above is assigning the full list of caches to current_cache, but how many layers do you have with sigmoid activation? Just one, so you have to assign the right index of caches to current_cache. Does that help?
OK, yes, I understand, but I think the problem is how to write that in Python. I'm doing nameofdelist[+ str(L)] for the sigmoid, because it is the third layer, and nameofdelist[+ str(l)] for the two ReLU layers; I assumed it is l because the for loop is doing the iteration. But I still get the error. How am I supposed to pick out just the respective layers in Python?
I think an example could help.
Let's assume there are 3 layers; the caches list would then have the indexes 0, 1, 2. So for the last layer, you should assign caches[2]. Obviously you don't need to hardcode the numbers; you have to use L. Note that the indexes of the list are integers.
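In code, the distinction looks like this (a sketch assuming L = len(caches) and the loop structure from the notebook template; str(...) is only for building dictionary keys like "dW" + str(L), never for indexing the list):

L = len(caches)                  # e.g. 3 layers
current_cache = caches[L - 1]    # integer index: the last (sigmoid) layer's cache
# not caches[str(L)] -- lists are indexed with integers, not strings

for l in reversed(range(L - 1)):
    current_cache = caches[l]    # the l-th (ReLU) layer's cache inside the loop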
@albertovilla I have no idea how to do this.
Why should I change the source code?
@lachainone, your error is taking place in the call to relu_backward.
That function is defined in the file W4A1/dnn_utils.py; in particular, the error you are getting is in the statement dZ[Z <= 0] = 0, but you don't know how dZ relates to your inputs to the function, because your parameters are dA and activation_cache.
In order to understand why the error is happening, I would suggest you open the dnn_utils.py file and add some print statements, so you can backtrace where the error is and then correct the problem in your code.
You can open that file by clicking on the Jupyter logo; you will see a folder release, and from there you can navigate to the file and open it.
Alternatively, you could skip editing this file and try to replicate the problem in your Jupyter notebook, by noticing how dZ is calculated in that function:
dZ = np.array(dA, copy=True)
The full function is:
def relu_backward(dA, cache):
    """
    Implement the backward propagation for a single RELU unit.

    Arguments:
    dA -- post-activation gradient, of any shape
    cache -- 'Z' where we store for computing backward propagation efficiently

    Returns:
    dZ -- Gradient of the cost with respect to Z
    """

    Z = cache
    dZ = np.array(dA, copy=True) # just converting dz to a correct object.

    # When z <= 0, you should set dz to 0 as well.
    dZ[Z <= 0] = 0

    assert (dZ.shape == Z.shape)

    return dZ
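Replicating it in the notebook could then look like this (a hypothetical debugging cell; it assumes current_cache and the dA value you pass to linear_activation_backward are available in scope):

# reproduce relu_backward's first steps by hand, without editing dnn_utils.py
linear_cache, activation_cache = current_cache  # fails here? current_cache has the wrong structure
Z = activation_cache
dZ = np.array(dA, copy=True)
print(np.shape(dZ), np.shape(Z))                # both should be the same 2-D shape
dZ[Z <= 0] = 0                                  # the failing statement from the traceback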
@albertovilla
The issue had nothing to do with the cache.
It’s solved now
@lachainone, how did you solve this problem? I've been stuck with it for 2 days now and have no idea how to solve it.
Did anyone have an error like the above?
I'm having the same error on my side. I think it's something related to the value of dA inside the loop. When I set it to dAL it gives me the same error, but I want it to be a variable that changes each time the loop moves to a new layer, and I don't know how to write it as a variable containing l. Did you solve it?
Hello team,
I'm facing a similar sort of problem while running the code for Exercise 9 (Week 4, Assignment 1). The traceback hits the most recent call and errors on dAL. I am not able to figure out what is wrong with the code in this case. Kindly help. Thanks and regards.
Hi @Rashmi, have a look at your input parameters to the linear_activation_backward function. It requires the gradient dAL which you initialized, not a lookup like grads["dAL"] into grads, which is the full dictionary of gradients.
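For reference, the output-layer call would then look like this (consistent with the working line in the first traceback above; the dAL formula is the cross-entropy gradient given in the notebook, and current_cache must be the last layer's cache):

dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))  # derivative of the cost with respect to AL
current_cache = caches[L - 1]
dA_prev_temp, dW_temp, db_temp = linear_activation_backward(dAL, current_cache, activation="sigmoid")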