Hello: I have been stuck for weeks trying to figure out what I did wrong. I keep getting error messages such as
“dZ = relu_backward…”, “dZ[Z<=0] = 0”
and “too many indices for array”
Thank you in advance for any help.
This is a common place to get stuck. Here is another recent thread with some suggestions about how to approach this.
If the material on that thread doesn’t shed any light, then the next step is for you to show us the full exception trace you are seeing, just as the student did on that other thread. Having more context will allow us to give more specific help.
Thank you for your prompt reply.
My values for dA1, dW2 and db2 are right;
those for dA0, dW1 and db1 are not.
Here is the error message.
IndexError Traceback (most recent call last)
in
1 t_AL, t_Y_assess, t_caches = L_model_backward_test_case()
----> 2 grads = L_model_backward(t_AL, t_Y_assess, t_caches)
3
4 print("dA0 = " + str(grads['dA0']))
5 print("dA1 = " + str(grads['dA1']))
in L_model_backward(AL, Y, caches)
114 if l >= 0:
115 current_cache = caches[l+1]
--> 116 dA_prev_temp, dW_temp, db_temp = linear_activation_backward("dA" + str(l+1), current_cache, activation="relu")
117 grads["dA" + str(l)] = dA_prev_temp
118 grads["dW" + str(l+1)] = dW_temp
in linear_activation_backward(dA, cache, activation)
22 # dA_prev, dW, db = …
23 # YOUR CODE STARTS HERE
---> 24 dZ = relu_backward(dA, activation_cache)
25 dA_prev, dW, db = linear_backward(dZ, linear_cache)
26 # YOUR CODE ENDS HERE
~/work/release/W4A1/dnn_utils.py in relu_backward(dA, cache)
54
55 # When z <= 0, you should set dz to 0 as well.
---> 56 dZ[Z <= 0] = 0
57
58 assert (dZ.shape == Z.shape)
IndexError: too many indices for array
Why do you need that condition “if l >= 0”? Print the value of l and you should see that it will never be less than 0.
I think the problem is exactly with that indexing. Notice that the way you have written the code, you will never access caches[0], right? Remember that indexing in Python is 0-based, so if there are two layers that we call layer 1 (hidden layer) and layer 2 (output layer), then the cache entries are caches[0] and caches[1], right?
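To make the 0-based indexing concrete, here is a minimal sketch of which indices a backward loop over the hidden layers visits. The helper name hidden_layer_indices and the placeholder cache strings are made up for illustration and are not part of the assignment; the only assumption is that the loop walks the hidden layers from last to first.

```python
def hidden_layer_indices(num_layers):
    """Indices into `caches` for the hidden (ReLU) layers, last to first.

    Hypothetical helper for illustration: with layers numbered 1..L,
    layer k's cache sits at caches[k - 1], and the output layer (layer L)
    is handled separately before this loop.
    """
    return list(reversed(range(num_layers - 1)))


# Placeholder caches for a two-layer network (layer 1 hidden, layer 2 output).
caches = ["cache for layer 1", "cache for layer 2"]
L = len(caches)

for l in hidden_layer_indices(L):
    # l is never negative, so no "if l >= 0" guard is needed.
    print(f"l = {l}: layer {l + 1} uses caches[{l}] -> {caches[l]}")
```

With two layers the loop body runs once, with l equal to 0, so indexing with l + 1 would skip caches[0] entirely.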
I understand your point. I don’t need that condition.
The error message is the same with or without it.
And exactly: the cache entries are caches[0] and caches[1].
And does fixing the way you index caches cure the problem?
No, it doesn’t.
Is there any difference in the exception trace you get?
Have you done the “dimensional analysis” on the test case? How many layers are there and what are the shapes? Your error shows that the dA and the cache do not match, so you need to figure out how that happened.
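As a rough illustration of that kind of check, the sketch below reproduces an IndexError from a mismatch between dA and the cached Z. The function relu_backward_sketch is a stand-in written for this example, modeled only on the two lines visible in the trace; the copy step and the shapes are assumptions, and the thread does not say which mismatch the student actually had. One way to get a mismatch is to pass a dictionary key (a string) where the array stored under that key is expected.

```python
import numpy as np


def relu_backward_sketch(dA, Z):
    """Stand-in for the course helper: zero the gradient wherever Z <= 0."""
    dZ = np.array(dA, copy=True)  # assumption: the helper starts from a copy of dA
    dZ[Z <= 0] = 0                # boolean mask has the shape of Z
    assert dZ.shape == Z.shape
    return dZ


def try_relu_backward(dA, Z):
    """Return the output shape, or the name of the exception that was raised."""
    try:
        return relu_backward_sketch(dA, Z).shape
    except (IndexError, ValueError, TypeError) as exc:
        return type(exc).__name__


# Made-up shapes: 3 hidden units, 2 examples.
Z = np.array([[1.0, -2.0], [-0.5, 3.0], [0.0, 4.0]])
grads = {"dA1": np.ones((3, 2))}

print("Z shape:", Z.shape, "dA1 shape:", grads["dA1"].shape)

ok = try_relu_backward(grads["dA1"], Z)  # the array: shapes match
bad = try_relu_backward("dA1", Z)        # the key itself: a 0-d string array

print("array argument  ->", ok)
print("string argument ->", bad)
```

Printing the shape (or type) of dA next to the shape of Z just before the failing call is usually enough to see which of the two is not what you expected.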
Ok. I am looking into it. Thanks
I found my error. Thanks a million!
That’s great news! Onward!