Hi,
For exercise 9 I get the error below… I think it's related to the caches… in the comments it's caches[l] for ReLU and caches[L-1] for sigmoid… did I miss something?
Caches (((array([[ 0.09649747, -1.8634927 ],
[-0.2773882 , -0.35475898],
[-0.08274148, -0.62700068],
[-0.04381817, -0.47721803]]), array([[-1.31386475, 0.88462238, 0.88131804, 1.70957306],
[ 0.05003364, -0.40467741, -0.54535995, -1.54647732],
[ 0.98236743, -1.10106763, -1.18504653, -0.2056499 ]]), array([[ 1.48614836],
[ 0.23671627],
[-1.02378514]])), array([[-0.7129932 , 0.62524497],
[-0.16051336, -0.76883635],
[-0.23003072, 0.74505627]])), ((array([[ 1.97611078, -1.24412333],
[-0.62641691, -0.80376609],
[-2.41908317, -0.92379202]]), array([[-1.02387576, 1.12397796, -0.13191423]]), array([[-1.62328545]])), array([[ 0.64667545, -0.35627076]])))
ValueError Traceback (most recent call last)
in <module>
1 t_AL, t_Y_assess, t_caches = L_model_backward_test_case()
----> 2 grads = L_model_backward(t_AL, t_Y_assess, t_caches)
3
4 print("dA0 = " + str(grads[‘dA0’]))
5 print("dA1 = " + str(grads[‘dA1’]))
in L_model_backward(AL, Y, caches)
41 print("Caches", caches)
42 current_cache = caches[L-1] # sigmoid
---> 43 dA_prev_temp, dW_temp, db_temp = linear_backward(dAL, current_cache)
44 grads["dA" + str(L-1)] = dA_prev_temp
45 grads["dW" + str(L)] = dW_temp
in linear_backward(dZ, cache)
14 db -- Gradient of the cost with respect to b (current layer l), same shape as b
15 """
---> 16 A_prev, W, b = cache
17 m = A_prev.shape[1]
18
ValueError: not enough values to unpack (expected 3, got 2)
It is a mistake to call linear_backward directly from L_model_backward. Please take a more careful look at the instructions and remember how the “call hierarchy” works for the functions here in the “Step by Step” exercise.
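For anyone hitting the same ValueError: each element of caches is a 2-tuple (linear_cache, activation_cache), while linear_backward expects only the 3-tuple linear cache (A_prev, W, b). A minimal, self-contained sketch of why the unpack fails and what the wrapper linear_activation_backward does differently — placeholder strings stand in for the real arrays, so this is illustrative only, not the official solution:

# Each entry of `caches` is a 2-tuple (linear_cache, activation_cache);
# linear_cache itself is the 3-tuple (A_prev, W, b). Placeholder values only.
linear_cache = ("A_prev", "W", "b")
activation_cache = "Z"
current_cache = (linear_cache, activation_cache)

# What the posted code effectively does: linear_backward tries to unpack the 2-tuple.
try:
    A_prev, W, b = current_cache
except ValueError as e:
    print(e)  # not enough values to unpack (expected 3, got 2)

# What linear_activation_backward does internally before calling linear_backward:
linear_cache, activation_cache = current_cache  # the 2-tuple unpacks cleanly here
A_prev, W, b = linear_cache                     # and the 3-tuple linear cache unpacks fine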
Paul,
I think I have the right function now, but I seem to be having a problem getting the right cache values…
Caches (((array([[ 0.09649747, -1.8634927 ],
[-0.2773882 , -0.35475898],
[-0.08274148, -0.62700068],
[-0.04381817, -0.47721803]]), array([[-1.31386475, 0.88462238, 0.88131804, 1.70957306],
[ 0.05003364, -0.40467741, -0.54535995, -1.54647732],
[ 0.98236743, -1.10106763, -1.18504653, -0.2056499 ]]), array([[ 1.48614836],
[ 0.23671627],
[-1.02378514]])), array([[-0.7129932 , 0.62524497],
[-0.16051336, -0.76883635],
[-0.23003072, 0.74505627]])), ((array([[ 1.97611078, -1.24412333],
[-0.62641691, -0.80376609],
[-2.41908317, -0.92379202]]), array([[-1.02387576, 1.12397796, -0.13191423]]), array([[-1.62328545]])), array([[ 0.64667545, -0.35627076]])))
Current ((array([[ 0.09649747, -1.8634927 ],
[-0.2773882 , -0.35475898],
[-0.08274148, -0.62700068],
[-0.04381817, -0.47721803]]), array([[-1.31386475, 0.88462238, 0.88131804, 1.70957306],
[ 0.05003364, -0.40467741, -0.54535995, -1.54647732],
[ 0.98236743, -1.10106763, -1.18504653, -0.2056499 ]]), array([[ 1.48614836],
[ 0.23671627],
[-1.02378514]])), array([[-0.7129932 , 0.62524497],
[-0.16051336, -0.76883635],
[-0.23003072, 0.74505627]]))
IndexError Traceback (most recent call last)
in <module>
1 t_AL, t_Y_assess, t_caches = L_model_backward_test_case()
----> 2 grads = L_model_backward(t_AL, t_Y_assess, t_caches)
3
4 print("dA0 = " + str(grads[‘dA0’]))
5 print("dA1 = " + str(grads[‘dA1’]))
in L_model_backward(AL, Y, caches)
61 current_cache = caches[l] #relu
62 print("Current", current_cache)
---> 63 dA_prev_temp, dW_temp, db_temp = linear_activation_backward(dAL, current_cache, "relu")
64 grads["dA" + str(l)] = dA_prev_temp
65 grads["dW" + str(l + 1)] = dW_temp
in linear_activation_backward(dA, cache, activation)
22 # dA_prev, dW, db = …
23 # YOUR CODE STARTS HERE
---> 24 dZ = relu_backward(dA, activation_cache)
25 dA_prev, dW, db = linear_backward(dZ, linear_cache)
26 # YOUR CODE ENDS HERE
~/work/release/W4A1/dnn_utils.py in relu_backward(dA, cache)
54
55 # When z <= 0, you should set dz to 0 as well.
---> 56 dZ[Z <= 0] = 0
57
58 assert (dZ.shape == Z.shape)
IndexError: boolean index did not match indexed array along dimension 0; dimension is 1 but corresponding boolean dimension is 3
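The IndexError itself can be reproduced in isolation. The shapes below are inferred from the error message and the printed caches (the first layer's W is 3x4, so its Z is 3x2, while the array being masked has only 1 row); the values are made-up placeholders, not grader data:

import numpy as np

dA = np.random.randn(1, 2)  # gradient shaped like the output layer (1 row)
Z = np.random.randn(3, 2)   # hidden layer's Z from the activation cache (3 rows)

dZ = np.array(dA, copy=True)
try:
    dZ[Z <= 0] = 0          # boolean mask is (3, 2) but dZ is only (1, 2)
except IndexError as e:
    print(e)                # boolean index did not match indexed array along dimension 0 ...

In other words, the gradient handed to relu_backward does not match that layer's Z shape.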
TMosh, April 17, 2024, 8:31pm (post 4)
Is this the correct activation?
I believe so… relu inside the for loop… the sigmoid activation seems to have worked prior to the for loop
thanks,
rick
TMosh, April 17, 2024, 8:49pm (post 6)
If you’re inside the for-loop, then you shouldn’t be hard-coding the variable “dAL” for the gradients you pass to linear_activation_backward().
this is the call I'm using… inside the loop… the activation is hard-coded to "relu"
{mentor edit: code removed}
TMosh, April 17, 2024, 9:02pm (post 8)
The gradients you’re using are the problem.
TMosh, April 17, 2024, 9:02pm (post 9)
Every layer in that for-loop has different gradients. So you can’t use dAL.
Hint:
But you can use "dA" and then append a number that is based on the loop counter.
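A minimal sketch of what that looks like for the hidden-layer loop, assuming the helper linear_activation_backward(dA, cache, activation) from this assignment exists; the wrapper function and variable names here are illustrative, not the official solution:

# Illustrative only: the loop over the hidden (relu) layers, l = L-2, ..., 0.
def hidden_layer_loop(grads, caches, L):
    for l in reversed(range(L - 1)):
        current_cache = caches[l]  # cache of layer l+1 (relu)
        # Use the gradient stored by the layer above ("dA" + str(l + 1)),
        # not a hard-coded dAL from the output layer.
        dA_prev_temp, dW_temp, db_temp = linear_activation_backward(
            grads["dA" + str(l + 1)], current_cache, "relu")
        grads["dA" + str(l)] = dA_prev_temp
        grads["dW" + str(l + 1)] = dW_temp
        grads["db" + str(l + 1)] = db_temp
    return grads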
Exactly. Remember that back propagation is the mirror image of forward propagation. In forward prop, at every layer the input is A^{[l-1]} and the output is A^{[l]}, right? So you have to mirror that in back prop.
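In the notation used above, the per-layer mirroring is:

Forward prop, layer l:  Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]},  A^{[l]} = g^{[l]}(Z^{[l]})   (input A^{[l-1]}, output A^{[l]})
Back prop, layer l:     input dA^{[l]}  ->  outputs dA^{[l-1]}, dW^{[l]}, db^{[l]}

so inside the loop, the step for layer l+1 consumes grads["dA" + str(l + 1)] and produces grads["dA" + str(l)].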