Week 4 Step by Step Exercise 9 - L_model_backward

Hi,

For Exercise 9 I get the error below… I think it's related to the caches. In the comments it's caches[l] for relu and caches[L-1] for sigmoid… did I miss something?

Caches (((array([[ 0.09649747, -1.8634927 ],
[-0.2773882 , -0.35475898],
[-0.08274148, -0.62700068],
[-0.04381817, -0.47721803]]), array([[-1.31386475, 0.88462238, 0.88131804, 1.70957306],
[ 0.05003364, -0.40467741, -0.54535995, -1.54647732],
[ 0.98236743, -1.10106763, -1.18504653, -0.2056499 ]]), array([[ 1.48614836],
[ 0.23671627],
[-1.02378514]])), array([[-0.7129932 , 0.62524497],
[-0.16051336, -0.76883635],
[-0.23003072, 0.74505627]])), ((array([[ 1.97611078, -1.24412333],
[-0.62641691, -0.80376609],
[-2.41908317, -0.92379202]]), array([[-1.02387576, 1.12397796, -0.13191423]]), array([[-1.62328545]])), array([[ 0.64667545, -0.35627076]])))

ValueError Traceback (most recent call last)
in
1 t_AL, t_Y_assess, t_caches = L_model_backward_test_case()
----> 2 grads = L_model_backward(t_AL, t_Y_assess, t_caches)
3
4 print("dA0 = " + str(grads[‘dA0’]))
5 print("dA1 = " + str(grads[‘dA1’]))

in L_model_backward(AL, Y, caches)
41 print("Caches", caches)
42 current_cache = caches[L-1] # sigmoid
---> 43 dA_prev_temp, dW_temp, db_temp = linear_backward(dAL, current_cache)
44 grads["dA" + str(L-1)] = dA_prev_temp
45 grads["dW" + str(L)] = dW_temp

in linear_backward(dZ, cache)
14 db -- Gradient of the cost with respect to b (current layer l), same shape as b
15 """
---> 16 A_prev, W, b = cache
17 m = A_prev.shape[1]
18

ValueError: not enough values to unpack (expected 3, got 2)

It is a mistake to call linear_backward directly from L_model_backward. Please take a more careful look at the instructions and remember how the “call hierarchy” works for the functions here in the “Step by Step” exercise.
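
For reference, each entry of caches is a two-element tuple (linear_cache, activation_cache); only the linear part holds the three values (A_prev, W, b) that linear_backward unpacks. A minimal sketch of the layout (the shapes here are made up, just for illustration):

import numpy as np

# Hypothetical shapes, only to illustrate the cache layout (not the test case's real values).
A_prev = np.random.randn(3, 2)
W = np.random.randn(1, 3)
b = np.random.randn(1, 1)
Z = np.dot(W, A_prev) + b

linear_cache = (A_prev, W, b)                      # the 3-tuple that linear_backward unpacks
activation_cache = Z                               # what sigmoid_backward / relu_backward need
current_cache = (linear_cache, activation_cache)   # what each entry of `caches` actually holds

# A_prev, W, b = current_cache    # ValueError: not enough values to unpack (expected 3, got 2)
A_prev, W, b = current_cache[0]   # only the linear part has three entries

So from L_model_backward the call goes through linear_activation_backward (with "sigmoid" for the last layer), and that function is the one that splits the cache and passes the linear part down to linear_backward.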

Paul,
I think I have the right function now, but I seem to be having a problem getting the right cache values…

Caches (((array([[ 0.09649747, -1.8634927 ],
[-0.2773882 , -0.35475898],
[-0.08274148, -0.62700068],
[-0.04381817, -0.47721803]]), array([[-1.31386475, 0.88462238, 0.88131804, 1.70957306],
[ 0.05003364, -0.40467741, -0.54535995, -1.54647732],
[ 0.98236743, -1.10106763, -1.18504653, -0.2056499 ]]), array([[ 1.48614836],
[ 0.23671627],
[-1.02378514]])), array([[-0.7129932 , 0.62524497],
[-0.16051336, -0.76883635],
[-0.23003072, 0.74505627]])), ((array([[ 1.97611078, -1.24412333],
[-0.62641691, -0.80376609],
[-2.41908317, -0.92379202]]), array([[-1.02387576, 1.12397796, -0.13191423]]), array([[-1.62328545]])), array([[ 0.64667545, -0.35627076]])))
Current ((array([[ 0.09649747, -1.8634927 ],
[-0.2773882 , -0.35475898],
[-0.08274148, -0.62700068],
[-0.04381817, -0.47721803]]), array([[-1.31386475, 0.88462238, 0.88131804, 1.70957306],
[ 0.05003364, -0.40467741, -0.54535995, -1.54647732],
[ 0.98236743, -1.10106763, -1.18504653, -0.2056499 ]]), array([[ 1.48614836],
[ 0.23671627],
[-1.02378514]])), array([[-0.7129932 , 0.62524497],
[-0.16051336, -0.76883635],
[-0.23003072, 0.74505627]]))

IndexError Traceback (most recent call last)
in
1 t_AL, t_Y_assess, t_caches = L_model_backward_test_case()
----> 2 grads = L_model_backward(t_AL, t_Y_assess, t_caches)
3
4 print("dA0 = " + str(grads[‘dA0’]))
5 print("dA1 = " + str(grads[‘dA1’]))

in L_model_backward(AL, Y, caches)
61 current_cache = caches[l] #relu
62 print("Current", current_cache)
---> 63 dA_prev_temp, dW_temp, db_temp = linear_activation_backward(dAL, current_cache, "relu")
64 grads["dA" + str(l)] = dA_prev_temp
65 grads["dW" + str(l + 1)] = dW_temp

in linear_activation_backward(dA, cache, activation)
22 # dA_prev, dW, db = …
23 # YOUR CODE STARTS HERE
---> 24 dZ = relu_backward(dA, activation_cache)
25 dA_prev, dW, db = linear_backward(dZ, linear_cache)
26 # YOUR CODE ENDS HERE

~/work/release/W4A1/dnn_utils.py in relu_backward(dA, cache)
54
55 # When z <= 0, you should set dz to 0 as well.
---> 56 dZ[Z <= 0] = 0
57
58 assert (dZ.shape == Z.shape)

IndexError: boolean index did not match indexed array along dimension 0; dimension is 1 but corresponding boolean dimension is 3

Is this the correct activation?

I believe so… relu inside the for loop… the sigmoid activation seems to have worked prior to the for loop

thanks,

rick

If you’re inside the for-loop, then you shouldn’t be hard-coding the variable “dAL” for the gradients you pass to linear_activation_backward().

This is the call I'm using… inside the loop… the activation is hard-coded to relu

{mentor edit: code removed}

The gradients you’re using are the problem.

Every layer in that for-loop has different gradients. So you can’t use dAL.

Hint:
But you can use "dA" and then append an index that is based on the loop counter.
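
Concretely, the body of the loop ends up along these lines (a sketch following the notebook's indexing, not the complete solution):

# Inside the for-loop over l = L-2, ..., 0 (sketch only):
current_cache = caches[l]                        # relu layer l+1
dA_prev_temp, dW_temp, db_temp = linear_activation_backward(
    grads["dA" + str(l + 1)],                    # stored by the sigmoid step or the previous iteration, not dAL
    current_cache,
    "relu")
grads["dA" + str(l)] = dA_prev_temp

The sigmoid step before the loop stores grads["dA" + str(L-1)], which is exactly what the first loop iteration (l = L-2) picks up.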

Exactly. Remember that back propagation is the mirror image of forward propagation. In forward prop, at every layer the input is A^{[l-1]} and the output is A^{[l]}, right? So you have to mirror that in back prop.
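
For reference, the per-layer formulas from the assignment make the mirroring explicit: forward prop at layer l consumes A^{[l-1]} and produces A^{[l]}, while backward prop at layer l consumes dA^{[l]} and produces dA^{[l-1]} along with the parameter gradients:

Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}, \quad A^{[l]} = g^{[l]}(Z^{[l]})

dZ^{[l]} = dA^{[l]} * g^{[l]'}(Z^{[l]}), \quad dW^{[l]} = \frac{1}{m} dZ^{[l]} A^{[l-1]T}, \quad db^{[l]} = \frac{1}{m} \sum_{i=1}^{m} dZ^{[l](i)}, \quad dA^{[l-1]} = W^{[l]T} dZ^{[l]}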

I got it! Thanks!