I am currently working on Week 4, Programming Assignment 1, and I came across the terms "linear cache" and "activation cache", which I am not familiar with. As far as I can see, the lectures only mention Z, W and b as cached values, and those are enough to do back propagation correctly.
So my question is: what are these terms, and how are they calculated? Derivations showing where they are used in back propagation would also be helpful.
We have three terms, cache, linear cache, and activation cache.
cache – a Python tuple containing “linear_cache” and “activation_cache”.
linear_cache – a Python tuple containing “A^{[l-1]}”, “W^{[l]}” and “b^{[l]}”.
activation_cache – if I recall correctly, a Python dictionary containing “A^{[l]}”. Please see the relu function in the dnn_utils.py file.
Also, check the arguments of the back propagation functions.
The “activation cache” contains Z^{[l]}, not A^{[l]}. The other general thing to say here is that “linear cache” and “activation cache” are not industry-standard terminology: they are specific to how this particular notebook has us write the code. During forward propagation, we save the values that we are going to need later for backward propagation, so that we don’t have to compute them twice.
All this was explained in the notebook: you just have to read carefully, including studying all the template code that they gave us. They actually did most of that cache related work for us in the template code: e.g. in the linear_activation_backward template code notice how they did the work for us of parsing the layer cache entry into the linear and activation cache variables. We just have to pay attention and understand what we are seeing there.
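To make the idea concrete, here is a minimal sketch of one ReLU layer that saves a cache on the way forward and unpacks it on the way back. It is not the notebook's code: the functions below are simplified stand-ins, and the names layer_forward and layer_backward are hypothetical. It only assumes the cache layout described above, i.e. linear cache = (A_prev, W, b) and activation cache = Z.

```python
import numpy as np

def linear_forward(A_prev, W, b):
    # Simplified stand-in: Z = W A_prev + b, plus the inputs saved for later.
    Z = W @ A_prev + b
    linear_cache = (A_prev, W, b)
    return Z, linear_cache

def relu(Z):
    # Simplified stand-in: returns the activation AND Z as the activation cache.
    A = np.maximum(0, Z)
    activation_cache = Z
    return A, activation_cache

def layer_forward(A_prev, W, b):
    # Hypothetical name. Forward pass for one layer; bundles both caches.
    Z, linear_cache = linear_forward(A_prev, W, b)
    A, activation_cache = relu(Z)
    cache = (linear_cache, activation_cache)
    return A, cache

def layer_backward(dA, cache):
    # Hypothetical name. Backward pass that reuses the saved values.
    linear_cache, activation_cache = cache
    A_prev, W, b = linear_cache
    Z = activation_cache
    m = A_prev.shape[1]                          # number of examples

    dZ = dA * (Z > 0)                            # dZ = dA * g'(Z) for ReLU
    dW = dZ @ A_prev.T / m                       # dW = (1/m) dZ A_prev^T
    db = np.sum(dZ, axis=1, keepdims=True) / m   # db = (1/m) sum over examples
    dA_prev = W.T @ dZ                           # dA_prev = W^T dZ
    return dA_prev, dW, db
```

Notice where each cached value is used: Z is needed for the derivative of the activation, A_prev for dW, and W for dA_prev. In this sketch b is stored but never read in the backward pass.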
Hi, I have the same problem: I don’t know what to put in activation_cache and linear_cache. I tried (W, A_prev, b) for the linear cache and Z for the activation cache, but the tests still do not pass. I get the following output:
With sigmoid: A = (array([[0.96890023, 0.11013289]]), array([[ 3.43896131, -2.08938436]]))
With ReLU: A = (array([[3.43896131, 0. ]]), array([[ 3.43896131, -2.08938436]]))
Error: Wrong shape with sigmoid activation for variable 0.
Error: Wrong shape with sigmoid activation for variable 0.
Error: Wrong shape with sigmoid activation for variable 1.
Error: Wrong shape with sigmoid activation for variable 1.
Error: Wrong output with sigmoid activation for variable 0.
Error: Wrong output with sigmoid activation for variable 0.
Error: Wrong output with sigmoid activation for variable 1.
Error: Wrong output with sigmoid activation for variable 1.
Error: Wrong shape with relu activation for variable 0.
Error: Wrong shape with relu activation for variable 0.
Error: Wrong shape with relu activation for variable 1.
Error: Wrong shape with relu activation for variable 1.
Error: Wrong output with relu activation for variable 0.
Error: Wrong output with relu activation for variable 0.
Error: Wrong output with relu activation for variable 1.
Error: Wrong output with relu activation for variable 1.
0 Tests passed
6 Tests failed
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[25], line 9
6 t_A, t_linear_activation_cache = linear_activation_forward(t_A_prev, t_W, t_b, activation = "relu")
7 print("With ReLU: A = " + str(t_A))
----> 9 linear_activation_forward_test(linear_activation_forward)
File ~/work/public_tests.py:244, in linear_activation_forward_test(target)
204 expected_output_relu = (expected_A_relu, expected_cache)
205 test_cases = [
206 {
207 "name":"datatype_check",
(...)
241 }
242 ]
--> 244 multiple_test(test_cases, target)
File ~/work/test_utils.py:142, in multiple_test(test_cases, target)
140 print('\033[92m', success," Tests passed")
141 print('\033[91m', len(test_cases) - success, " Tests failed")
--> 142 raise AssertionError("Not all tests were passed for {}. Check your equations and avoid using global variables inside the function.".format(target.__name__))
AssertionError: Not all tests were passed for linear_activation_forward. Check your equations and avoid using global variables inside the function.
I was finally able to solve it. It is all in the instructions, although they are not very clear. My problem was that I tried to implement the linear and activation steps myself with numpy instead of calling the sigmoid, relu and linear_forward functions that are already provided. If you reimplement them with numpy, the tests will not pass.
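One more thing worth checking, going by the output posted above: A was printed as a pair of arrays rather than a single array. That is what you would see if the whole return value of a helper that returns (A, cache) were stored in A. The snippet below is only a sketch of that situation; the sigmoid here is a stand-in written for illustration and is assumed, not confirmed, to mirror the return shape of the provided helper.

```python
import numpy as np

def sigmoid(Z):
    # Stand-in written for illustration: returns the activation and Z as cache.
    A = 1 / (1 + np.exp(-Z))
    return A, Z

Z = np.array([[3.43896131, -2.08938436]])

# Storing the whole return value gives a tuple (A, Z), not an array.
wrong_A = sigmoid(Z)
print(type(wrong_A).__name__)   # tuple

# Unpacking gives the activation and the activation cache separately.
A, activation_cache = sigmoid(Z)
print(A.shape)                  # (1, 2)
```

If the first element of your result prints as a tuple, the shape checks have nothing array-like to compare against, which would be consistent with the "Wrong shape" messages.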