Hi there! I was working on the programming assignment for Course 1, Week 4, and I came across the activation functions that are imported from the dnn utility library. I noticed that for the backward activation function we pass in the activation cache, i.e.
dZ = sigmoid_backward(dA, activation_cache), and I was confused by this, since dZ = dA * g’(Z) takes the linear output Z as its input. Can somebody help me out? Thank you!
That is what the activation cache contains, right? It is just Z. The entire cache entry for one of the layers looks like this:
((A, W, b), Z)
So it is a tuple with two elements. The first element is the “linear cache”, which is itself a 3-tuple. The second element is the “activation cache”, which contains just Z. None of this should be a mystery: you saw the code that created the caches when you wrote the forward propagation logic. If you weren’t paying attention at the time, now would be a good time to go back and have another look at that code. Note that the template code for linear_activation_backward already gives you the logic to extract the linear and activation caches.
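As a minimal sketch (the variable names here are illustrative, not copied from the assignment), the unpacking inside linear_activation_backward looks roughly like this:

```python
import numpy as np

# A dummy cache entry for one layer, shaped like the course's caches:
# ((A_prev, W, b), Z) — the first element is the linear cache,
# the second is the activation cache, which is just Z.
A_prev = np.random.randn(3, 2)
W = np.random.randn(4, 3)
b = np.zeros((4, 1))
Z = W @ A_prev + b

cache = ((A_prev, W, b), Z)

# Unpack the two-element tuple into its two sub-caches.
linear_cache, activation_cache = cache
A_prev_out, W_out, b_out = linear_cache

# The activation cache is exactly the Z computed in forward propagation,
# so sigmoid_backward(dA, activation_cache) really does receive Z.
print(np.allclose(activation_cache, Z))
```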
If you want to examine the source for the backward activation functions, just click “File → Open” and then open the utility source file.
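If you do open the utility source, what you will find is essentially this: sigmoid_backward recomputes s = sigmoid(Z) from the cached Z and then applies dZ = dA * s * (1 - s). Here is a sketch under that assumption (not a verbatim copy of the course file):

```python
import numpy as np

def sigmoid_backward(dA, activation_cache):
    """Compute dZ = dA * g'(Z) for the sigmoid activation.

    The activation_cache is just Z, saved during forward propagation,
    which is exactly why Z is available here.
    """
    Z = activation_cache
    s = 1 / (1 + np.exp(-Z))       # recompute sigmoid(Z)
    dZ = dA * s * (1 - s)          # sigmoid'(Z) = s * (1 - s)
    return dZ

# At Z = 0, sigmoid'(0) = 0.25, so with dA = 1 we expect dZ = 0.25.
dZ = sigmoid_backward(np.array([[1.0]]), np.array([[0.0]]))
print(dZ)
```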