Week 4: Why does the activation function return Z?

Hello. I’ve completed the course already and tried to implement everything from scratch on my own to understand it better. One thing I’m missing: why does the activation function get z only to return it? If I call sigmoid(z), I already have z. It’s confusing because it all happens under the hood, in the activation function imported from the provided file. What adds to the confusion is that when we call sigmoid, we store z in activation_cache and return it again. Couldn’t we just return z directly from the caller’s scope? Am I missing something?

The point of the caches is that they are used to pass values from the forward propagation phase to the backward propagation phase. Some values computed during forward propagation are not the actual outputs you care about, but you still need them later to compute the gradients, and the Z value is one of them. You could have simply called the return value Z, I suppose, but they use the name activation_cache to emphasize the role it plays, and also by analogy with the “linear cache,” which has 3 entries and is (thus) returned as a python tuple.
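To make the idea concrete, here is a minimal sketch of what such helpers might look like, assuming the course’s convention of returning the activation together with its cache (the exact provided implementations may differ). The forward function caches Z; the backward function reuses it to compute dZ:

```python
import numpy as np

def sigmoid(Z):
    """Forward step: compute activation A and cache Z for the backward pass."""
    A = 1 / (1 + np.exp(-Z))
    activation_cache = Z  # stored now so backprop doesn't have to recompute it
    return A, activation_cache

def sigmoid_backward(dA, activation_cache):
    """Backward step: use the cached Z to compute dZ = dA * sigma'(Z)."""
    Z = activation_cache
    s = 1 / (1 + np.exp(-Z))
    dZ = dA * s * (1 - s)  # derivative of sigmoid is s * (1 - s)
    return dZ

# The caller keeps the cache and hands it back during backprop:
A, cache = sigmoid(np.array([[0.0]]))        # A == 0.5 at Z = 0
dZ = sigmoid_backward(np.array([[1.0]]), cache)  # sigma'(0) == 0.25
```

The caching looks redundant for sigmoid, where the cache is just Z itself, but the same pattern generalizes: the linear cache holds (A_prev, W, b) as a tuple, so every forward helper uniformly returns (output, cache) and backprop can consume the caches without knowing which layer produced them.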