I don’t understand what is stored in the activation cache.
The linear cache stores A, W, and b, so what is left to be stored in the activation_cache?
If you know where in the lectures this is explained, please tell me.
Does working through L_model_backward give you a better picture?
This goes back to the computation graph lecture, which shows how to compute derivatives.
Here’s the link
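To make that concrete, here is a rough sketch of what one layer’s backward step could look like (my own illustration, not the graded code; the function and argument names only mimic the notebook’s naming). The point is that the activation cache holds Z, which is exactly what you need to push the gradient through the nonlinearity, while the linear cache holds A_prev, W, and b for the linear part:

```python
import numpy as np

def linear_activation_backward_sketch(dA, cache, activation="relu"):
    """Rough sketch of one layer's backward pass (illustration only)."""
    # cache was built in the forward pass as (linear_cache, activation_cache):
    #   linear_cache     = (A_prev, W, b)  -> used for dA_prev, dW, db
    #   activation_cache = Z               -> used to differentiate the activation
    linear_cache, activation_cache = cache
    A_prev, W, b = linear_cache            # b is cached too, though not needed for these gradients
    Z = activation_cache

    # Backward through the activation: dZ = dA * g'(Z)
    if activation == "relu":
        dZ = dA * (Z > 0)                  # relu'(Z) is 1 where Z > 0, else 0
    else:                                  # sigmoid
        s = 1 / (1 + np.exp(-Z))
        dZ = dA * s * (1 - s)

    # Backward through the linear step, using the linear cache
    m = A_prev.shape[1]
    dW = dZ @ A_prev.T / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ
    return dA_prev, dW, db
```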
Thanks for the reply.
Not really. Also, the activation cache is not even mentioned in that lecture.
That may just be a name used in this implementation. We create the cache with cache = (linear_cache, activation_cache), and that cache is then used for backpropagation.
One layer consists of a linear operation and an activation, as shown in the figure above.
The linear operation is simple, but the activation can vary, with a choice of activation functions like sigmoid, relu, tanh, and so on. To clearly separate roles and responsibilities in the implementation, this assignment stores the variables used by the linear operation (A, W, and b) in the linear cache, and the variable used by the activation, Z, in the activation cache.
Eventually, the linear cache and the activation cache are merged to create the cache for that layer. That is the whole flow for creating a cache.
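A minimal sketch of that flow, assuming the usual notebook naming (an illustration, not the graded solution):

```python
import numpy as np

def linear_activation_forward_sketch(A_prev, W, b, activation="relu"):
    """Rough sketch of one layer's forward pass and the cache it produces."""
    # Linear step: store everything the backward pass will need from it.
    Z = W @ A_prev + b
    linear_cache = (A_prev, W, b)

    # Activation step: the only extra value backprop needs here is Z itself.
    if activation == "relu":
        A = np.maximum(0, Z)
    else:  # sigmoid
        A = 1 / (1 + np.exp(-Z))
    activation_cache = Z

    # Merge the two caches into the single cache stored for this layer.
    cache = (linear_cache, activation_cache)
    return A, cache
```

So the activation cache is just Z, kept so that the backward pass can compute g'(Z) without redoing the forward computation.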
Hope this helps.
It helps, thank you so much!
Hi,
It might be helpful to include this explanation in the notebook for the programming assignment “Building your Deep Neural Network: Step by Step”.
It took me a while to figure this out because it isn’t explained there.