C5W1 A2E3 - Cache vs. "parameters", X, Y

Hi everyone,

Looking at the predefined functions used by the optimize function in A1E3, I am wondering: what is the purpose, i.e. the content, of the cache compared to the other arguments given to the function (“parameters”, X, Y)?

Apart from a, don't the other arguments contain all the values necessary to perform backprop?

Thanks in advance for helping me out here! I haven't done any coding for quite a while, so I'm probably overlooking something super obvious.

Best,
George

Hi, George.

I think you’re referring to A2 (Dinosaurus Island), not A1 (RNN Step by Step), right? There is no optimize in A1.

The way this works is analogous to how the caches worked way back in Course 1 Week 4 A1 and A2. The point is that the back propagation formulas depend on some of the intermediate values generated during forward propagation (e.g. the A and Z values), not just on the parameters themselves. In a technical sense, you really only need the input data and the parameters, but then you would have to “rerun” parts of forward propagation while you are doing back prop, which means extra code and extra compute cycles. So it’s cleaner and easier to hold (“cache”) those intermediate values as you go through forward propagation and then pass them as inputs to back prop. It costs a bit of memory, but saves compute time and code complexity.
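
To make that trade-off concrete, here is a minimal sketch. It is not the assignment's code: the single tanh unit and the function names `forward`, `backward_with_cache`, and `backward_without_cache` are made up purely for illustration. Both backward versions return the same gradients; the second one just has to redo the forward computation first.

```python
import math

# Hypothetical toy example (not the assignment's functions): one tanh unit.

def forward(x, w, b):
    """Compute a = tanh(w*x + b) and return it with a cache of intermediates."""
    z = w * x + b
    a = math.tanh(z)
    cache = (x, a)  # values the backward pass will need later
    return a, cache

def backward_with_cache(da, w, cache):
    """Backprop using the values stored during the forward pass."""
    x, a = cache
    dz = da * (1 - a ** 2)  # derivative of tanh(z) is 1 - tanh(z)^2 = 1 - a^2
    return {"dw": dz * x, "db": dz, "dx": dz * w}

def backward_without_cache(da, x, w, b):
    """Same gradients from only the input and parameters, at the cost of rerunning forward."""
    a, _ = forward(x, w, b)  # recomputation that the cache would have avoided
    dz = da * (1 - a ** 2)
    return {"dw": dz * x, "db": dz, "dx": dz * w}

x, w, b = 0.5, 2.0, -1.0
a, cache = forward(x, w, b)
grads_cached = backward_with_cache(1.0, w, cache)
grads_recomputed = backward_without_cache(1.0, x, w, b)
```

With one unit the recomputation is trivial, but in an RNN unrolled over many time steps it would mean rerunning the whole forward pass, which is why the hidden states are carried along in the cache instead.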

Hi Paul,

Thanks a lot! Memories of C1 are coming back.

And yes, I was referring to A2 - corrected the title.

Best,
George