C5W1 A2E3 - Cache vs. "parameters", X, Y

AIGeorge · April 17, 2022, 12:56pm

Hi everyone,

Looking at the predefined functions for the optimize function in A1E3, I am wondering what the purpose, i.e. the content, of the cache is compared to the other parameters given to the function (“parameters”, X, Y).

Apart from a, don’t the other parameters contain all the values necessary to perform backprop?

Thanks in advance for helping me out here! I have not been coding for quite a while, so I am probably overlooking something super obvious.

Best,
George

paulinpaloalto · April 17, 2022, 3:41pm

Hi, George.

I think you’re referring to A2 (Dinosaurus Island), not A1 (RNN Step by Step), right? There is no optimize in A1.

The way this works is analogous to how the caches worked back in Course 1 Week 4 A1 and A2: the back propagation formulas depend on some of the intermediate values generated during forward propagation (e.g. the A and Z values), not just on the parameters themselves. In a technical sense, you really only need the input data and the parameters, but then you would have to “rerun” parts of forward propagation while doing back prop, which means extra code and extra compute cycles. So it’s cleaner and easier to hold (“cache”) those intermediate values as you go through forward propagation and then pass them as inputs to back prop. It costs a bit of memory, but saves compute time and code complexity.
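Here is a minimal sketch of that trade-off. It is not the assignment's code: it uses a single scalar tanh unit, and the names forward, backward_with_cache and backward_without_cache are hypothetical, chosen only for illustration. Both backward variants give the same gradients; the second one just has to redo the forward computation to get the activation back.

```python
import math

def forward(x, w, b):
    """Forward pass of one tanh unit. Returns the activation and a cache."""
    z = w * x + b
    a = math.tanh(z)
    cache = (x, a)  # intermediate values that back prop will need later
    return a, cache

def backward_with_cache(da, cache):
    """Back prop using the values stored during the forward pass."""
    x, a = cache
    dz = da * (1.0 - a * a)  # derivative of tanh(z) is 1 - tanh(z)^2
    return {"dw": dz * x, "db": dz}

def backward_without_cache(da, x, w, b):
    """Same gradients, but the activation has to be recomputed first."""
    a = math.tanh(w * x + b)  # repeats work already done in forward()
    dz = da * (1.0 - a * a)
    return {"dw": dz * x, "db": dz}

# Illustrative values only (assumptions, not taken from the assignment).
a, cache = forward(x=0.5, w=2.0, b=-0.3)
grads_cached = backward_with_cache(1.0, cache)
grads_recomputed = backward_without_cache(1.0, 0.5, 2.0, -0.3)
```

In a deep network or an RNN unrolled over many time steps, the recomputation in the second variant would repeat at every layer or step, which is the cost the cache avoids.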

AIGeorge · April 18, 2022, 8:08am

Hi Paul,

Thanks a lot! Memories of C1 are coming back.

And yes, I was referring to A2 - corrected the title.

Best,
George
