Course 1 Week 4: what is the activation cache & what values does it contain?

I am confused about the “activation cache” part. The cache contains two pieces, linear_cache and activation_cache. linear_cache is pretty straightforward: it’s basically (A_prev, W, b). But what does activation_cache contain?

In Exercise 4 of the first assignment, where you complete the linear_activation_forward() function, the line before the return statement packs the two separate caches into one bigger cache: cache = (linear_cache, activation_cache). Formally, this is a tuple with two elements, each of which can itself hold multiple values.
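To make that structure concrete, here is a minimal runnable sketch of the packing step (the shapes are illustrative; the names mirror the assignment):

```python
import numpy as np

# Illustrative shapes: previous layer has 3 units, this layer has 2, batch of 4
A_prev = np.random.randn(3, 4)
W = np.random.randn(2, 3)
b = np.zeros((2, 1))

Z = W @ A_prev + b                         # linear part of the forward step
linear_cache = (A_prev, W, b)              # what linear_backward() will need
activation_cache = Z                       # what the activation's backward step will need (see below)
cache = (linear_cache, activation_cache)   # a tuple with two elements, as in Exercise 4
```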

Which helper fills activation_cache depends on the activation argument of linear_activation_forward() (“relu” or “sigmoid”). In both cases the helper caches the pre-activation value Z = W·A_prev + b, because that is exactly what the corresponding backward function needs. The mathematics behind this is explained in the prelude to Exercise 4. These values are “cached” so that they can be used to evaluate the gradients during the backward propagation step.
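As a rough sketch of how the pieces fit together (the sigmoid() and relu() helpers below are my own minimal stand-ins for the ones the assignment’s utility file provides; like those, they return both the activation A and their own cache):

```python
import numpy as np

def sigmoid(Z):
    # Stand-in for the assignment's helper: returns A and caches Z
    A = 1 / (1 + np.exp(-Z))
    return A, Z

def relu(Z):
    # Stand-in for the assignment's helper: returns A and caches Z
    A = np.maximum(0, Z)
    return A, Z

def linear_activation_forward(A_prev, W, b, activation):
    Z = W @ A_prev + b                 # linear forward step
    linear_cache = (A_prev, W, b)
    if activation == "sigmoid":
        A, activation_cache = sigmoid(Z)
    else:                              # "relu"
        A, activation_cache = relu(Z)
    cache = (linear_cache, activation_cache)
    return A, cache

# Example usage with illustrative shapes
A, cache = linear_activation_forward(
    np.random.randn(3, 4), np.random.randn(2, 3), np.zeros((2, 1)), "relu")
```

Roughly speaking, the backward pass is why Z is the value worth caching: relu_backward() uses the cached Z to zero out dZ wherever Z <= 0, and sigmoid_backward() recomputes s = sigmoid(Z) to form dZ = dA * s * (1 - s).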

I hope that this helps! @kenb