Hi @Fatma_Shaban ,
At the start of the linear_activation_backward() function, there are comment lines explaining the input arguments for this function:
Arguments:
dA – post-activation gradient for current layer l
cache – tuple of values (linear_cache, activation_cache) we store for computing backward propagation efficiently
activation – the activation to be used in this layer, stored as a text string: “sigmoid” or “relu”
As you can see from the ‘cache’ argument, it contains the activation_cache, which is needed to evaluate the backward formula for the activation. The contents of this cache are saved during the forward pass through the network.
Note that in the linear_activation_backward() function, we have both linear_cache and activation_cache. You have to pass the correct one when calling the relu_backward(), sigmoid_backward(), and linear_backward() functions.
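To make it concrete where the two caches come from, here is a rough sketch of a forward step that builds them. It is only an illustration, not the assignment's code: the function bodies are my own, and I am assuming that activation_cache simply holds Z (the pre-activation value) and that linear_cache holds (A_prev, W, b).

```python
import numpy as np

def linear_forward(A_prev, W, b):
    # Linear part of the layer: Z = W . A_prev + b
    Z = W @ A_prev + b
    # Assumed layout: everything linear_backward() will need later
    linear_cache = (A_prev, W, b)
    return Z, linear_cache

def linear_activation_forward(A_prev, W, b, activation):
    Z, linear_cache = linear_forward(A_prev, W, b)
    if activation == "sigmoid":
        A = 1.0 / (1.0 + np.exp(-Z))
    elif activation == "relu":
        A = np.maximum(0.0, Z)
    else:
        raise ValueError(f"unknown activation: {activation}")
    # Assumption: the activation cache is just Z
    activation_cache = Z
    # The tuple order matches the docstring: (linear_cache, activation_cache)
    cache = (linear_cache, activation_cache)
    return A, cache
```

Whatever is packed into the tuple here is exactly what gets unpacked in the backward step, in the same order.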
Hi @Fatma_Shaban ,
Firstly, two functions, sigmoid_backward() and relu_backward(), are provided for you to calculate dz, depending on which activation function is specified. So there is no need to call np.multiply() yourself.
Secondly, ‘cache’ consists of ‘linear_cache’ and ‘activation_cache’. When calling linear_backward(), the arguments passed to the function should be dz and the linear_cache, because the linear_cache contains the activation output ‘A’ of the previous layer, plus W and b, the weight matrix and bias of the current layer respectively. This is the information needed when backpropagating through the network. If you refer to the linear_forward() function, you can see what the linear cache contains.
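Putting the two points together, the routing looks roughly like the sketch below. Please treat it as an illustration under assumptions rather than the notebook's actual code: the helper bodies are my own stand-ins, and I am assuming activation_cache holds Z and linear_cache holds (A_prev, W, b).

```python
import numpy as np

def relu_backward(dA, activation_cache):
    Z = activation_cache              # assumption: the cache is Z
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0                    # ReLU gradient is 0 where Z <= 0
    return dZ

def sigmoid_backward(dA, activation_cache):
    Z = activation_cache              # assumption: the cache is Z
    s = 1.0 / (1.0 + np.exp(-Z))
    return dA * s * (1.0 - s)

def linear_backward(dZ, linear_cache):
    A_prev, W, b = linear_cache       # assumed layout of the linear cache
    m = A_prev.shape[1]               # number of examples
    dW = (dZ @ A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ
    return dA_prev, dW, db

def linear_activation_backward(dA, cache, activation):
    linear_cache, activation_cache = cache
    # Step 1: the activation helper gets dA and the ACTIVATION cache
    if activation == "relu":
        dZ = relu_backward(dA, activation_cache)
    elif activation == "sigmoid":
        dZ = sigmoid_backward(dA, activation_cache)
    else:
        raise ValueError(f"unknown activation: {activation}")
    # Step 2: the linear helper gets dZ and the LINEAR cache
    return linear_backward(dZ, linear_cache)
```

The key point is the pairing: dA goes with activation_cache, and the resulting dz goes with linear_cache. Swapping the caches usually shows up as a shape or unpacking error.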
Thanks for the clear explanation.


