Hey @Reinier_de_Valk,
The latter is true in the case of backprop. Let's understand it with the help of a very simple example: the convolution operation. We will examine the `conv_forward` and `conv_backward` functions to see how this is the case.
In the `conv_forward` function, we can see that `A_prev`, `W`, and `b` are the inputs, and `Z` is the output (here, I am considering only some of the inputs and outputs). `A_prev` denotes the output activations from the previous layer, and `Z` is used to compute the activations for the current layer, i.e., the input is on the left and the output is on the right (visually).
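To make that left-to-right flow concrete, here is a minimal NumPy sketch of a conv forward pass. It assumes square filters, a "valid" convolution (no padding), and a default stride of 1; the assignment's actual function differs in details (e.g., padding, hyperparameters, and a returned cache), but the direction of the flow is the same:

```python
import numpy as np

def conv_forward(A_prev, W, b, stride=1):
    """Sketch only: A_prev, W, b go in, Z comes out (no padding, square filters)."""
    m, n_H_prev, n_W_prev, n_C_prev = A_prev.shape  # activations from the previous layer
    f, _, _, n_C = W.shape                          # filters of the current layer
    n_H = (n_H_prev - f) // stride + 1
    n_W = (n_W_prev - f) // stride + 1
    Z = np.zeros((m, n_H, n_W, n_C))
    for i in range(m):
        for h in range(n_H):
            for w in range(n_W):
                for c in range(n_C):
                    vs, hs = h * stride, w * stride
                    a_slice = A_prev[i, vs:vs + f, hs:hs + f, :]
                    Z[i, h, w, c] = np.sum(a_slice * W[..., c]) + b[..., c].item()
    return Z  # pre-activation output of the current conv layer
```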
On the contrary, in the `conv_backward` function, `dZ` is the input and `dA_prev`, `dW`, and `db` are the outputs. Here, `dZ` represents the gradients with respect to the current conv layer's output, and `dA_prev` represents the gradients with respect to the activations of the previous layer, i.e., the input is on the right and the output is on the left (visually).
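And here is a matching sketch of the backward pass, where `dZ` goes in and `dA_prev`, `dW`, `db` come out, i.e., the flow runs right to left. In the assignment, `A_prev`, `W`, and `b` would come from the cache saved during the forward pass; I pass them explicitly here just to keep the sketch self-contained:

```python
def conv_backward(dZ, A_prev, W, b, stride=1):
    """Sketch only: dZ comes in, dA_prev, dW, db come out (matches the forward sketch above)."""
    f, _, _, n_C = W.shape
    m, n_H, n_W, _ = dZ.shape
    dA_prev = np.zeros_like(A_prev)  # gradient w.r.t. the previous layer's activations
    dW = np.zeros_like(W)            # gradient w.r.t. the filters
    db = np.zeros_like(b)            # gradient w.r.t. the biases
    for i in range(m):
        for h in range(n_H):
            for w in range(n_W):
                for c in range(n_C):
                    vs, hs = h * stride, w * stride
                    a_slice = A_prev[i, vs:vs + f, hs:hs + f, :]
                    dA_prev[i, vs:vs + f, hs:hs + f, :] += W[..., c] * dZ[i, h, w, c]
                    dW[..., c] += a_slice * dZ[i, h, w, c]
                    db[..., c] += dZ[i, h, w, c]
    return dA_prev, dW, db
```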
Now, since you have managed to confuse yourself, allow me to confuse you a little bit more, and if possible, dissolve that confusion as well.
In the `conv_backward` function, the docstring says `dA_prev -- gradient of the cost with respect to the input of the conv layer (A_prev)`. Now, here, you might get confused about what exactly "input" is referring to: the right or the left. Remember that in this docstring, "input" refers to `A_prev`, i.e., the activations and not any gradients, so it should naturally bring to mind the picture of forward propagation. It is just a reference to forward propagation made inside the backward propagation, so make sure this doesn't confuse you. I hope this helps.
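If you want to convince yourself that "input" really means the forward-prop input `A_prev`, a tiny check with the two sketches above shows that `dA_prev` has exactly the same shape as `A_prev` (and likewise `dW` matches `W`, `db` matches `b`):

```python
np.random.seed(1)
A_prev = np.random.randn(2, 5, 5, 3)  # activations from the previous layer
W = np.random.randn(3, 3, 3, 4)       # filters
b = np.random.randn(1, 1, 1, 4)       # biases
Z = conv_forward(A_prev, W, b)
dZ = np.ones_like(Z)                  # pretend upstream gradient
dA_prev, dW, db = conv_backward(dZ, A_prev, W, b)
print(dA_prev.shape == A_prev.shape)              # True
print(dW.shape == W.shape, db.shape == b.shape)   # True True
```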
Regards,
Elemento