C4_W1, programming assignment 1, backprop: terminology confusion

Hi all,

I managed to confuse myself going through the (ungraded) backprop part of this programming assignment. My issue is with the use of the terms ‘input’ and ‘output’ of a layer. In forward prop, they are unambiguous: input is what goes into a layer, i.e., what is visually ‘to the left’ of the layer, and output what is ‘to the right’.

Backprop, however, flows from right to left. Do input and output of a layer (and, by extension, things like A_prev) mean the same here? Or should we now consider input what is visually to the right of a layer, and output what is to the left?

Going by the docstrings in conv_backward() and pool_backward(), the former seems to be the case. Am I right?

Thank you for any help!

Hey @Reinier_de_Valk,
The latter is true for backprop. Let’s understand it with the help of a very simple example. Consider the convolution operation: we will examine the conv_forward and conv_backward functions to see how this is the case.

In the conv_forward function, we can see that A_prev, W and b are the inputs, and Z is the output (considering only the main inputs and outputs here). A_prev denotes the output activations from the previous layer, and Z is used to compute the activations for the current layer, i.e., the input is on the left and the output is on the right (visually).
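To make the direction concrete, here is a minimal sketch of the forward computation (my own simplification: stride 1, no padding, and no hparameters or cache, unlike the actual assignment function):

```python
import numpy as np

def conv_forward_sketch(A_prev, W, b):
    """Simplified conv forward: A_prev, W, b go in (left); Z comes out (right)."""
    m, n_H_prev, n_W_prev, n_C_prev = A_prev.shape  # activations from layer l-1
    f = W.shape[0]                                  # filter size; W is (f, f, n_C_prev, n_C)
    n_C = W.shape[-1]
    n_H, n_W = n_H_prev - f + 1, n_W_prev - f + 1   # stride 1, no padding
    Z = np.zeros((m, n_H, n_W, n_C))
    for i in range(m):
        for h in range(n_H):
            for w in range(n_W):
                for c in range(n_C):
                    a_slice = A_prev[i, h:h + f, w:w + f, :]
                    Z[i, h, w, c] = np.sum(a_slice * W[:, :, :, c]) + float(b[0, 0, 0, c])
    return Z  # pre-activations of the current layer l

# e.g. A_prev of shape (2, 5, 5, 3) with eight 3x3 filters gives Z of shape (2, 3, 3, 8)
```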

Conversely, in the conv_backward function, dZ is the input and dA_prev, dW and db are the outputs. Here, dZ represents the gradients with respect to the current conv layer’s output, and dA_prev represents the gradients with respect to the activations of the previous layer, i.e., the input is on the right and the output is on the left (visually).
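And a matching backward sketch under the same simplifying assumptions, showing dZ flowing in (from the right) and dA_prev, dW and db flowing out (to the left). Note that the real assignment function retrieves A_prev and W from a cache rather than taking them as arguments:

```python
import numpy as np

def conv_backward_sketch(dZ, A_prev, W):
    """Simplified conv backward: dZ goes in (right); dA_prev, dW, db come out (left)."""
    m, n_H, n_W, n_C = dZ.shape                 # gradient w.r.t. this layer's output Z
    f = W.shape[0]
    dA_prev = np.zeros_like(A_prev)             # gradient w.r.t. previous layer's activations
    dW = np.zeros_like(W)                       # gradient w.r.t. the filters
    db = dZ.sum(axis=(0, 1, 2)).reshape(1, 1, 1, n_C)  # gradient w.r.t. the biases
    for i in range(m):
        for h in range(n_H):
            for w in range(n_W):
                for c in range(n_C):
                    dA_prev[i, h:h + f, w:w + f, :] += W[:, :, :, c] * dZ[i, h, w, c]
                    dW[:, :, :, c] += A_prev[i, h:h + f, w:w + f, :] * dZ[i, h, w, c]
    return dA_prev, dW, db
```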

Now, since you have managed to confuse yourself, allow me to confuse you a little bit more, and if possible, dissolve that confusion as well.

In the conv_backward function’s docstring, it says: dA_prev -- gradient of the cost with respect to the input of the conv layer (A_prev). Here, you might wonder what exactly ‘input’ refers to, the right or the left. Remember that in this docstring, ‘input’ refers to A_prev, i.e., the activations and not any gradients, so it should naturally bring forward propagation to mind. It is simply a reference to forward propagation appearing inside the backward propagation code, so don’t let it confuse you. I hope this helps.
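For reference, the docstring entry in question reads roughly like this (quoting from memory; the shape annotation is how I recall it):

```python
def conv_backward(dZ, cache):
    """
    ...
    Returns:
    dA_prev -- gradient of the cost with respect to the input of the conv layer (A_prev),
               numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)
    ...
    """
```

Here, “the input of the conv layer” means the forward-prop input, exactly as described above.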

Regards,
Elemento


Hi @Elemento,

Thank you very much for the explanation, I think it clicked now!

I guess the takeaway for me is not to pay too much attention to descriptions such as ‘input’ and ‘output’, but instead to look at the variable names and keep in mind how things work.

Suppose we have a convolutional layer l. In forward prop, then,

  • A_prev, W and b are the inputs (visually, left) for l
  • Z is the output (visually, right) for l

while in backward prop,

  • dZ is the input (visually, right) for l
  • dA_prev, dW and db are the outputs (visually, left) for l
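Or, expressed as a call pattern using the sketch functions above (hypothetical names, not the assignment’s actual API):

```python
Z = conv_forward_sketch(A_prev, W, b)                  # forward: A_prev, W, b in; Z out
dA_prev, dW, db = conv_backward_sketch(dZ, A_prev, W)  # backward: dZ in; dA_prev, dW, db out
```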

Is this correct?

Yes @Reinier_de_Valk,
You are absolutely correct. The variable names can change, and depending on the type of network and the types of layers involved, the inputs and outputs can change as well.

But once you have become comfortable with the flow of forward propagation and backward propagation, I don’t think you will have any problem understanding what the input and the output are for each layer.

Regards,
Elemento