How to get dZ input for conv_backward?

Here I would like to know how I can get dZ, which is the input of conv_backward. There are no specific steps or calculations for dZ in the programming exercise. I only found this sentence: "dZ: the gradient of the cost with respect to the output of the conv layer Z at the hth row and wth column (corresponding to the dot product taken at the ith stride left and jth stride down)". I would like a more specific calculation or explanation. Thanks a lot.

dZ is the ‘Z’ output of conv_forward.

So you mean that for def conv_backward(dZ, cache), dZ can be obtained from
def conv_forward(A_prev, W, b, hparameters): … return Z, cache
i.e. dZ = Z?

Yes, that appears to be how the assignment is designed.

The reason it’s confusing is that conv_forward() never really gives a definition for ‘Z’.
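Concretely, the notebook wires the two together roughly like the minimal sketch below, assuming the assignment's conv_forward and conv_backward are already defined in scope (the shapes are just illustrative test values):

```python
import numpy as np

np.random.seed(1)
A_prev = np.random.randn(10, 4, 4, 3)   # 10 examples, 4x4 spatial size, 3 channels
W = np.random.randn(2, 2, 3, 8)         # 8 filters of size 2x2 over 3 channels
b = np.random.randn(1, 1, 1, 8)
hparameters = {"pad": 2, "stride": 2}

# Forward pass: Z is the conv output, cache stores everything backprop needs
Z, cache = conv_forward(A_prev, W, b, hparameters)

# The test simply feeds Z in as dZ, since it has the right shape
dA_prev, dW, db = conv_backward(Z, cache)
```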


Thanks so much! Got it

Btw, I would like to know: if in the conv layer dZ = Z, how does the difference between the fully connected layer's output and Y_train back-propagate to the conv layer to optimize the weights and bias? It seems that conv_forward(A_prev, W, b, hparameters) and conv_backward(dZ, cache) become a closed loop, and W and b can be updated without any contribution from the max pooling or fully connected layers if dZ = Z.

I think this is a good question. The answer is dZ is not equal to Z.

What is the starting point of back-propagation? It is the loss function, which evaluates the difference between the expected values (in supervised learning) and the computed values.
And if you think about the forward propagation step, typically an activation function \sigma is applied to Z. Then we get a, which is passed on to the next layer.

Starting from the cost function, we first calculate \frac{\partial L}{\partial a}, then \frac{\partial L}{\partial z}, and so on. So dZ comes from an upper layer.
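As a small concrete example, assuming a ReLU activation (the exercise does not specify one), this is how the gradient dA arriving from the layer above would be turned into the dZ that conv_backward receives:

```python
import numpy as np

def relu_backward(dA, Z):
    # dZ = dA * g'(Z); for ReLU, g'(Z) is 1 where Z > 0 and 0 elsewhere
    return dA * (Z > 0)

np.random.seed(0)
Z = np.random.randn(2, 3, 3, 4)    # pre-activation output of the conv layer
dA = np.random.randn(2, 3, 3, 4)   # gradient arriving from the layer above
dZ = relu_backward(dA, Z)          # this is the dZ passed to conv_backward
print(dZ.shape)                    # (2, 3, 3, 4), same shape as Z
```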

The important thing is that once dZ is given from the upper layer (or from the activation function of the same layer), you need to calculate dW and db, which are the "weights" of this convolution layer. A "weight" update for a convolution layer is actually a "filter" update, so it is quite important to update those filters based on the losses back-propagated from the higher layers.
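To make that concrete, here is a simplified sketch of the core computation, assuming stride 1 and no padding (so this is not the notebook's exact conv_backward, just the idea of how dZ drives dW, db and dA_prev):

```python
import numpy as np

def conv_backward_simple(dZ, A_prev, W):
    """Simplified conv backprop: stride 1, no padding.
    dZ: (m, n_H, n_W, n_C), A_prev: (m, n_H+f-1, n_W+f-1, n_C_prev), W: (f, f, n_C_prev, n_C)."""
    m, n_H, n_W, n_C = dZ.shape
    f = W.shape[0]
    dA_prev = np.zeros_like(A_prev)
    dW = np.zeros_like(W)
    db = np.zeros((1, 1, 1, n_C))
    for i in range(m):                      # loop over examples
        for h in range(n_H):                # loop over output rows
            for w in range(n_W):            # loop over output columns
                for c in range(n_C):        # loop over filters
                    a_slice = A_prev[i, h:h+f, w:w+f, :]
                    # Z[i,h,w,c] was the dot product of W[:,:,:,c] with a_slice (+ b),
                    # so its upstream gradient dZ[i,h,w,c] flows back to both factors
                    dA_prev[i, h:h+f, w:w+f, :] += W[:, :, :, c] * dZ[i, h, w, c]
                    dW[:, :, :, c] += a_slice * dZ[i, h, w, c]
                    db[:, :, :, c] += dZ[i, h, w, c]
    return dA_prev, dW, db
```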

For this exercise, we do not have a loss function, and we do not have any further definition of the network either.
Since the actual values of dZ depend on the whole network structure above this layer, what we can do is reuse the forward-propagation function to get Z, and use Z as an example of dZ, since the shapes are the same.

So, dZ is not equal to Z. We just borrow Z as an initial value for dZ to calculate dW and db.

It would be a nice enhancement if this was explained in the notebook.

Thanks so much! That's how I understand it. I was just stuck on dZ in the conv layer. As you said, it depends on the entire network structure above that layer. In a fully connected network, we can simply use dZ^{[l]} = W^{[l+1]T} dZ^{[l+1]} \odot g'(Z^{[l]}) to derive dZ (a quick numpy sketch of this step is below), and dZ^{[l+1]} carries the information from the upper layer. If we just reuse Z in the conv layer as an example of dZ, the information from the upper layers (like the fully connected layers) cannot back-propagate to the conv layer. In Python we can let TensorFlow handle the complicated calculation inside this part, but I still want to work it out by hand. Do you have any resource or link that illustrates the detailed CNN dZ derivation? Thank you so much!
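Here is the fully connected step I mean, as a small numpy sketch (assuming ReLU for g and made-up layer sizes):

```python
import numpy as np

def relu_grad(Z):
    return (Z > 0).astype(float)

np.random.seed(2)
# Made-up sizes: layer l has 4 units, layer l+1 has 3 units, batch of 5 examples
Z_l     = np.random.randn(4, 5)    # pre-activations of layer l
dZ_next = np.random.randn(3, 5)    # dZ of layer l+1, already computed upstream
W_next  = np.random.randn(3, 4)    # weights mapping layer l -> layer l+1

# dZ[l] = W[l+1].T @ dZ[l+1] * g'(Z[l])
dZ_l = (W_next.T @ dZ_next) * relu_grad(Z_l)
print(dZ_l.shape)                  # (4, 5), same shape as Z_l
```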