I have a question regarding this lecture, just asking out of curiosity; please feel free to ignore it if it does not make sense.
In this video, at the 5:25 timestamp, Professor Andrew Ng explains that you could calculate dA0, which is the gradient propagated from the first hidden layer back to the input layer X.
My questions are:
- Can this dA0 value be considered some sort of "neural network" error?
- Are there any techniques to leverage this error in our network and somehow forward-propagate this "back propagation" error?
The point Prof Ng is making here is that it is just an “artifact” of the way back propagation works that you end up generating a gradient for A0 (X) at the first hidden layer. Of course the point of running back propagation at that layer is that you do need the gradients for W1 and b1. And just because of the way the general algorithm works, you end up generating dA0 as a side effect. But there is literally no use for that value: what would it mean to “improve” the inputs? The inputs are the inputs: that’s kind of the point, right? So we simply ignore dA0. Of course dA1 and dA2 (etc) are used in the calculations for the relevant layers.
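For concreteness, here is a minimal sketch of the linear-backward step at the first hidden layer, in the course's notation (dZ1, W1, b1, and A0 = X). The variable names and shapes here are illustrative, not the actual assignment code; the point is just that the same generic formula that produces dW1 and db1 also produces dA0, which is then thrown away.

```python
import numpy as np

def linear_backward_layer1(dZ1, A0, W1):
    """Generic linear-backward rule applied at the first hidden layer.
    In the course's notation, A0 is just the input X."""
    m = A0.shape[1]                               # number of examples
    dW1 = dZ1 @ A0.T / m                          # gradient we actually need
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m  # gradient we actually need
    dA0 = W1.T @ dZ1                              # produced by the same formula...
    return dW1, db1, dA0                          # ...but dA0 is never used

# Illustrative shapes: 3 input features, 4 hidden units, 5 examples
rng = np.random.default_rng(0)
dZ1 = rng.standard_normal((4, 5))
A0 = rng.standard_normal((3, 5))
W1 = rng.standard_normal((4, 3))
dW1, db1, _ = linear_backward_layer1(dZ1, A0, W1)  # dA0 simply discarded
```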
That makes sense, thanks for explaining, @paulinpaloalto.
I suppose that in a well-functioning network dA0 is, or should be, relatively small (since the network fits the input data quite well)? At least that's what I'd think intuitively.
That’s an interesting intuition! You could instrument your code to see whether that happens or not. E.g. compute the 2-norm of dA0 as a measure of how “big” the gradients are in aggregate and see if it decreases as the training converges to a better and better solution. It would be interesting to know if you learn anything from that type of investigation!
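To make that concrete, here's a toy sketch of that kind of instrumentation (not the course assignment code): a tiny one-hidden-layer network trained on synthetic data with gradient descent, logging the 2-norm of dA0 alongside the cost every few hundred iterations. The data, architecture, and hyperparameters are just placeholders for illustration.

```python
import numpy as np

# Toy experiment: does ||dA0|| shrink as the cost decreases?
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 200))                # A0: 3 features, 200 examples
Y = (X[0:1, :] + X[1:2, :] > 0).astype(float)    # arbitrary binary labels

W1, b1 = rng.standard_normal((4, 3)) * 0.1, np.zeros((4, 1))
W2, b2 = rng.standard_normal((1, 4)) * 0.1, np.zeros((1, 1))
lr, m = 0.5, X.shape[1]

for i in range(2001):
    # Forward pass: tanh hidden layer, sigmoid output
    Z1 = W1 @ X + b1; A1 = np.tanh(Z1)
    Z2 = W2 @ A1 + b2; A2 = 1 / (1 + np.exp(-Z2))

    # Backward pass
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m; db2 = dZ2.sum(axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)
    dW1 = dZ1 @ X.T / m;  db1 = dZ1.sum(axis=1, keepdims=True) / m
    dA0 = W1.T @ dZ1                             # the "unused" gradient w.r.t. X

    if i % 500 == 0:
        cost = -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))
        print(f"iter {i}: cost {cost:.4f}, ||dA0|| {np.linalg.norm(dA0):.4f}")

    # Gradient descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```

Whether the norm actually trends downward will depend on the data and the architecture, so treat it as an experiment rather than a guaranteed result.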
But the overall point is that we have no direct use for dA0 and just end up ignoring it.