Building Block NN cache Z,W,b

Hi Sir,

In the lecture video on the building blocks of deep neural networks, it was said that we need to cache W and b as well as Z. We have a couple of doubts about why this cache is needed. Can you please help clarify?

  1. In the picture below, we already computed Z[1] during forward propagation, so we can simply reuse it without computing it again. If so, a cache is not required for Z[1], right? As we understand it, the point of a cache is to avoid recomputing Z[1]. In backpropagation we do not need to compute Z[1] again, so we should not need to cache it. Why, then, did Professor Andrew Ng say that Z[1] needs to be cached?
    [Image: cache]

  2. If caching is required for Z[1], why is it also required for the W and b parameters? We could not understand the need to cache W and b, because we are not going to compute them, right? I don't see any computation of W and b in backpropagation; they are just randomly initialized. Can you please tell us why W and b need to be cached for backpropagation?

The point is that once we get to the fully general model here in Week 4, all of forward propagation happens first. Only then do we start back propagation, which proceeds one layer at a time. So, sure, we computed Z[1] during forward propagation at layer 1, but how do you access it when you are in the back propagation calculation for layer 1? Suppose you have a 3-layer network. You did forward prop for layers 1, 2 and 3, you have back propagated through layers 3 and 2, and now you are trying to do layer 1 using the formula you showed. Where do you get those “precomputed” values from? The answer is: you get them from the caches that you saved during forward propagation. The same applies to W: it is not recomputed, but the backward step for a layer reads it to pass the gradient back to the previous layer. Now, you can argue that the b value actually isn’t used in back prop. Fair enough, but if they left it out, then they’d need to explain why they did that.
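
To make that concrete, here is a minimal sketch of the idea in NumPy. It is not the assignment's code: the names `layer_forward` and `layer_backward`, the tuple layout of the cache, the use of ReLU in every layer, and the placeholder upstream gradient are all assumptions made for illustration. What it shows is that the backward step for a layer receives only the incoming gradient and that layer's cache, so anything it needs (Z, A_prev, W) has to have been saved during forward propagation.

```python
import numpy as np

def layer_forward(A_prev, W, b):
    """Forward step for one ReLU layer (hypothetical helper)."""
    Z = W @ A_prev + b
    A = np.maximum(0, Z)
    cache = (A_prev, W, b, Z)  # saved now, read later in the backward pass
    return A, cache

def layer_backward(dA, cache):
    """Backward step for one ReLU layer, using only dA and the cache."""
    A_prev, W, b, Z = cache    # b is unpacked but never used below
    m = A_prev.shape[1]
    dZ = dA * (Z > 0)          # ReLU derivative needs the stored Z
    dW = dZ @ A_prev.T / m     # needs the stored A_prev
    db = dZ.sum(axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ         # needs the stored W
    return dA_prev, dW, db

rng = np.random.default_rng(0)
layer_dims = [4, 3, 2, 1]      # input size followed by 3 layers (assumed sizes)
params = [(rng.standard_normal((n_out, n_in)) * 0.1, np.zeros((n_out, 1)))
          for n_in, n_out in zip(layer_dims[:-1], layer_dims[1:])]
X = rng.standard_normal((4, 5))

# Forward propagation runs through ALL layers first, saving one cache per layer.
A = X
caches = []
for W, b in params:
    A, cache = layer_forward(A, W, b)
    caches.append(cache)

# Back propagation then walks the layers in reverse, one cache at a time.
dA = np.ones_like(A)           # placeholder for the gradient of the cost w.r.t. the output
grads = []
for cache in reversed(caches):
    dA, dW, db = layer_backward(dA, cache)
    grads.append((dW, db))
grads.reverse()                # grads[i] now lines up with params[i]
```

Note that `b` rides along in the cache without being read in `layer_backward`, which matches the remark above that it isn't actually needed for back prop.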

Note that no one is saying this is the only way to accomplish this kind of task. E.g. another solution would be to maintain a global dictionary that holds everything you might need to reference later. So when you build this for yourself, you can do it any way you like. But for the purposes of this assignment, you need to work with the structures as they have designed them.
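
For comparison, the global dictionary alternative might look roughly like the sketch below. To keep it short it uses a toy network with one scalar unit per layer and ReLU everywhere; the `store` dictionary, its key names such as "Z1" and "W2", and the function names are hypothetical and not anything from the assignment.

```python
# Hypothetical global store holding everything back prop may need later.
store = {}

def forward(x, params):
    """Forward pass through scalar ReLU layers, recording values in `store`."""
    store["A0"] = x
    a = x
    for l, (w, b) in enumerate(params, start=1):
        z = w * a + b
        a = max(0.0, z)
        store[f"W{l}"] = w
        store[f"Z{l}"] = z
        store[f"A{l}"] = a
    return a

def backward(da, num_layers):
    """Backward pass that looks up saved values by layer number."""
    grads = {}
    for l in range(num_layers, 0, -1):
        dz = da if store[f"Z{l}"] > 0 else 0.0   # ReLU derivative uses stored Z
        grads[f"dW{l}"] = dz * store[f"A{l - 1}"]  # uses stored previous activation
        grads[f"db{l}"] = dz
        da = store[f"W{l}"] * dz                  # uses stored W
    return grads

params = [(0.5, 0.1), (2.0, -0.3)]   # (w, b) per layer, made-up values
out = forward(1.0, params)
grads = backward(1.0, len(params))   # 1.0 is a placeholder upstream gradient
```

Either way the same values get saved during the forward pass. The per-layer cache just keeps each layer's values bundled together instead of in one shared namespace.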
