Week 4 Assignment 2 Exercise 6 Issue

Thanks for adding the point about the potential effect on the gradients. I had not thought of that. Note that none of the variables in question here are trainable, so they are not directly updated by backprop, but they would still be factors in the gradient computation. You can see in my example thread that I used numpy or straight python for the integer arithmetic pieces in some of the formulations, and it all still works fine. Normally, if you insert a numpy operation anywhere in the compute graph that matters, it “throws” in an obvious way at tf.GradientTape time. E.g. even something as simple as using np.transpose where you should use tf.transpose will fail. Here’s an example of that from DLS C5, which also points to another case in DLS C2 W3.
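
As a rough illustration of both points (separate from the linked threads), here is a minimal sketch assuming TF 2.x in eager mode. The variable `w`, the helper `grad_of_sum_sq` and the toy loss are made up for the example and are not from the assignment. In eager mode the numpy version does not raise by itself: the tape simply loses track of `w` and returns `None` for the gradient, and the visible error typically shows up later, e.g. when the optimizer is handed no gradients. Inside a compiled graph the failure is usually an immediate exception instead.

```python
import numpy as np
import tensorflow as tf

# Hypothetical trainable parameter, just for this illustration.
w = tf.Variable([[1.0, 2.0], [3.0, 4.0]])

def grad_of_sum_sq(transpose_fn):
    """Gradient of sum(transpose(w)**2) with respect to w."""
    with tf.GradientTape() as tape:
        wt = transpose_fn(w)                 # tf op: recorded; numpy op: not recorded
        loss = tf.reduce_sum(tf.square(wt))
    return tape.gradient(loss, w)

grad_tf = grad_of_sum_sq(tf.transpose)       # stays on the tape
grad_np = grad_of_sum_sq(np.transpose)       # leaves the tape, path to w is lost

# Plain Python integer arithmetic on static shapes only produces a constant
# factor, so it does not break the gradient path.
n_elems = w.shape[0] * w.shape[1]            # ordinary Python int
with tf.GradientTape() as tape:
    scaled_loss = tf.reduce_sum(tf.square(w)) / n_elems
grad_scaled = tape.gradient(scaled_loss, w)

print("tf.transpose gradient:", grad_tf)
print("np.transpose gradient:", grad_np)
print("gradient with Python int factor:", grad_scaled)
```

The design point is that only operations applied to tensors that lead back to a trainable variable need to be TF ops. Shape bookkeeping and other integer arithmetic can stay in numpy or plain python.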