This is our very first introduction to TensorFlow and there is a lot to learn. It’s been quite a while since I listened to these lectures, so I forget whether Prof Ng discusses how gradients are handled in TF. The notebook does mention it briefly and tells you to use only TF functions, but it doesn’t really explain why.
Just for completeness, I went back and ran the experiment I described above of using the numpy transpose in compute_total_cost. It passes all the test cases in the notebook until you reach the later section that trains the model. At that point it throws a big exception trace with this as the final error message:
ValueError: No gradients provided for any variable: ['Variable:0', 'Variable:0', 'Variable:0', 'Variable:0', 'Variable:0', 'Variable:0'].
So it doesn’t specifically say “avoid numpy”, but it does tell you that there is some kind of problem with the gradients, which is at least more informative than the kernel just dying …
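The mechanism is that TF’s GradientTape can only record TF operations. A numpy call like np.transpose silently converts the tensor to a plain ndarray, so everything downstream is disconnected from the Variable and the tape has nothing to differentiate. Here is a minimal sketch (not the course code; the variable W and the reduce_sum loss here are just illustrative):

```python
import numpy as np
import tensorflow as tf

W = tf.Variable([[1.0, 2.0], [3.0, 4.0]])

# Using the TF op: the tape records the transpose, so a gradient exists.
with tf.GradientTape() as tape:
    loss = tf.reduce_sum(tf.transpose(W))
print(tape.gradient(loss, W) is None)   # False: gradient flows back to W

# Using the numpy op: W is silently converted to a plain ndarray,
# so the result is just a constant as far as the tape is concerned.
with tf.GradientTape() as tape:
    loss = tf.reduce_sum(np.transpose(W))
print(tape.gradient(loss, W) is None)   # True: no gradient for W
```

That None gradient for every variable is exactly what the optimizer complains about with the “No gradients provided for any variable” ValueError.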