When to use Gradient Checking?

Hello
I understand how gradient checking works and what it is used for, but I am not clear on what types of bugs would call for it, other than an incorrect implementation of the mathematical formulas for the gradients. And would the model even converge in such a case? If it does converge, how would one know that a gradient check should be run at all?
Are there any other types of bugs or scenarios that would prompt the use of a gradient check? And how might they manifest themselves?
Thank you!

That is exactly the case that it is intended for.

That’s a good point. It would all depend on the exact nature of the bug(s), I suppose. You could imagine a case in which it converges, but not as quickly as it should because the gradient values were incorrectly computed to be lower than they should have been. But that’s only one particular type of bug that one can imagine. It’s also possible that the incorrect gradients wouldn’t even point in the right directions, so you’d get divergence no matter what learning rate you chose.
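
To make that concrete, here is a minimal sketch of how a gradient check exposes exactly that first kind of bug (this is a made-up toy example, not from the course notebooks; the `loss`, `buggy_grad`, and `grad_check` names are invented for illustration). A gradient that is off by a constant factor still points downhill, so gradient descent converges, just more slowly, but the finite-difference comparison flags it immediately:

```python
def loss(theta):
    # Toy cost function: J(theta) = theta^2, so dJ/dtheta = 2 * theta
    return theta ** 2

def buggy_grad(theta):
    # Buggy "backprop": the factor of 2 was dropped. Gradient descent
    # still converges with this (it just takes smaller steps), so the
    # bug is easy to miss without a gradient check.
    return theta

def grad_check(theta, eps=1e-7):
    # Centered finite-difference approximation of dJ/dtheta
    numeric = (loss(theta + eps) - loss(theta - eps)) / (2 * eps)
    analytic = buggy_grad(theta)
    # Relative difference, as in the course; anything much larger
    # than ~1e-7 signals a bug in the analytic gradient
    return abs(numeric - analytic) / (abs(numeric) + abs(analytic))

print(grad_check(3.0))  # ~0.333, far above 1e-7: the gradient is wrong
```

With the correct gradient (`2 * theta`), the same check returns a value down at the level of floating-point rounding error, well below the usual 1e-7 threshold.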

But the higher-level point here is that in “real world” solutions these days, nobody builds their own gradient algorithms any more: everyone uses a framework like TensorFlow or PyTorch or the like. In those systems, the gradients are computed for you by “automatic differentiation”, which applies the chain rule to the known analytic derivatives of each elementary operation in the computational graph (the frameworks do include hand-written analytic gradients for all the commonly used functions, e.g. the standard activation functions). That is different from “numeric differentiation”, which approximates the gradients with finite differences and is the very same idea as Gradient Checking. So maybe the pedagogical value of this section is to point out that you can actually approximate the gradients with an algorithm, although it’s more expensive to compute than using the analytic derivatives.
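
As a small illustration of the difference between the two (a sketch in PyTorch, since it was mentioned above; the sigmoid example is arbitrary):

```python
import torch

# Automatic differentiation: PyTorch records each elementary op and
# chains their known analytic derivatives together (reverse mode).
x = torch.tensor(3.0, dtype=torch.float64, requires_grad=True)
y = torch.sigmoid(x)
y.backward()
analytic = x.grad.item()

# Numeric differentiation: centered finite differences, which is the
# same idea as Gradient Checking.
eps = 1e-6
with torch.no_grad():
    numeric = ((torch.sigmoid(x + eps) - torch.sigmoid(x - eps)) / (2 * eps)).item()

print(analytic)  # sigmoid'(3) = sigmoid(3) * (1 - sigmoid(3)) ≈ 0.0452
print(numeric)   # agrees with the analytic value to many decimal places
```

Incidentally, PyTorch ships exactly this comparison as `torch.autograd.gradcheck`, so gradient checking is alive and well inside the frameworks themselves.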

We’ll get introduced to TF in Week 3 of DLS C2, so stay tuned for that.
