Point of Gradient Checking

I am not quite sure I understand the point of gradient checking. I understand the theory, and why the calculation is a sensible way to verify that a derivative has been computed correctly.

But I don’t see why we would apply this to neural networks. In the courses and small experiments I have done so far on my own, I never needed to calculate derivatives myself, since the ML packages do it automatically.

So I don’t understand why we would run a gradient check. Is it even possible that, for example, Keras “makes a mistake” during optimization?

I am especially surprised by Andrew’s statement that gradient checking has been useful to him many times.

Thank you in advance for the clarification.

Hello Adrian,

Gradient checking is necessary if you build your own neural network from scratch, because you then write the backpropagation code by hand and it can contain bugs. Keras is nice for prototyping and even for production, but for some specific applications you may need a tailor-made network.
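To make this concrete, here is a minimal sketch of a gradient check for a hand-written gradient. Everything in it is an assumption for illustration: the model is a single sigmoid neuron with two inputs, the data is made up, and the function names (`loss`, `analytic_grad`, `buggy_grad`, `numeric_grad`, `relative_error`) are hypothetical, not from the course or from Keras. It compares the hand-derived gradient against a central-difference estimate, and shows how a small bug in the hand-written version shows up as a large relative error.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(params, xs, ys):
    """Mean cross-entropy loss of one sigmoid neuron; params = [w1, w2, b]."""
    w1, w2, b = params
    total = 0.0
    for (x1, x2), y in zip(xs, ys):
        a = sigmoid(w1 * x1 + w2 * x2 + b)
        total += -(y * math.log(a) + (1 - y) * math.log(1 - a))
    return total / len(xs)

def analytic_grad(params, xs, ys):
    """Hand-derived gradient of the loss (what backprop would compute)."""
    w1, w2, b = params
    g = [0.0, 0.0, 0.0]
    for (x1, x2), y in zip(xs, ys):
        d = sigmoid(w1 * x1 + w2 * x2 + b) - y
        g[0] += d * x1
        g[1] += d * x2
        g[2] += d
    return [v / len(xs) for v in g]

def buggy_grad(params, xs, ys):
    """Same as analytic_grad, but with a deliberate bug: the x2 factor is missing."""
    w1, w2, b = params
    g = [0.0, 0.0, 0.0]
    for (x1, x2), y in zip(xs, ys):
        d = sigmoid(w1 * x1 + w2 * x2 + b) - y
        g[0] += d * x1
        g[1] += d  # bug: should be d * x2
        g[2] += d
    return [v / len(xs) for v in g]

def numeric_grad(f, params, eps=1e-5):
    """Central-difference estimate: (f(p + eps) - f(p - eps)) / (2 * eps) per parameter."""
    grad = []
    for i in range(len(params)):
        plus = list(params)
        minus = list(params)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2 * eps))
    return grad

def relative_error(g1, g2):
    """||g1 - g2|| / (||g1|| + ||g2||), using Euclidean norms."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(g1, g2)))
    norms = math.sqrt(sum(a * a for a in g1)) + math.sqrt(sum(b * b for b in g2))
    return diff / norms

# Made-up toy data and parameters, for illustration only.
xs = [(0.5, -1.2), (1.5, 0.3), (-0.7, 0.8), (2.0, -0.5)]
ys = [1, 0, 1, 0]
params = [0.1, -0.2, 0.05]

g_numeric = numeric_grad(lambda p: loss(p, xs, ys), params)
err_correct = relative_error(analytic_grad(params, xs, ys), g_numeric)
err_buggy = relative_error(buggy_grad(params, xs, ys), g_numeric)

print("relative error, correct gradient:", err_correct)
print("relative error, buggy gradient:  ", err_buggy)
```

The correct gradient should agree with the numerical estimate to many decimal places, while the buggy one differs by several orders of magnitude more. The check is only used while debugging, since the numerical estimate needs two loss evaluations per parameter and is far too slow for training.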

Another advantage of understanding gradient checking is that you understand what is happening under the hood. That’s why it’s taught in master’s programs. When you understand how libraries such as Keras work, you can debug problems much faster.

Hope this helps.


Hi Bahadir,
Thank you very much for your prompt reply; it was clarifying and helpful.
Best regards,
