Hi

I’m working on gradient checking for an N-layer neural network. The approximation of the gradients is very close to the real gradients used in backpropagation. But when I use the equation presented in the lecture, the result is unbelievably high.

For example, the gradient is 1.26596283e-06 and its approximation is -1.26620936e-06.

I’ve checked the gradients and all of them look like the example above. But when I compute ||dtheta_approx - dtheta|| / ||dtheta_approx|| + ||dtheta||, I get 16.7335560670404, which is very, very big.

Does anybody have any idea?

Please implement the equation correctly keeping in mind that brackets matter for computing the denominator term:

difference = \frac {\| grad - gradapprox \|_2}{\| grad \|_2 + \| gradapprox \|_2 } \tag{3}
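A minimal NumPy sketch of equation (3), using made-up gradient values purely for illustration. The key point is that the parentheses group the *entire* denominator:

```python
import numpy as np

# Hypothetical stand-ins for the backprop gradient and its
# finite-difference approximation (values invented for illustration).
grad = np.array([1.26596283e-06, -2.5e-07, 3.1e-06])
gradapprox = np.array([1.26620936e-06, -2.49e-07, 3.0999e-06])

# Equation (3): the WHOLE denominator is inside one set of parentheses.
numerator = np.linalg.norm(grad - gradapprox)
denominator = np.linalg.norm(grad) + np.linalg.norm(gradapprox)
difference = numerator / denominator

print(difference)  # small when the two gradients agree
```

When the two gradient vectors agree closely, `difference` comes out tiny (well below 1e-3 here); a value in the double digits signals a bug in either the formula or the inputs.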

Thanks for your answer, but I’ve implemented exactly what you’ve presented here regarding the position of the brackets.

This is what I’ve done:

difference = np.linalg.norm(grad - gradapprox) / (np.linalg.norm(grad) + np.linalg.norm(gradapprox))

Please look at your original post. You are calculating

difference = \frac {\| grad - gradapprox \|_2}{\| grad \|_2} + \| gradapprox \|_2
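To see why the bracket placement matters, here is a quick sketch with invented numbers. Without the parentheses, only `||grad||` divides the numerator, and `||gradapprox||` is then *added* to the result, which dominates the answer:

```python
import numpy as np

# Invented example: two nearly identical gradient vectors.
grad = np.array([0.5, -1.2, 3.0])
gradapprox = grad + 1e-7  # agrees with grad to ~1e-7 per entry

# Correct: parentheses around the whole denominator.
correct = np.linalg.norm(grad - gradapprox) / (
    np.linalg.norm(grad) + np.linalg.norm(gradapprox))

# Mis-bracketed: divide by ||grad|| only, then ADD ||gradapprox||.
wrong = np.linalg.norm(grad - gradapprox) / np.linalg.norm(grad) \
        + np.linalg.norm(gradapprox)

print(correct, wrong)
```

Here `correct` is tiny, while `wrong` is roughly the size of `||gradapprox||` itself, i.e. a large number that has nothing to do with how well the gradients agree.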

If your implementations of `gradapprox` and `grad` are correct, please click my name and message your notebook as an attachment.

Thanks a million. I sent my code.

Please share the full notebook, not just the function and its supporting code.

Thanks again for your answers.

I finally found the solution: I used np.squeeze(grad_approx), and the difference became 2.7245835408876563e-08, as I expected.
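For anyone hitting the same symptom: here is a minimal sketch of the shape mismatch that np.squeeze fixes (the shapes below are assumed for illustration). If one gradient is a column vector of shape (n, 1) and the other is a flat array of shape (n,), their difference broadcasts to an (n, n) matrix, which inflates the numerator norm:

```python
import numpy as np

# Hypothetical shapes: backprop returns a column vector, while the
# numerical approximation is a flat 1-D array.
grad = np.array([[1.0e-06], [-2.0e-06], [3.0e-06]])   # shape (3, 1)
grad_approx = np.array([1.0e-06, -2.0e-06, 3.0e-06])  # shape (3,)

# (3,1) - (3,) broadcasts to a (3,3) matrix, so the numerator explodes
# even though the two gradients are element-wise identical.
bad = np.linalg.norm(grad - grad_approx) / (
    np.linalg.norm(grad) + np.linalg.norm(grad_approx))

# np.squeeze drops the singleton axis so both operands are shape (3,).
good = np.linalg.norm(np.squeeze(grad) - grad_approx) / (
    np.linalg.norm(np.squeeze(grad)) + np.linalg.norm(grad_approx))

print(bad, good)  # bad is large; good is essentially zero
```

So a "huge" difference value doesn't always mean the backprop math is wrong; a silent broadcast between mismatched shapes produces the same symptom.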