What's the meaning of "np.linalg.norm(grad) + np.linalg.norm(gradapprox)"?

numerator = np.linalg.norm(grad - gradapprox)
denominator = np.linalg.norm(grad) + np.linalg.norm(gradapprox)
difference = numerator / denominator

Hi friend,

In the grad checking, I think the first line, "np.linalg.norm(grad - gradapprox)", already does everything. If grad = gradapprox, that just means

np.linalg.norm(grad - gradapprox) = 0

So if this first line is big, something is wrong; if it is small, all is good. Am I right? It seems like that one line already does everything.

Here is my question: what is the meaning of the denominator, and why do we then need to compute difference = numerator / denominator? What are we doing here? It seems like we are not satisfied with A - B, and instead want (A - B) / (A + B)? Thank you!

The point is that it's a question of scale, right? If you just tell me that the length of the difference is 1.0, how do I know whether that's a big difference or not? If the original vectors have length 1,000,000, then a difference of 1.0 is pretty small. If the original vectors have length 2, then a difference of 1.0 is a big deal.
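To make that concrete, here is a small sketch (my own illustration, not part of the assignment; the function name relative_difference is just made up for the demo). It computes the same relative difference for two pairs of vectors whose raw difference has norm 1.0, but whose overall scales differ:

import numpy as np

def relative_difference(grad, gradapprox):
    # Same formula as in the assignment snippet above
    numerator = np.linalg.norm(grad - gradapprox)
    denominator = np.linalg.norm(grad) + np.linalg.norm(gradapprox)
    return numerator / denominator

# Two huge vectors whose difference has norm 1.0
big = np.full(4, 500_000.0)
print(relative_difference(big, big + 0.5))      # ~5e-7: the 1.0 difference is negligible

# Two small vectors whose difference also has norm 1.0
small = np.array([1.0, 1.0, 0.0, 0.0])
print(relative_difference(small, small + 0.5))  # ~0.27: the same 1.0 difference is a big deal

The numerator is 1.0 in both cases; only the denominator, i.e. the scale, changes, which is exactly why the raw norm of the difference by itself can't tell you whether the gradient is wrong.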

Oh, that makes sense. Yes, I think it's a scale question.
May I ask, why is norm(A) + norm(B) the scale for norm(A - B)? Why not something else? My bad, I know this question is silly… but thank you!

You could use the norm of just one of the vectors for the scale. If you choose to do that, you'd probably want to use gradapprox, not grad, since the whole point of this exercise is that the "real" grad may be wrong; that's what we're trying to diagnose here, of course. But it doesn't hurt to use both, so you could say this is a convention, not an ironclad rule. If you decide to use only the norm of gradapprox in the denominator, you'd probably want to make the threshold value for success about twice as large.
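To illustrate (again just a sketch, not the assignment's code; the function names are mine), here are the two conventions side by side. When grad is close to gradapprox, the sum of the two norms is roughly twice the single norm, which is why the success threshold would need to be about twice as large:

import numpy as np

def diff_both_norms(grad, gradapprox):
    # Convention used in the assignment: scale by ||grad|| + ||gradapprox||
    return np.linalg.norm(grad - gradapprox) / (np.linalg.norm(grad) + np.linalg.norm(gradapprox))

def diff_gradapprox_only(grad, gradapprox):
    # Alternative convention: scale only by ||gradapprox||
    return np.linalg.norm(grad - gradapprox) / np.linalg.norm(gradapprox)

grad = np.array([1.0, 2.0, 3.0])
gradapprox = grad + 1e-7              # pretend the two agree up to ~1e-7

print(diff_both_norms(grad, gradapprox))        # ~2.3e-8
print(diff_gradapprox_only(grad, gradapprox))   # ~4.6e-8, roughly twice as large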