Can someone explain the idea behind this formula, and why the denominator term is introduced?
An example would be of great help.
Hello @Kamal_Nayan,
Here is the explanation of the lecture:
The role of the denominator is to cover the case where any of these vectors is really small or really large: the denominator turns this formula into a ratio.
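For reference, here is the formula as I understand it from the question. This is a sketch that assumes the usual gradient checking notation, where $d\theta$ is the gradient from backpropagation and $d\theta_{\text{approx}}$ is the numerical approximation; the symbol names are my assumption, not copied from the post:

$$
\text{difference} = \frac{\lVert d\theta_{\text{approx}} - d\theta \rVert_2}{\lVert d\theta_{\text{approx}} \rVert_2 + \lVert d\theta \rVert_2}
$$

The numerator is the length of the gap between the two vectors, and the denominator is the sum of their lengths.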
Then let’s consider a small and a large example:
- For both large and small cases, compute ONLY the numerator of the gradient checking formula you quoted in your post. What are the results?
- For both cases, now compute the whole fraction. What are the results?
- Can you make sense of what the lecture has explained, based on the above results? Show us the results, and tell us how you now make sense of the lecture’s explanation.
Cheers,
Raymond
I guess the denominator is used as a scaling factor, so that we don’t get very high values as in the first example or very low values as in the second example.
Bingo! In the first example, we see 2.23, and in the second, we see 0.000283. The questions are:
- is 2.23 too large?
- is 0.000283 too large?
To say whether something is too large, we need a reference to compare it to. The denominator provides that reference. After scaling, we know both differences are roughly 0.3% (0.003) of their respective vectors’ lengths. With such a properly scaled ratio, we can develop some sense of “what is too large”. Now we can look at the slide again:
Notice the three numbers there? The lecture used them as some kind of a rule of thumb to tell us whether the difference is just great (small enough) and when to worry. Without the denominator, we can’t possibly set up a rule of thumb like that.
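If you want to reproduce the effect yourself, here is a minimal sketch in plain Python. The vectors are hypothetical (they are not the ones from the exercise above), and the helper names `l2_norm`, `numerator_only` and `relative_difference` are my own. Both pairs have the same relative error of 0.6%, only at different scales:

```python
import math

def l2_norm(v):
    """Euclidean (L2) length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def numerator_only(grad, gradapprox):
    """Length of the gap between the two gradient vectors (unscaled)."""
    return l2_norm([g - a for g, a in zip(grad, gradapprox)])

def relative_difference(grad, gradapprox):
    """Gap divided by the sum of the two vectors' lengths (scaled ratio)."""
    return numerator_only(grad, gradapprox) / (l2_norm(grad) + l2_norm(gradapprox))

# Hypothetical "large" and "small" gradients, each approximated with a 0.6% error.
grad_large = [100.0, 200.0, 300.0]
approx_large = [g * 1.006 for g in grad_large]
grad_small = [0.01, 0.02, 0.03]
approx_small = [g * 1.006 for g in grad_small]

num_large = numerator_only(grad_large, approx_large)
num_small = numerator_only(grad_small, approx_small)
ratio_large = relative_difference(grad_large, approx_large)
ratio_small = relative_difference(grad_small, approx_small)

print("numerator only:", num_large, num_small)    # differ by a factor of about 10,000
print("whole fraction:", ratio_large, ratio_small)  # essentially the same value
```

The two numerators are four orders of magnitude apart, yet the two ratios agree at about 0.003, so only the ratio can be compared against a fixed threshold.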
Thanks for staying with me on the exercise.
Cheers,
Raymond
Perfect, thanks !!!