I am debugging the assignment W1A3 Gradient_Checking. What I notice is that the formula for gradient checking is different in the course notes and in the assignment.
The notes/slides show the formula as ||dtheta_approx - dtheta||_2 / (||dtheta_approx||_2 - ||dtheta||_2)
or (in LaTeX)
$$ difference = \frac {\mid\mid grad - gradapprox \mid\mid_2}{\mid\mid grad \mid\mid_2 + \mid\mid gradapprox \mid\mid_2} \tag{2}$$
vs.
$$ difference = \frac {\mid\mid gradapprox - grad \mid\mid_2}{\mid\mid gradapprox \mid\mid_2 + \mid\mid grad \mid\mid_2} \tag{2}$$
However, the assignment describes the formula and reverses its elements… as
While I understand this is almost…a Euclidean distance formula, I think the results may differ for each implementation, but I am not truly sure. So I have two questions:
Does the order matter?
Does the use of the “subscript 2” in the formulas (both in the slides and the assignment) mean “raised to the power of 2”? If so, shouldn’t it be a superscript? And if not, why not?
The second formula with the subtraction in the denominator is incorrect. The course notes are not really maintained, but I will report that as a bug.
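One way to see why subtraction in the denominator cannot be right is to take the extreme case where the two gradients agree exactly. This case is my own illustration, not something from the notes. With subtraction the ratio is undefined, while with addition it is 0 for any nonzero gradient, which is what a perfect match should give:

$$ \frac {\mid\mid d\theta - d\theta \mid\mid_2}{\mid\mid d\theta \mid\mid_2 - \mid\mid d\theta \mid\mid_2} = \frac{0}{0} \qquad \text{vs.} \qquad \frac {\mid\mid d\theta - d\theta \mid\mid_2}{\mid\mid d\theta \mid\mid_2 + \mid\mid d\theta \mid\mid_2} = \frac{0}{2 \mid\mid d\theta \mid\mid_2} = 0 $$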
The first formula is correct and that is what is shown in the assignment. The order does not matter because addition is commutative and in the case of the numerator, we are taking the norm of the difference so it wouldn’t matter there if you reversed the operands.
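If it helps to see that numerically, here is a minimal sketch in plain Python. The helper names `l2_norm` and `subtract` and the sample values are made up for illustration and are not from the assignment. Negating every component of the difference leaves each square unchanged, so both orders give the same norm:

```python
import math

def l2_norm(v):
    """Euclidean (2-)norm of a flat list of numbers."""
    return math.sqrt(sum(x * x for x in v))

def subtract(a, b):
    """Element-wise a - b for two lists of equal length."""
    return [x - y for x, y in zip(a, b)]

# Made-up sample values, purely for illustration.
grad = [1.0, -2.0, 3.0]
gradapprox = [1.5, -2.5, 2.0]

forward = l2_norm(subtract(grad, gradapprox))   # ||grad - gradapprox||_2
backward = l2_norm(subtract(gradapprox, grad))  # ||gradapprox - grad||_2

print(forward, backward)
```

With NumPy arrays, `np.linalg.norm(a - b)` computes the same quantity.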
The subscript 2 means that the norms are the “2-norm” which is the Euclidean distance.
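Written out for a vector $x$ with components $x_1, \dots, x_n$ (symbols chosen here just for illustration), the 2-norm is:

$$ \mid\mid x \mid\mid_2 = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2} $$

So the subscript names which norm is being used; it is not an exponent. Squares do appear inside the definition, but the result is the square root of their sum.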
Could you clarify “2-norm” in this instance? How do I implement that, whatever that means, if in fact the subscript 2 acts as an operand…? Thanks. In the Euclidean case, as I understand it, it would mean a sum of squares, so are you saying it acts as “raised to the power of 2”?
So the denominator is correct and is not what you showed. The numerator has the subtraction reversed, but that doesn’t matter because we are taking the norm which is the multi-dimensional equivalent of taking an absolute value. So the order doesn’t matter.
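As for how to implement it, here is a rough sketch of the whole formula in plain Python, assuming the gradients have already been flattened into lists of numbers. The function names `l2_norm` and `relative_difference` and the sample values are hypothetical and are not taken from the assignment:

```python
import math

def l2_norm(v):
    """2-norm: square each component, sum, then take the square root."""
    return math.sqrt(sum(x * x for x in v))

def relative_difference(grad, gradapprox):
    """||grad - gradapprox||_2 / (||grad||_2 + ||gradapprox||_2)."""
    numerator = l2_norm([g - a for g, a in zip(grad, gradapprox)])
    denominator = l2_norm(grad) + l2_norm(gradapprox)  # sum of norms, not a difference
    return numerator / denominator

# Made-up sample values: norms are 5 and 10, and the difference has norm 5.
grad = [3.0, 4.0]
gradapprox = [6.0, 8.0]

difference = relative_difference(grad, gradapprox)
swapped = relative_difference(gradapprox, grad)
print(difference, swapped)
```

Swapping the two arguments gives the same value. With NumPy arrays, `np.linalg.norm` can replace the hand-written helper.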
Thanks Paul. So it seems the subscript “2” has a massively overloaded meaning when coupled with absolute value, as in ||x||_2. Thanks. I did not know that; perhaps I missed it in the lecture. Thank you so much for the help!