Inconsistent Gradient Checking algorithm question

I am debugging the assignment W1A3 Gradient_Checking. I noticed that the formula for gradient checking differs between the course notes and the assignment.

The notes/slide show the formula as

|| dtheta_approx - dtheta ||_2 / (|| dtheta_approx ||_2 - || dtheta ||_2)

or (in LaTeX)

$$ difference = \frac {\mid\mid grad - gradapprox \mid\mid_2}{\mid\mid grad \mid\mid_2 + \mid\mid gradapprox \mid\mid_2} \tag{2}$$

vs.

$$ difference = \frac {\mid\mid gradapprox - grad \mid\mid_2}{\mid\mid gradapprox \mid\mid_2 + \mid\mid grad \mid\mid_2} \tag{2}$$

However,

the assignment describes the formula with the elements reversed, as:

|| dtheta - dtheta_approx ||_2 / (|| dtheta ||_2 - || dtheta_approx ||_2)

While I understand this is almost a Euclidean distance formula, I think the results may differ between the two implementations, but I am not sure. So I have two questions:

  1. Does the order matter?
  2. Does the “subscript 2” in the formulas (both in the slides and the assignment) mean “raised to the power of 2”? If so, shouldn’t it be a superscript? And if not, why not?

Thanks for any guidance.

The forums support LaTeX as described on the DLS FAQ Thread. Using your formulas with the local syntax for LaTeX for clarity gives:

$$ difference = \frac {\mid\mid grad - gradapprox \mid\mid_2}{\mid\mid grad \mid\mid_2 + \mid\mid gradapprox \mid\mid_2} \tag{2}$$

Versus

$$ \frac{|| dtheta - dthetaapprox ||_2}{|| dtheta ||_2 - || dthetaapprox ||_2} $$

The second formula with the subtraction in the denominator is incorrect. The course notes are not really maintained, but I will report that as a bug.

The first formula is correct, and it is what is shown in the assignment. The order does not matter: the denominator is a sum, and addition is commutative. In the numerator we take the norm of the difference, so reversing the operands would not change the result there either.
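As a quick sanity check of both claims (using made-up example vectors, not the assignment's actual values), you can verify in numpy that the numerator is symmetric in its operands and the denominator is order-independent:

```python
import numpy as np

# Hypothetical example vectors standing in for grad and gradapprox.
grad = np.array([0.5, -1.2, 3.0])
gradapprox = np.array([0.501, -1.199, 3.002])

# The numerator is symmetric: ||a - b||_2 == ||b - a||_2,
# because the norm discards the sign of each component.
num1 = np.linalg.norm(grad - gradapprox)
num2 = np.linalg.norm(gradapprox - grad)
assert np.isclose(num1, num2)

# The denominator is a sum, and addition is commutative.
den1 = np.linalg.norm(grad) + np.linalg.norm(gradapprox)
den2 = np.linalg.norm(gradapprox) + np.linalg.norm(grad)
assert den1 == den2

difference = num1 / den1
print(difference)
```

Either operand order gives the same `difference`, which is why both forms you quoted (with the `+` denominator) are equivalent.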

The subscript 2 means that the norms are the “2-norm” which is the Euclidean distance.

Thanks for the quick response, Paul!

Could you clarify “2-norm” in this instance? How do I implement it, if in fact the subscript 2 acts as an operator? Thanks. In Euclidean terms, as I understand it, it would mean a sum of squares, so are you saying it acts as “raised to the power of 2”?

Here is the screen shot of that page from the course notes:


So the denominator is correct and is not what you showed. The numerator has the subtraction reversed, but that doesn’t matter because we are taking the norm which is the multi-dimensional equivalent of taking an absolute value. So the order doesn’t matter.

The 2-norm is the square root of the sum of the squares of the elements of the vector.

$$ ||v||_2 = \sqrt {\displaystyle \sum_{i = 1}^n v_i^2} $$

which is the Euclidean length of the vector.

In numpy, you implement that using np.linalg.norm.
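For example, a minimal sketch showing that `np.linalg.norm` with its default arguments computes exactly that square root of the sum of squares:

```python
import numpy as np

v = np.array([3.0, 4.0])

# For a 1-D array, np.linalg.norm defaults to the 2-norm.
norm_np = np.linalg.norm(v)

# Equivalent manual computation: square root of the sum of squares.
norm_manual = np.sqrt(np.sum(v ** 2))

print(norm_np)      # 5.0
print(norm_manual)  # 5.0
```

Note that the default only applies to vectors; for matrices, `np.linalg.norm` defaults to the Frobenius norm, so pass `ord=2` explicitly if you ever need the matrix 2-norm.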


Thanks Paul. So it seems the subscript “2” has a heavily overloaded meaning when coupled with what I took to be absolute value bars: ||x||_2. I did not know that; perhaps I missed it in the lecture. Thank you so much for the help!

The double-vertical bar isn’t an absolute value operator. It’s the “Norm” operator.

Thank you!

That solved my issue, I replaced abs() with np.linalg.norm(). Thanks so much for the help!


Professor Ng mentions that at about 3:44 in the lecture on Gradient Checking. Here’s a screenshot of the transcript at that point: