Why do parentheses make a difference when calculating the gradient?

While working on the notebook, I noticed that when calculating the gradient of the regularization term in the backward_propagation_with_regularization function, the following two expressions give different results:

  1. (lambd / m * W3)
  2. (lambd / (m*W3))
    (The first works fine, whereas the second fails the other test cases.)
    So I believe it might be about “rounding” or something similar, but if so, how can we know which version is correct?
    If it is not about that, what is the reason? Is it something about Python’s implementation? Can you help me with this?
    Thanks in advance!

It’s not about rounding. It’s about “order of operations”. The other common term for this is “operator precedence”. This is a pretty big deal to understand. Try the following experiment:

m = 5
x = 1 / 2 * m
y = 1 / (2 * m)
print(f"x = {x}")
print(f"y = {y}")

If you are expecting x and y to have the same value, then you have a fundamental misunderstanding about how the “order of operations” works in python (and just about every other language I’ve ever seen).

The point is that * and / have the same “precedence” in python. Because of that, the expression for x is evaluated from left to right: it simply evaluates the / operation with the operands 1 and 2, giving 0.5. Then it sees the next operator which is * with operands 0.5 and m, so you end up with 2.5. If that’s not what you were intending, then you need to add the parentheses to change the order of evaluation.
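To double-check the left-to-right grouping described above, here is a minimal sketch that compares the unparenthesized expression with an explicitly grouped one and then inspects how Python parses it. The names x_explicit, outer_op and inner_op are just illustrative and not part of the original experiment.

```python
import ast

m = 5

# * and / share one precedence level and group left to right,
# so the unparenthesized form is the same as (1 / 2) * m.
x = 1 / 2 * m
x_explicit = (1 / 2) * m
y = 1 / (2 * m)

print(x == x_explicit)  # True
print(x, y)             # 2.5 0.1

# The parse tree shows the same grouping: the outermost operation is the
# multiplication, and its left operand is the division.
tree = ast.parse("1 / 2 * m", mode="eval").body
outer_op = type(tree.op).__name__
inner_op = type(tree.left.op).__name__
print(outer_op, inner_op)  # Mult Div
```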

This is not a beginning programming class. If the above is a surprise to you, you might want to consider putting these courses on “pause” and taking an “intro to python” course.

Or to put this another way, the reason that your two python expressions are different is that 1) says this in standard math notation:

\displaystyle \frac {\lambda}{m} W^{[3]}

Whereas expression 2) means this:

\displaystyle \frac {\lambda}{m W^{[3]}}

So which python expression you use depends on what you are trying to say mathematically. Parentheses matter. :nerd_face:
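As a rough illustration of the two formulas, the sketch below applies both expressions element by element to a tiny made-up weight matrix. The values of lambd, m and W3 are assumptions chosen only to keep the arithmetic exact, and plain Python lists stand in for the NumPy arrays used in the notebook.

```python
# Hypothetical values, chosen so the floating-point results are exact.
lambd = 0.5
m = 4
W3 = [[2.0, -4.0]]  # stand-in for the real weight matrix

# Expression 1: (lambd / m) * W3, which scales every weight by lambd / m.
grad_v1 = [[lambd / m * w for w in row] for row in W3]

# Expression 2: lambd / (m * W3), which divides by every weight instead.
grad_v2 = [[lambd / (m * w) for w in row] for row in W3]

print(grad_v1)  # [[0.25, -0.5]]
print(grad_v2)  # [[0.0625, -0.03125]]
```

Note that the second form divides by the weights themselves, so in this plain Python version any weight equal to zero would raise a ZeroDivisionError, whereas the first form has no such problem.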

Oh, that was completely silly of me; I have barely slept for the last two days. I thought the formula was (1/(m*W)), so I assumed the problem was about rounding. Thanks for the very clear explanation anyway. Looking at my question now, it doesn’t even make sense to me :’)