[DLS1] Week 3 - exercise 6: error by 1e-8

We are working in IEEE 754 floating point here. By default, most things in numpy end up as 64 bit float values. That means there are only a finite number of values we can represent, as opposed to the abstract beauty of \mathbb{R}, so we can’t exactly represent the true mathematical value of even an expression as simple as \frac{1}{7}.
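
To make that concrete, here is a small check (my own illustration, not part of the exercise) that prints the exact value of the double that actually gets stored when you compute 1/7 in Python. The standard library Decimal class can show a float’s exact binary value as a decimal string:

from decimal import Decimal

x = 1. / 7.
# Decimal(x) prints the exact value of the stored 64 bit double, which is
# close to, but not equal to, the true repeating decimal 0.142857142857...
print(f"x as stored = {Decimal(x)}")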

Here’s a chunk of code for a little experiment with one case in which two obvious ways of writing the same computation in Python give different numeric results:

import numpy as np

np.random.seed(42)
A = np.random.rand(3,5)
print(f"A.dtype = {A.dtype}")
print(f"A = {A}")
m = 7.
B = 1./m * A   # multiply by the reciprocal of m
C = A/m        # divide by m directly
D = (B == C)   # elementwise test for exact equality
print(f"D = {D}")

Here’s what I get when I run that:

A.dtype = float64
A = [[0.37454012 0.95071431 0.73199394 0.59865848 0.15601864]
 [0.15599452 0.05808361 0.86617615 0.60111501 0.70807258]
 [0.02058449 0.96990985 0.83244264 0.21233911 0.18182497]]
D = [[False  True False False  True]
 [ True False False False  True]
 [False  True  True False False]]

So you can see that when we test for exact equality between B and C, quite a few of the elements end up being different. But it turns out that the differences are pretty small. Here’s a chunk of code to explore that:

diff = (B - C)
diffNorm = np.linalg.norm(diff)   # Frobenius norm of the elementwise differences
print(f"diffNorm = {diffNorm}")
print(f"diff = {diff}")

When I run that, here’s what we see:

diffNorm = 2.9082498092558215e-17
diff = [[-6.93889390e-18  0.00000000e+00 -1.38777878e-17 -1.38777878e-17
   0.00000000e+00]
 [ 0.00000000e+00 -1.73472348e-18 -1.38777878e-17 -1.38777878e-17
   0.00000000e+00]
 [-4.33680869e-19  0.00000000e+00  0.00000000e+00 -3.46944695e-18
  -3.46944695e-18]]

So you can see that all the non-zero differences are in the range 10^{-19} to 10^{-17}. Now this is in 64 bit floating point. The precision of 32 bit floating point is much lower, of course, so the errors would be larger if we ran this same experiment in the 32 bit case.
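
Here’s a minimal sketch of that 32 bit version, continuing from the A defined above (the float32 variable names are just my choices):

# Same experiment, but in 32 bit floating point.
A32 = A.astype(np.float32)
m32 = np.float32(7.)
B32 = np.float32(1.)/m32 * A32
C32 = A32/m32
# float32 carries roughly 7 significant decimal digits (versus roughly 16 for
# float64), so expect non-zero differences around 1e-8 instead of 1e-17.
print(f"float32 diffNorm = {np.linalg.norm(B32 - C32)}")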

There is a whole branch of mathematics called Numerical Analysis which studies (among other things) how to manage this type of approximation error. It turns out that some algorithms are “stable”, meaning that the errors caused by finite representations stay relatively small and don’t compound. But there are also algorithms which do not have that well-behaved property and are “unstable”: if you run them for many iterations, the errors can accumulate instead of cancelling out and actually cause problems. In most of the cases we will run into in ML/DL, the underlying packages like TF have been written so that the algorithms are numerically stable and we don’t have to worry about it.
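
Here’s a toy illustration of that kind of accumulation (my own example, not one of the course algorithms): adding 0.1 a million times rounds at every step, while a single multiplication rounds only once.

n = 1000000
total = 0.
for _ in range(n):
    total += 0.1          # each addition rounds to the nearest double
print(f"accumulated sum  = {total}")    # drifts noticeably from 100000
print(f"single multiply  = {0.1 * n}")  # much closer to 100000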

Given the above example, you may well wonder how the grader can check our answers if such a simple difference in the code can result in different values. It turns out that numpy provides comparison functions that let you specify a “close enough” tolerance. Have a look at the documentation for numpy isclose and numpy allclose.
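
For example, applied to the B and C from the earlier snippet:

# With the default tolerances (rtol=1e-05, atol=1e-08), differences on the
# order of 1e-17 count as equal, even though B == C said otherwise above.
print(f"np.allclose(B, C) = {np.allclose(B, C)}")
print(f"np.isclose(B, C) = {np.isclose(B, C)}")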
