In the week 3 assignment, during backpropagation, I am getting the correct result, but a few elements in my DW1 matrix are off by only 0.00000001 so the test is coming out as a fail.

How can I fix this?

Interesting. We are doing 64-bit floating point here, so rounding errors are typically a lot smaller than that, on the order of 10^{-15}. I don’t remember ever seeing a case like this, where the mismatch is that small but still far larger than rounding error. It’s probably worth looking at your code to understand what is going on. We don’t want to do that in a public way, though, so I’ll send you a DM about how to do that.

Note that the test which is failing isn’t the one for which those results and expected output are displayed.

The failing test is run in the background as “backward_propagation_test()”. Its results are not displayed; you only get the assert if a value is outside the default ‘allclose’ limit.
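To make the ‘allclose’ point concrete, here is a minimal sketch of how NumPy's default tolerances behave. The array values are made up for illustration and are not taken from the assignment; the defaults rtol=1e-05 and atol=1e-08 are the ones documented for np.allclose.

```python
import numpy as np

# Hypothetical "expected" gradient values, for illustration only
expected = np.array([0.00301023, -0.00747267])

# A difference of 1e-8 versus a difference of 1e-3
off_by_tiny = expected + 1e-8
off_by_more = expected + 1e-3

# np.allclose checks |a - b| <= atol + rtol * |b| elementwise,
# with defaults rtol=1e-05 and atol=1e-08
tiny_passes = np.allclose(off_by_tiny, expected)
more_passes = np.allclose(off_by_more, expected)

print(tiny_passes, more_passes)
```

With these made-up numbers, the 1e-8 difference stays inside the default tolerance and the 1e-3 difference does not, which fits the observation that the failing assert comes from the hidden test rather than from the values printed in the notebook.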

That’s a good point that I didn’t notice. The hidden test case is quite different, including different dimensions, but it has assertions to check the types and shapes as well as the values of the gradients. Your code only fails the value test, so it’s not some kind of “hard-coding” issue.

But then I remembered that there is a “gotcha” in how those test cases are constructed: in the “hidden” test case, all the cache values are just randomly generated, so there is no relationship between the Z1 value and the A1 value. But what should be true is that:

A1 = tanh(Z1)

The reason this matters is that part of the computation is to compute g'(Z1) and the derivative of tanh is:

g(z) = tanh(z)

g'(z) = 1 - tanh^2(z)

But if you use that formula on the hidden test case, your answer does not match the expected values, because of the fact I mentioned before: there, Z1 is not related to A1. They assumed you would write the formula this way:

a = tanh(z)

g'(z) = 1 - a^2

And that’s what the grader uses as the “correct” output, so we have to write the derivative using A1 and not tanh(Z1).

The “visible” test case is better in that the Z1 and A1 values have the correct mathematical relationship. So you pass the visible test with the tanh(Z1) formulation, but not the other hidden test. And as mentioned above, the grader will also not accept the tanh(Z1) formulation.
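Here is a small sketch of why the two formulations agree on one test and disagree on the other. The shapes, the random seed, and the helper names gprime_from_z and gprime_from_a are assumptions for illustration; this is not the assignment's test code.

```python
import numpy as np

rng = np.random.default_rng(0)
Z1 = rng.standard_normal((4, 3))

# "Visible" style cache: A1 really is tanh(Z1)
A1_consistent = np.tanh(Z1)

# "Hidden" style cache: A1 is generated independently of Z1
A1_random = rng.standard_normal((4, 3))

def gprime_from_z(Z):
    # g'(z) = 1 - tanh^2(z), recomputed from Z1
    return 1 - np.tanh(Z) ** 2

def gprime_from_a(A):
    # g'(z) = 1 - a^2, using the cached activation A1
    return 1 - A ** 2

# Same result when the cache is mathematically consistent
consistent_match = np.allclose(gprime_from_z(Z1), gprime_from_a(A1_consistent))

# Different result when A1 has nothing to do with Z1
random_match = np.allclose(gprime_from_z(Z1), gprime_from_a(A1_random))

print(consistent_match, random_match)
```

So the tanh(Z1) version is not mathematically wrong. It just disagrees with expected values that were computed from an A1 that was never the output of tanh.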

I claim this is a bug in the assignment and I filed it on GitHub quite a while ago, but it has not yet been fixed. Fortunately there is an easy solution from the student’s P.O.V., once this issue has been pointed out.

In fact, there’s an easy way to see that the A1 value is bogus: just print it in your code. If it were the output of tanh, then all the values would satisfy -1 < a < 1, right? But that is clearly not true for the hidden test case.
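If you would rather check than eyeball the printout, something like the following works. The function name looks_like_tanh_output and the sample values are hypothetical and only meant as a sketch.

```python
import numpy as np

def looks_like_tanh_output(A):
    # tanh maps every real input into the open interval (-1, 1)
    return bool(np.all(np.abs(A) < 1))

# A genuine tanh output
Z = np.array([[0.5, -2.0], [3.0, 0.0]])
genuine = looks_like_tanh_output(np.tanh(Z))

# Made-up values like those a random generator might produce
A1_bogus = np.array([[1.62, -0.61], [-0.52, -1.07]])
bogus = looks_like_tanh_output(A1_bogus)

print(genuine, bogus)
```

Dropping a call like this on A1 inside your backward propagation function, temporarily, shows right away whether the cache you were handed is consistent.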