Residual Networks (assignment): test accuracy does not match

I get “All tests passed!” for every graded function in Week 2, Assignment 1. However, when I run the code:

preds = model.evaluate(X_test, Y_test)
print ("Loss = " + str(preds[0]))
print ("Test Accuracy = " + str(preds[1]))

I get 57% accuracy as opposed to >80%. I went through all of my implementations and everything seems fine. Do you have a tip for finding where I may have made the mistake? Since all of the graded functions give me “All tests passed!”, it is extremely hard to find where the problem is.

Hey @zav, if you believe your implementations are correct, restart your assignment and then try again to see if you get >80%.

Let me know how it goes.

There may be a typo somewhere in one of the functions, but it seems hard to detect. Do you know why I get “All tests passed!” even though there is clearly a mistake somewhere? Doing everything from scratch again sounds a little painful :(. But if there is no other way, I may have to do it.

There could be a typo, yes. But I remember the results of this assignment being somewhat random; it depends on the training run.

Once I got around 63%, another time it was in the 70s, and another time in the 90s.
So, like I said, if you believe your implementation is correct, you should first submit to see if you pass. If you don’t, then yes, some part of the implementation is incorrect.

But if you pass, then you can try running the assignment a few times (restarting the kernel every time) to check your results.
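
If you want to see how much of that run-to-run variation comes from random initialization and data shuffling, one option (not part of the assignment code; the seed values here are arbitrary) is to pin the random seeds at the top of the notebook before building and training the model:

import random
import numpy as np
import tensorflow as tf

# Fix every source of randomness before building/training the model.
# With the same seeds, repeated runs should give (nearly) the same test
# accuracy; without them, results can easily swing between runs.
random.seed(0)
np.random.seed(0)
tf.random.set_seed(0)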

Oh, you were right! I did pass the assignment with 100% even though the numbers were not matching. It took me a few hours to check my implementations many times :). I wish I had submitted earlier. Oh well, thanks for the tips!

One thing I have found that both the “unit tests” in the notebook and the grader miss is omitting the “training” parameter on the BatchNorm calls in convolutional_block. Leaving that out will change the results. The template code gives you an example of what the BatchNorm call should look like; please compare yours with that instance and see if it matches.
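
For reference, here is a minimal sketch of the pattern (the helper name conv_bn_relu and its arguments are illustrative, not the exact assignment code):

from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation

def conv_bn_relu(X, filters, kernel_size, s, training=True):
    # One conv -> batchnorm -> relu stage, as used inside the ResNet blocks.
    X = Conv2D(filters, kernel_size, strides=(s, s), padding='same')(X)
    # The key detail: forward the training flag to BatchNorm.
    # If training=training is omitted, the layer behaves differently during
    # training vs. inference than the assignment expects, which shifts the
    # final test accuracy even though the unit tests still pass.
    X = BatchNormalization(axis=3)(X, training=training)
    X = Activation('relu')(X)
    return X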

In other words, just because the grader gives you 100% does not mean your code is really correct. @Mubsi, I’ve already filed a GitIssue about this case.

Awesome. Thanks, @paulinpaloalto! Really appreciate it.

I had the same issue: test accuracy ~57%. Then I closed out and ran everything again (with no changes to the code)… and test accuracy was ~82%. I’d love to know why that happens.

Every time you close and reopen one of the notebooks, you need to run all the cells in order. If you run them out of order, the runtime state of the notebook is unpredictable. It’s misleading because you still see the outputs of previous runs when you reopen a notebook, but the “runtime” state in memory no longer exists and must be recreated.

The other possible effect here is that just changing the code in a cell doesn’t really do anything until you re-execute that cell. Running the test cell that calls your modified function will simply execute the previous version of the function. In other words, “what you see is not what you get”. You can demonstrate this effect pretty easily yourself. Try this: type a syntax error into one of your functions that used to work, then execute the test cell that calls that function. Lo and behold, it still works. Now press “Shift-Enter” to execute the actual function cell with the error and then run the test cell again.
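
As a tiny illustration of that effect (using a made-up square function, nothing from the assignment):

# Cell 1: define a function and run the cell once.
def square(x):
    return x * x

# Cell 2: call the function.
print(square(3))   # prints 9

# Now edit Cell 1 so it returns x * x * x, but do NOT re-run Cell 1.
# Re-running Cell 2 still prints 9, because the kernel still holds the old
# definition in memory. Only after re-executing Cell 1 (Shift-Enter) does
# Cell 2 print 27.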
