Where?
- Exercise 3 - conv_forward function implementation
What happened?
I found a bug in my code where the first test passed but the second test failed: it happens if I initialize the output Z with np.random.randn() instead of np.zeros(). I found this by checking my code line by line against the comment instructions within the function. Once I switched to np.zeros(), both tests pass. However, I don’t actually understand why it matters. As far as I can tell, every value of Z gets overwritten by the output of conv_single_step(), so the initial values shouldn’t matter, only the shape of Z. How can this cause the second test to fail? I couldn’t figure it out myself, so I decided to post here to see if anyone has run into a similar problem.
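For reference, the two versions of the initialization I’m comparing look roughly like this (the shapes here are just placeholders, not the real assignment dimensions):

```python
import numpy as np

# Placeholder output dimensions, just for illustration.
m, n_H, n_W, n_C = 2, 3, 4, 8

# What I had originally (first test passes, second fails):
# Z = np.random.randn(m, n_H, n_W, n_C)

# What makes both tests pass:
Z = np.zeros((m, n_H, n_W, n_C))
```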
Again, it’s not blocking me now, but I’m just so curious to know: is there anything happening under the hood?
Any help and discussion are highly appreciated!!
This is a really interesting question! I tried your experiment and you’re right: one of the later tests, the one with stride = 1 and pad = 6, fails. At first I was very puzzled: you’re right that all of the initial values of Z get overwritten, assuming your implementation is correct, and they aren’t used as input to any other calculations.
But the reason is that calling np.random.randn in your code changes the random sequence that produces the other input values in the test. Every number you generate advances the sequence. The tests all set a particular random “seed” value at the beginning, so that the results will be consistent. But if you add any random calls, or make the random calls in a different order, you’ll get a different answer, because the random values you draw end up assigned to different places than they normally would be.
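Here’s a small standalone experiment that shows the effect (just plain NumPy, nothing assignment-specific):

```python
import numpy as np

np.random.seed(1)
a = np.random.randn(3)    # the values the test expects to use as inputs

np.random.seed(1)
_ = np.random.randn(2)    # an extra call, like initializing Z randomly
b = np.random.randn(3)    # these are now *later* values in the sequence

print(np.allclose(a, b))  # False: the extra call shifted everything
```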
Here’s a way to convince yourself: try initializing Z with np.ones instead of np.zeros. Both tests still pass, which proves that the actual initial values of Z don’t matter: unlike np.random.randn, np.ones doesn’t disturb the behavior of the random calls in the test code.
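Here’s a tiny demonstration of that point (again just standalone NumPy):

```python
import numpy as np

np.random.seed(1)
Z = np.ones((2, 3))      # deterministic init: consumes no random numbers
x = np.random.randn(3)

np.random.seed(1)
Z = np.zeros((2, 3))     # also deterministic
y = np.random.randn(3)

print(np.allclose(x, y))  # True: the random sequence is untouched
```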
Thanks very much for this question! It’s the most enjoyable question to research that I’ve seen in a long time. A really nice puzzle and I learned something fun in the process. 
Thank you so much @paulinpaloalto for the fast response and detailed explanation! Let me dry-run an example to make sure I understand your point correctly:

After the seed is set at the beginning of the test, the random generator’s output is “locked”, say to “1, 2, 3, 4, 5, ...”, and the test is written assuming the inputs are “1, 2, 3”. But if I add an extra call that generates 2 numbers in the middle, the actual computation inputs become “3, 4, 5”, which breaks the test.
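I tried to confirm this with a tiny experiment (the seed and counts are arbitrary, just to mimic the scenario):

```python
import numpy as np

np.random.seed(0)
locked = np.random.randn(5)       # the "locked" sequence: values 1..5

np.random.seed(0)
_ = np.random.randn(2)            # an extra call eats values 1 and 2
shifted = np.random.randn(3)      # the test now sees values 3, 4, 5

print(np.allclose(shifted, locked[2:]))  # True: shifted by exactly two
```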
Is my above understanding correct?
Thanks again for your help!
Yes, your description of why using np.random.randn in your code causes the tests to fail is right. Of course, they could have written the tests in such a way that your way of writing the code wouldn’t matter. If you want to go to the next level of detail, have a look at the test code in public_tests.py and you’ll see why it only failed in the final test of about 5 tests: for most of the tests, they set the random seed, initialize the random input values, and then call the “function under test”. But in that final test case, they call the function, then initialize some more new parameters and run another test case without resetting the seed. That’s the case that fails, because the random inputs are now different from what the test expects: your code “used up” part of the random sequence.
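To make that pattern concrete, here’s a simplified sketch of the structure. To be clear, this is not the actual public_tests.py code: the names, shapes, and hyperparameter values are made up for illustration, and I’m assuming the usual conv_forward(A_prev, W, b, hparameters) signature from the assignment:

```python
import numpy as np

def run_conv_forward_tests(conv_forward):
    # Typical test case: seed, build random inputs, call the function.
    np.random.seed(1)
    A_prev = np.random.randn(2, 5, 7, 4)
    W = np.random.randn(3, 3, 4, 8)
    b = np.random.randn(1, 1, 1, 8)
    Z, _ = conv_forward(A_prev, W, b, {"pad": 1, "stride": 2})
    # ... compare Z against precomputed expected values ...

    # Final case: more random inputs are drawn WITHOUT reseeding.
    # If conv_forward itself consumed random numbers above (e.g. by
    # initializing Z with np.random.randn), these inputs are no longer
    # the ones the expected values were computed from, so this check fails.
    A_prev2 = np.random.randn(2, 5, 7, 4)
    Z2, _ = conv_forward(A_prev2, W, b, {"pad": 6, "stride": 1})
    # ... compare Z2 against precomputed expected values ...
```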