All of my code cells through "4 - Building Your First ResNet Model (50 layers)" have run successfully and, where relevant, all tests have passed. However, when I run the cell that tests the code with:
from outputs import ResNet50_summary
model = ResNet50(input_shape = (64, 64, 3), classes = 6)
comparator(summary(model), ResNet50_summary)
I’m getting the error:
Test failed
Expected value
['Conv2D', (None, 8, 8, 512), 66048, 'valid', 'linear', 'GlorotUniform']
does not match the input value:
['Conv2D', (None, 8, 8, 512), 131584, 'valid', 'linear', 'GlorotUniform']
I’ve gone as far as scrolling through the output of model.summary(), and I found a line that appears to have the same number of parameters as the input value mentioned in the error:
conv2d_38 (Conv2D) (None, 8, 8, 512) 131584 ['activation_30[0][0]']
but even with this I haven’t been able to correlate the error info to where the issue might be in my code.
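What the two parameter counts do seem to tell me, using the standard Conv2D parameter-count formula (this arithmetic is my own reading, not part of the grader's output), is what kind of layer sits at that row of the summary:

# Conv2D params = kernel_h * kernel_w * in_channels * filters + filters (bias term)
expected = 1 * 1 * 128 * 512 + 512   # 66048:  a 1x1 conv fed by a 128-channel tensor
actual   = 1 * 1 * 256 * 512 + 512   # 131584: a 1x1 conv fed by a 256-channel tensor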
This suggests that the issue is either in, or immediately prior to, the shortcut layer in the convolutional block. Using pseudocode, that layer is defined as:
X_shortcut = Conv2D with F3 filters, a kernel size of 1, a stride of s, and 'valid' padding, applied to X_shortcut (which, up until then, is just a copy of the input X, per the part of the code that is given). In the following line, X_shortcut is set to the batch normalization of X_shortcut.
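Spelled out in the Keras functional API, I believe that shortcut path looks roughly like this (my own sketch following the assignment's conventions, not code copied from the notebook; the F3 and s names and the seeded GlorotUniform initializer are assumptions on my part):

from tensorflow.keras.layers import Conv2D, BatchNormalization
from tensorflow.keras.initializers import GlorotUniform

def shortcut_path(X_shortcut, F3, s):
    # 1x1 convolution with F3 filters and stride s applied to the copied input,
    # followed by batch normalization over the channel axis
    X_shortcut = Conv2D(filters=F3, kernel_size=1, strides=(s, s), padding='valid',
                        kernel_initializer=GlorotUniform(seed=0))(X_shortcut)
    X_shortcut = BatchNormalization(axis=3)(X_shortcut)
    return X_shortcut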
I found the issue. In the second-to-last step of convolutional_block, where X and X_shortcut are added together, I had the two operands to Add() in the "wrong" order, i.e. instead of
X = Add()([X, X_shortcut])
I had
X = Add()([X_shortcut, X])
I don’t see anything in the documentation of Add() that suggests the order should matter, and the model compiles and trains with the operands in either order. So perhaps this is just something that the test code needs to be able to handle?
Until then, hopefully this info will be of use to someone.
That was good detective work figuring out the issue. Congrats!
Addition is commutative, of course, so the result is the same. But swapping the operands changes the ordering of the compute graph, and that test does a direct comparison of the summaries of the two compute graphs.
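To see the effect concretely, here is a small standalone sketch (not the assignment code; the layer names and shapes are made up) that builds the same tiny two-branch block twice, differing only in the operand order given to Add(), so you can compare the two summaries side by side:

from tensorflow.keras.layers import Input, Conv2D, Add, Activation
from tensorflow.keras.models import Model

def tiny_residual_branch(swap_add_operands=False):
    # Two branches that merge in an Add(), like the end of a conv block
    X_input = Input(shape=(8, 8, 16))
    main = Conv2D(32, 3, padding='same', name='main_path')(X_input)
    shortcut = Conv2D(32, 1, padding='valid', name='shortcut_path')(X_input)
    operands = [shortcut, main] if swap_add_operands else [main, shortcut]
    X = Add()(operands)
    X = Activation('relu')(X)
    return Model(inputs=X_input, outputs=X)

# Same math either way, but the row order of the two Conv2D layers in
# summary() can flip, because Keras lists layers in graph-traversal order
# and that traversal follows the operand order given to Add().
tiny_residual_branch(swap_add_operands=False).summary()
tiny_residual_branch(swap_add_operands=True).summary()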
But notice that they just gave you that Add() code in the template for convolutional_block for exactly this reason. You must have modified the given code in order to hit that problem.
If you change the order of the operands in the Add in identity_block, it doesn’t matter and all the tests still pass. But if you change the order in convolutional_block, then the test fails for ResNet50 in the way you showed in your original post:
Test failed
Expected value
['Conv2D', (None, 8, 8, 512), 66048, 'valid', 'linear', 'GlorotUniform']
does not match the input value:
['Conv2D', (None, 8, 8, 512), 131584, 'valid', 'linear', 'GlorotUniform']