I have checked my code several times but have not been able to find the error.
I would greatly appreciate any help. Thanks in advance.
Below is the error encountered:
Below is my model code:
{moderator edit: code removed}
Please do not post your code on the forum. That’s not allowed by the course community standard.
I have edited your post.
Check if your identity_block() and convolutional_block() functions are passing their tests.
Interesting. Yes, I think your theory that the problem is in the identity_block is right. But it’s worse than that:
If you write the step where you add the “shortcut” to X by just using the overloaded “+” operator like this:
X = X + X_shortcut
You pass the tests for the identity_block function, but then it explodes and catches fire when you get to the resnet50 section and actually look at the output of the “summary” of the model. This is a case in which there are (at least) two equivalent ways to express the given layer operation, but they don’t look the same in the summary. If you have a more detailed look at the instructions for the identity_block section, they specifically tell you to use the Add() function that they imported for you, even if they don’t really explain why.
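For anyone who wants to see this outside the assignment, here is a minimal sketch of the effect. It is not the assignment code: the tiny Dense layer is just a hypothetical stand-in for the main path of a residual block, and the variable names are made up for illustration. It builds the same computation twice, once with Add() and once with “+”, then compares the numeric outputs and the layer lists that the summary is built from. Exactly how the “+” version shows up depends on your TensorFlow/Keras version, so treat the printed names as version-specific.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, Add
from tensorflow.keras.models import Model

# Toy stand-in for a residual block: one Dense layer as the "main path"
# (hypothetical, chosen only to keep the example small).
inputs = Input(shape=(4,))
X_shortcut = inputs
X = Dense(4)(inputs)

# Two ways to add the shortcut back in. Both reuse the same Dense layer,
# so they share weights and should produce the same numbers.
out_layer = Add()([X, X_shortcut])   # explicit Keras layer
out_plus = X + X_shortcut            # overloaded "+" operator

model_layer = Model(inputs, out_layer)
model_plus = Model(inputs, out_plus)

# The numeric outputs agree ...
sample = np.random.rand(2, 4).astype("float32")
same_output = np.allclose(np.asarray(model_layer(sample)),
                          np.asarray(model_plus(sample)))

# ... but the recorded layers do not: only the first model has an Add layer.
layer_types_layer = [type(layer).__name__ for layer in model_layer.layers]
layer_types_plus = [type(layer).__name__ for layer in model_plus.layers]

print("outputs match:", same_output)
print("with Add():", layer_types_layer)
print('with "+":  ', layer_types_plus)
```

A test that compares the model summary layer by layer will see these two models as different, even though a test that only checks the numeric output will not.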
I tried this experiment and the resnet50 test fails with exactly the same error shown in the OP.
Now the question is whether we would consider this a bug in the unit tests for identity_block.
The same issue exists in convolutional_block.
Notice that because they are using this very “literal minded” test methodology in resnet50, you will also fail if you do the addition with the operands in the wrong order, even though addition is commutative. (Update: this turns out to be a problem only in the conv block, but they gave you that code as part of the template exactly because of this issue.)
I see, thank you so much for the detailed explanation! Although this causes issues when the code is checked for this assignment, is it viable to use the “+” operator, or do I have to use the Add() method?
Thank you for the reminder, I will take note of that in the future.
From the fact that the tests for the identity_block function passed when you used “+”, I would conclude that your method is equivalent in terms of the results. Note that the test there is actually checking the numeric output produced by the function. So “+” is apparently an equivalent way to write the code, at least in this instance, but it shows up differently if you print the “summary” of the resulting model.