Course 4 Week 2 Exercise 3: ResNet-50

I am getting this error in identity_block:


even when I use the correct stride and padding values. Could you please let me know where the problem might be?
Thanks

This whole assignment is an exercise in proofreading. There are lots of details in the instructions and you need to get them all right. The best I can suggest is to go over everything again very carefully. Read the instructions and compare to what your code actually does. Make sure to read the instructions in the comments as well. One common mistake is not to get the BatchNorm layers correct. They give you one example in the template code and yours should look the same other than perhaps the input variable name being passed.

Thank you. The kernel_size for the second step should be f. I did not understand that at first, but I have corrected it and now it passes.


Great! That’s what I meant by this assignment being an excruciating exercise in proofreading. :laughing: Glad to hear you figured out the issue.

My cell stops in the middle of running the epochs and says “Dead kernel”.


What should I do?

I haven’t personally run into that issue, so I can only suggest some theories. One would be that you’ve got some extra print statements in your code that cause the memory image of the notebook to get huge as you run lots of iterations. If that’s not it, maybe it would help to see a screenshot of what the cell that is running the training looks like when the failure happens.

I get an error in identity_block at the Add() step:

I checked and re-checked my code several times especially the Add() and BatchNormalization() functions, and it all seems correct to me.
I restarted the kernel a couple of times.
Please help.

(a screenshot of my error is given in the above post)

my identity block looks like this:

{moderator edit - solution code removed}


please give me a clue - it’s failing in the Add() line and says that it cannot broadcast with the shapes of X and X_shortcut

The parameters you have specified for the second Conv2D layer do not agree with the instructions. This whole assignment is an excruciating exercise in proof-reading. Please compare your code to the instructions with a little more care.
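To see why a wrong Conv2D argument shows up as a broadcast error at Add(), it can help to work out the output size by hand. The sketch below is plain Python, not the assignment code: it uses the usual output-size rules for "same" and "valid" padding, and the values of n and f are made up for illustration.

```python
import math

def conv_output_size(n, kernel_size, stride, padding):
    """Output size along one spatial dimension of a 2D convolution."""
    if padding == "same":
        return math.ceil(n / stride)
    if padding == "valid":
        return (n - kernel_size) // stride + 1
    raise ValueError(f"unknown padding: {padding!r}")

n = 4  # hypothetical height/width of the block input
f = 2  # hypothetical value of f

# The shortcut path keeps the input size, so the main path must too.
shortcut = n

# f x f kernel, stride 1, "same" padding: size is preserved.
ok = conv_output_size(n, kernel_size=f, stride=1, padding="same")

# Same kernel but "valid" padding: the output shrinks.
wrong_padding = conv_output_size(n, kernel_size=f, stride=1, padding="valid")

# f passed as the stride instead of the kernel size: the output shrinks more.
wrong_stride = conv_output_size(n, kernel_size=1, stride=f, padding="valid")

print(shortcut, ok, wrong_padding, wrong_stride)
```

Whenever the main-path size differs from the shortcut size, the two tensors cannot be added, which is one way to end up with the error described above.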

@paulinpaloalto Thank you.
I fixed the issue and submitted the assignment successfully.
I now remember kernel and filter refer to the same thing.

Right! And neither of them is equivalent to stride. :nerd_face:

Hi,
I had some trouble making this code work, and finally got through it by reading other people’s mistakes until I found mine (the second step’s filter size).
My personal feeling after this exercise is mainly frustration.

I agree with you about the assignment “being an excruciating exercise in proofreading.”

However, my question is: why is that?
From your point of view, what does a student learn from such an exercise?

I would like to suggest an idea, I hope it will be an improvement.

Instead of having us proofread again and again, maybe explain more about debugging tools for TF models and make the exercise about debugging one or more broken models, perhaps with “rules” on where to look so the problem can be found in a more systematic way, similar to the way Andrew explained matrix sizes in numpy. At the end of the day I gained debugging knowledge for numpy, but the real work is done in TF, and in TF all I did was proofread.

Comment: I had a different error in the model, but it takes time to understand what difference the test finds. Since the model is big, I tried to look for a way to name each operation, e.g. as in the TensorFlow documentation for tf.identity. I tried:

tf.identity(    ..., name="stage51")

but I got a syntax error. Funnily enough, after removing the name and rerunning the cell, the original problem I was trying to fix disappeared, so I actually didn’t learn anything, since I don’t know what the original mistake was.

Thanks for the careful feedback. Sorry that you found this assignment frustrating. Well, at one level all programming requires careful proofreading, right? A single character wrong can ruin everything. I recently saw a case where a student subtracted instead of adding in one expression and was distracted by other aspects of the code thinking the error was something complicated and missed the difference between - and +. Many many hours were wasted.

In terms of what the pedagogical value of this assignment is, this is by far the most complex network architecture that we have seen so far and it’s also the most complex problem we’ve tried to solve with TF so far. So it’s useful at that level to get experience applying TF to a complicated problem and seeing what kinds of combinations of layers are possible and useful.

As to the problem with the “name” argument, note that they do a kind of shortcut for the grading and testing things in the notebook: they basically do a fancy “string compare” between the “summary” of your model versus the expected value. So if you do anything that changes in any way the output of summary, even if it doesn’t change the semantics of the layer, it fails the test.

The other thing about errors mysteriously disappearing is typically a case of not being fully aware of the current state of the notebook. E.g. if you just type changes to the code in a function cell and then call that function again, it does nothing: it just runs the old code. You have to actually click “Shift-Enter” or do “Cell → Run All Above” to get the cell to be recompiled and the new code activated. You can easily demonstrate this effect to yourself now that you know about it. Try some experiments and watch what happens.
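The stale-definition effect can also be imitated outside a notebook. The sketch below uses exec on source strings as a stand-in for running cells. The names cell_v1, cell_v2 and kernel are made up, and a real Jupyter kernel is of course more involved than a dictionary, so treat this as an illustration only.

```python
# A dictionary standing in for the notebook kernel's namespace.
kernel = {}

# "Run" the cell that defines the function (this version has a bug).
cell_v1 = "def combine(a, b):\n    return a - b\n"
exec(cell_v1, kernel)

# Edit the cell text to fix the bug, but do not run it yet.
cell_v2 = "def combine(a, b):\n    return a + b\n"

# Calling the function still uses the old definition.
before = kernel["combine"](2, 3)

# Only after the edited cell is actually run does the fix take effect.
exec(cell_v2, kernel)
after = kernel["combine"](2, 3)

print(before, after)
```

The edited text in cell_v2 changes nothing until it is executed, which matches what happens when a function cell is edited but not re-run.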

I think I didn’t explain myself well.
I totally agree that “it’s useful at that level to get experience applying TF to a complicated problem and seeing what kinds of combinations of layers are possible and useful.”

My suggestion was that, instead of getting this to work by proofreading all the details, the student’s assignment would be to fix a broken solution containing all the mistakes mentioned above in this thread.
An alternative that is more in line with the current assignment is to add a dedicated public test for each of the known mistakes from the thread (e.g. the filter size of each layer) and to add a hint to the text of the test failure.
If this makes sense, I might do this as an exercise and post it here later on (currently I am in a rush to gain some AI knowledge, so it will probably be delayed).

I know that from a pedagogical point of view it is debatable how easy learning should be, since painful lessons stick better in the long run, but the whole attitude of this course is to ease learning, except for this exercise, which puts the difficulty in proofreading.

Those are all good points. I have often thought that perhaps they go a bit too far in terms of trying to make the programming part of things not scary: they almost spoon feed you the solutions, so a lot of times it feels more like “taking dictation” instead of real programming. With that thought in mind, you’re right that this assignment might seem like a violation of the expectations they have previously set. Although now that I think a little more about it, the two assignments in C4 W1 are non-trivial and also require some careful thought and coding. Getting the stride logic correct in conv_forward and pool_forward takes some careful attention and conceptual understanding.

I think the suggestion of more complete test cases sounds like the best way to go. But given the level of detail here and the number of layers and parameters, it’s going to be a serious amount of work to do it at the level you describe. Maybe the compromise would be to do something like the overall “comparator” test that they have, but do it on each layer individually. That way they could keep the work within bounds and still be able to give a more specific error like: there’s a mistake in the parameters for the second Conv layer. That at least would give the student a pretty specific place to look for the error. I’ll file an enhancement request with the course staff and see what they think about that.
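To make the per-layer idea concrete, here is a rough sketch of what such a check might look like. It is not the course’s actual comparator: the function first_mismatch, the tuple layout, and the layer values are all made up for illustration, and it compares plain Python descriptions rather than a real model summary.

```python
def first_mismatch(actual, expected):
    """Describe the first layer that differs, or return None if all match."""
    for i, (got, want) in enumerate(zip(actual, expected)):
        if got != want:
            return f"layer {i}: expected {want}, got {got}"
    if len(actual) != len(expected):
        return f"expected {len(expected)} layers, got {len(actual)}"
    return None

# Hypothetical layer descriptions: (layer type, parameters).
expected = [
    ("Conv2D", {"kernel_size": (1, 1), "strides": (1, 1), "padding": "valid"}),
    ("BatchNormalization", {"axis": 3}),
    ("Conv2D", {"kernel_size": (3, 3), "strides": (1, 1), "padding": "same"}),
]

# A submission where the second Conv2D uses the wrong kernel size.
submitted = [
    ("Conv2D", {"kernel_size": (1, 1), "strides": (1, 1), "padding": "valid"}),
    ("BatchNormalization", {"axis": 3}),
    ("Conv2D", {"kernel_size": (1, 1), "strides": (1, 1), "padding": "same"}),
]

message = first_mismatch(submitted, expected)
print(message)
```

Reporting only the first differing layer keeps the check small while still pointing the student at a specific line to re-read.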

Thanks again for the careful and thoughtful feedback!

Don’t forget to mention that I am willing to do it for them in the medium term … :slight_smile:

On Tue, 14 Feb 2023, 18:24, Paul Mielke via DeepLearning.AI <notifications@dlai.discoursemail.com> wrote:

Thanks! I will mention that or put a link to this thread in the enhancement request.

I think the issue with the need for proofreading this assignment is that it’s maybe too complicated of an example for effective teaching.

Maybe a smaller example would be good for teaching, then the notebook could demonstrate the complete complicated system as an example of extending the technique.