W2 Residual_Networks programming

Hi, like many others before me, I have been struggling with the programming assignment.
Honestly, I don’t understand why they would ask us to write code without first teaching us the syntax or the algorithm for shortcuts. I have been guessing and searching the community for answers for two hours now.

Anyway, if I assign the third component of the path to X_shortcut I get this error message:

ValueError: Inputs have incompatible shapes. Received shapes (4, 4, 4) and (4, 4, 3)

That makes sense because f is 4, so I’m using four filters in step 1, but then I don’t understand how to change the channels back to 3.

If I don’t assign X_shortcut, I get a different validation error:

Check the padding and strides

I think the code is OK: I have the right strides, training is passed as a parameter, kernel_size is set to f, padding is set to “same” only for the second component, and the batchnorm axis is set to 3 for all components.

So what might be wrong?

The best way to see how the shortcuts work is to examine the diagrams. Pictures are more expressive than words in some cases and this seems to be one such case.

Not sure what you mean by that. The shortcut value is one of the two inputs to the “Add” step that they describe in the comments, at least if you are talking about the identity block. It’s a little more complicated in the conv block case.

This whole assignment is an exercise in carefully following detailed instructions. So the first step is to take a couple of calming breaths and then read over the instructions again carefully, including the embedded comments in the template code.

It might also help to know whether you are talking about the identity_block or the convolutional_block functions, but the mechanics are pretty similar.
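
To make the “Add” step concrete, here is a rough shape trace in plain Python. It is not the assignment solution and it does not call TensorFlow. It assumes the usual bottleneck layout (a 1x1 “valid” conv, then an f x f “same” conv, then a 1x1 “valid” conv), and all function names here are made up for illustration. The point it tries to show is that the two inputs to the Add must have identical shapes, so the number of filters in the third component has to match the channel count of the shortcut.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of one spatial dimension after a convolution."""
    if padding == "same":
        return -(-size // stride)             # ceil(size / stride)
    if padding == "valid":
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding!r}")


def main_path_shape(input_shape, f, filters, s=1):
    """Trace (height, width, channels) through the three conv components.

    Assumed layout (hypothetical helper, not the course template):
      component 1: 1x1 kernel, stride s, 'valid' padding, F1 filters
      component 2: f x f kernel, stride 1, 'same' padding, F2 filters
      component 3: 1x1 kernel, stride 1, 'valid' padding, F3 filters
    Batch norm and ReLU do not change shapes, so they are skipped here.
    """
    h, w, _ = input_shape
    _f1, _f2, f3 = filters
    h, w = conv_out(h, 1, s, "valid"), conv_out(w, 1, s, "valid")
    h, w = conv_out(h, f, 1, "same"), conv_out(w, f, 1, "same")
    h, w = conv_out(h, 1, 1, "valid"), conv_out(w, 1, 1, "valid")
    # Only the last component's filter count survives as the channel count.
    return (h, w, f3)


def check_add(main, shortcut):
    """Mimic the shape check that an element-wise Add performs."""
    if main != shortcut:
        raise ValueError(
            f"Inputs have incompatible shapes. Received shapes {main} and {shortcut}"
        )
    return main


def identity_block_shape(input_shape, f, filters):
    # Identity block: the shortcut is the untouched input.
    main = main_path_shape(input_shape, f, filters, s=1)
    return check_add(main, tuple(input_shape))


def convolutional_block_shape(input_shape, f, filters, s=2):
    # Conv block: the shortcut gets its own 1x1 conv with stride s and
    # F3 filters, so it is reshaped to match the main path.
    h, w, _ = input_shape
    shortcut = (conv_out(h, 1, s, "valid"), conv_out(w, 1, s, "valid"), filters[2])
    main = main_path_shape(input_shape, f, filters, s=s)
    return check_add(main, shortcut)


print(identity_block_shape((4, 4, 3), f=2, filters=[4, 4, 3]))            # (4, 4, 3)
print(convolutional_block_shape((4, 4, 3), f=2, filters=[2, 4, 6], s=2))  # (2, 2, 6)
```

With a 3-channel input, calling `identity_block_shape((4, 4, 3), f=2, filters=[4, 4, 4])` raises a ValueError naming the shapes (4, 4, 4) and (4, 4, 3), which is the same kind of mismatch as in the error message quoted above. One possible cause, and this is only a guess without seeing the code, is that the third component is being built with a filter count other than F3.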