Problem with convolutional_block in Week 2 Assignment (W2A1)

I’m currently working on the convolutional_block function for the ResNet implementation in W2A1. My code passes the shape test, and even the inference mode output matches the expected tensor exactly. However, it fails the final assertion during training mode (training=True).

This is the error message I get:

AssertionError Traceback (most recent call last)
Input In [147], in <cell line: 3>()
1 ### you cannot edit this cell
----> 3 public_tests.convolutional_block_test(convolutional_block)

File /tf/W2A1/public_tests.py:115, in convolutional_block_test(target)
112 tf.keras.backend.set_learning_phase(True)
114 C = target(X, f = 2, filters = [2, 4, 6])
---> 115 assert np.allclose(C.numpy(), convolutional_block_output2), "Wrong values when training=True."
117 print('\033[92mAll tests passed!')

AssertionError: Wrong values when training=True.

Here’s the shape and tensor I’m getting in training mode:

tf.Tensor(
[[[0. 2.4476275 1.8830043 0.21259236 1.922003 0. ]
[0. 2.1546977 1.6514317 0. 1.7889941 0. ]]

[[0. 1.8540058 1.3404746 0. 1.0688392 0. ]
[0. 1.6571904 1.1809819 0. 0.9483792 0. ]]], shape=(2, 2, 6), dtype=float32)

  • Is there any subtle detail I might be missing in the shortcut or batch normalization?
  • Could the problem be related to in-place modifications or the order of layers?
  • Has anyone else encountered this mismatch during training mode?

The most common mistake in this function is the handling of the shortcut path. The verbal instructions are not very detailed on this point, but the picture is worth the proverbial thousand words. Please compare your code for that path to the diagram, and see the sketch below.
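For orientation, here is a minimal sketch of the generic ResNet convolutional block structure. This is an illustration, not the assignment's solution: the function name is made up, and the exact filter counts, kernel sizes, strides, paddings, and initializers are placeholder assumptions you must take from the assignment's own specs and diagram.

```python
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, Add

def convolutional_block_sketch(X, f, filters, s=2, training=True):
    """Illustrative ResNet convolutional block -- NOT the assignment's exact specs."""
    F1, F2, F3 = filters

    # Save the ORIGINAL input before the main path starts transforming X.
    X_shortcut = X

    # Main path: three conv/BN stages (details here are placeholder choices).
    X = Conv2D(F1, kernel_size=1, strides=s, padding='valid')(X)
    X = BatchNormalization(axis=3)(X, training=training)
    X = Activation('relu')(X)

    X = Conv2D(F2, kernel_size=f, strides=1, padding='same')(X)
    X = BatchNormalization(axis=3)(X, training=training)
    X = Activation('relu')(X)

    X = Conv2D(F3, kernel_size=1, strides=1, padding='valid')(X)
    X = BatchNormalization(axis=3)(X, training=training)

    # Shortcut path: conv + BN applied to the SAVED input, not the current X.
    X_shortcut = Conv2D(F3, kernel_size=1, strides=s, padding='valid')(X_shortcut)
    X_shortcut = BatchNormalization(axis=3)(X_shortcut, training=training)

    # Merge the two paths, then apply the final ReLU.
    X = Add()([X, X_shortcut])
    X = Activation('relu')(X)
    return X

# Quick shape check with toy data (random values, unrelated to the test's):
out = convolutional_block_sketch(tf.random.normal((2, 4, 4, 3)), f=2, filters=[2, 4, 6])
print(out.shape)  # (2, 2, 2, 6)
```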

I’m facing a similar issue. I’ve gone through the description many times but still don’t understand what’s failing. Please let me know if you find a solution.

There are a lot of details to get right here: the exact specs on all the convolutions, and then the point I made above about using the correct input for the shortcut path. The common mistake is to use the current X value, but we specifically saved the original X value at the beginning of the function. Please compare how your code works to the diagram showing how the shortcut path is intended to work.
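One more note on why a bug can produce matching output in inference mode yet fail with training=True: BatchNormalization normalizes with the current batch's statistics during training but with its moving averages during inference, so the two modes genuinely compute different numbers. A quick standalone illustration with toy data (nothing here comes from the assignment):

```python
import tensorflow as tf

bn = tf.keras.layers.BatchNormalization(axis=3)
x = tf.random.normal((2, 2, 2, 6))

out_train = bn(x, training=True)   # uses this batch's mean/variance
out_infer = bn(x, training=False)  # uses the (freshly initialized) moving averages

# The outputs differ, which is why the test exercises both modes separately.
print(float(tf.reduce_max(tf.abs(out_train - out_infer))))
```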

If that’s not enough to help, then it’s time to look at your code. We can’t do that here on a public thread, but please check your DMs for a message from me about how to proceed with that.