Week 2 - Exercise 2 - convolutional_block - ValueError: Operands could not be broadcast together with shapes (1, 1, 2) (4, 4, 3)

In Week 2’s programming assignment for Convolutional Neural Networks, I’m running into this error and am not quite sure where it’s coming from. I’ve verified the filter sizes, strides, and padding, and I’ve also checked the activations and the batch normalization configs. Any ideas?

JB

At first reading:

For BatchNormalization, I think the “training” parameter is a boolean, so “training=training” seems wrong.

Also, if it’s a parameter of the BatchNormalization layer, then it should be inside those parentheses, not part of (X).

Also it would help if you said which Week of Course 4 this appears in.

Hi Tom,

Thanks for your reply! I’ve updated the title to show that this is in Week 2’s programming assignment.

I copied the BatchNormalization line from their example here:

##### MAIN PATH #####

# First component of main path glorot_uniform(seed=0)
X = Conv2D(filters = F1, kernel_size = 1, strides = (s, s), padding='valid', kernel_initializer = initializer(seed=0))(X)
X = BatchNormalization(axis = 3)(X, training=training)
X = Activation('relu')(X)

I also checked the instructions and it looked the same there as well. Any ideas?

This is the entire error:


ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      8 X = np.concatenate((X1, X2, X3), axis = 0).astype(np.float32)
      9
---> 10 A = convolutional_block(X, f = 2, filters = [2, 4, 6], training=False)
     11
     12 assert type(A) == EagerTensor, "Use only tensorflow and keras functions"

<ipython-input> in convolutional_block(X, f, filters, s, training, initializer)
     55
     56     # Final step: Add shortcut value to main path (Use this order [X, X_shortcut]), and pass it through a RELU activation
---> 57     X = Add()([X, X_shortcut])
     58     X = Activation('relu')(X)
     59

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
    980         with ops.name_scope_v2(name_scope):
    981           if not self.built:
--> 982             self._maybe_build(inputs)
    983
    984           with ops.enable_auto_cast_variables(self._compute_dtype_object):

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py in _maybe_build(self, inputs)
   2641         # operations.
   2642         with tf_utils.maybe_init_scope(self):
--> 2643           self.build(input_shapes)  # pylint:disable=not-callable
   2644         # We must set also ensure that the layer is marked as built, and the build
   2645         # shape is stored since user defined build functions may not be calling

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/tf_utils.py in wrapper(instance, input_shape)
    321     if input_shape is not None:
    322       input_shape = convert_shapes(input_shape, to_tuples=True)
--> 323     output_shape = fn(instance, input_shape)
    324     # Return shapes from fn as TensorShapes.
    325     if output_shape is not None:

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/merge.py in build(self, input_shape)
    110       else:
    111         shape = input_shape[i][1:]
--> 112       output_shape = self._compute_elemwise_op_output_shape(output_shape, shape)
    113     # If the inputs have different ranks, we have to reshape them
    114     # to make them broadcastable.

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/merge.py in _compute_elemwise_op_output_shape(self, shape1, shape2)
     83           raise ValueError(
     84               'Operands could not be broadcast '
---> 85               'together with shapes ' + str(shape1) + ' ' + str(shape2))
     86         output_shape.append(i)
     87       return tuple(output_shape)

ValueError: Operands could not be broadcast together with shapes (1, 1, 6) (4, 4, 3)
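For anyone puzzled by the error message itself: the Keras Add layer uses NumPy-style broadcasting, where two trailing dimensions are compatible only if they are equal or one of them is 1. Here the channel counts 6 and 3 fail that test. A minimal NumPy-only sketch, using the shapes from the traceback:

```python
import numpy as np

# Shapes from the traceback: main-path output vs. raw input.
a = np.zeros((1, 1, 6))
b = np.zeros((4, 4, 3))

# Trailing dims 6 and 3 are neither equal nor 1, so broadcasting fails.
try:
    _ = a + b
    broadcast_ok = True
except ValueError as err:
    broadcast_ok = False
    print(err)

# With matching channel counts, broadcasting over the size-1 spatial
# dims would succeed:
c = np.zeros((1, 1, 6)) + np.zeros((4, 4, 6))
print(c.shape)  # (4, 4, 6)
```

So the error is really telling you that the two branches arriving at Add() have incompatible shapes.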

This is the “conv block” of Residual Networks. Take a closer look at your “shortcut” block code. It is wrong: you’re using X as the input in that step. They aren’t very explicit in the instructions about that, but that is not what was intended. Maybe the best way to “get the picture” is to look at the diagrams for how the shortcut layers are intended to work. Note that they save a value called X_shortcut early in the code. :nerd_face:
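To make the shape mismatch concrete, here is a hedged sketch that only tracks the output shape of each Conv2D in the block (no TensorFlow needed). It assumes the thread’s test case — input (4, 4, 3), f = 2, filters = [2, 4, 6], s = 2 — and the `conv2d_shape` helper is hypothetical, not part of the assignment:

```python
# Hypothetical helper: compute the (h, w, c) output shape of a Conv2D.
def conv2d_shape(shape, filters, kernel, stride, padding="valid"):
    h, w, _ = shape
    if padding == "valid":
        h = (h - kernel) // stride + 1
        w = (w - kernel) // stride + 1
    else:  # "same": ceil(dim / stride)
        h = -(-h // stride)
        w = -(-w // stride)
    return (h, w, filters)

X_input = (4, 4, 3)
X_shortcut = X_input  # saved before the main path, as in the notebook

# Main path: 1x1 conv (stride s) -> fxf 'same' conv -> 1x1 conv
X = conv2d_shape(X_input, filters=2, kernel=1, stride=2)            # (2, 2, 2)
X = conv2d_shape(X, filters=4, kernel=2, stride=1, padding="same")  # (2, 2, 4)
X = conv2d_shape(X, filters=6, kernel=1, stride=1)                  # (2, 2, 6)

# Buggy shortcut: applying the 1x1/s conv to X instead of X_shortcut
# leaves the raw input on one branch -> exactly the traceback shapes.
buggy_add = (conv2d_shape(X, 6, 1, 2), X_shortcut)  # ((1, 1, 6), (4, 4, 3))

# Correct shortcut: transform X_shortcut itself so both branches match.
X_shortcut = conv2d_shape(X_shortcut, filters=6, kernel=1, stride=2)
assert X == X_shortcut  # (2, 2, 6) on both branches, so Add() works
```

In the notebook itself, the fix is just what Paul describes: pass X_shortcut, not X, into the shortcut branch’s Conv2D (and its BatchNormalization).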


Paul,

You are the man! Thank you so much, that was it!

I’ll remove my code in the post now. :slight_smile:

JB


Glad to hear that was the solution! Thanks for removing the source code!

Hey Paul,

I’m currently running into the same problem as the original poster. I’m not sure whether the assignment has changed since then, but I’m getting the same error.

[source code removed]

I’ll remove the source code once the problem is fixed, but I’m not sure why I can’t just pass X, since we added X and X_shortcut in the previous line.

*This is the ResNet programming assignment

Edit: I was able to fix this issue with the help of this post: ResNet C4 W2 A1- Ex2 convolutional_block. Error: AssertionError: Wrong values when training=False