C3W3_Assignment Question duplicates Siamese fcn error for TextVectorization

The course update to NLP with Sequence Models removed a week from the online course materials. I'd finished the first three weeks and was working on the fourth week's materials when the fourth week was renamed Week 3, as seen in the topic title. My question is about the assignment "Question Duplicates" in the Siamese Networks week.

In the GRADED FUNCTION Siamese, the previously defined TextVectorization instance is to be used as "text_vectorizer".

I followed the in-function guidance to define input1 and input2, which stated "Be mindful of the data type and size". Per the TF documentation for tf.keras.Input (TensorFlow v2.14.0), I set the shape to the depth of the model, and the dtype for the Input followed the instructions for tf.keras.layers.Input appearing just before the GRADED FUNCTION.

However, I don't understand why the following error message is generated, or what to do to resolve it. Setting the shape to various other "reasonable" values generated worse error messages. I feel like I want to use the tf.squeeze function somewhere in Siamese, but no location I tried resolved the error.

I did pass input1 into branch1 and input2 into branch2; is it not that simple to call branch() with the respective inputs? Is that the source of the error?

From the instructions for tf.keras.layers.Input that appear before the GRADED FUNCTION: "Remember to set correctly the dimension and type of the input, which are batches of questions. For this, keep in mind that each question is a single string."

I mistakenly read the TF documentation to mean that the shape was the depth of the model…oh well :slight_smile:

Hi. I’m encountering the same issue, and unfortunately I can’t seem to solve it (or find the solution here)… To be clear: what should be the input shape? shape=(d_feature, )? Something else? Thanks

Keep in mind that each question is a single string, or look at the expected output shape for input_1.

Thank you @Ankoor_Bhagat , however this still doesn’t work for me and I can’t figure out why. I followed closely C3W3_Siamese_Network.ipynb, and I really need some help with this. Can anyone suggest a correction to the following?

GRADED FUNCTION: Siamese

def Siamese(text_vectorizer, vocab_size=36224, d_feature=128):
    """Returns a Siamese model.

    Args:
        text_vectorizer (TextVectorization): TextVectorization instance, already adapted to your training data.
        vocab_size (int, optional): Length of the vocabulary. Defaults to 36224.
        d_feature (int, optional): Depth of the model. Defaults to 128.

    Returns:
        tf.keras.Model: A Siamese model.
    """
    ### START CODE HERE ###

    branch = tf.keras.models.Sequential(name='sequential')
    # Add the text_vectorizer layer. This is the text_vectorizer you instantiated and adapted before
    branch.add(text_vectorizer)
    # Add the Embedding layer. Remember to call it 'embedding' using the parameter `name`
    branch.add(tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=d_feature, name='embedding'))  # +1?
    # Add the LSTM layer. Recall from W2 that you want the LSTM layer to return sequences, not just one value.
    # Remember to call it 'LSTM' using the parameter `name`
    branch.add(tf.keras.layers.LSTM(units=d_feature, return_sequences=True, name='LSTM'))
    # Add the GlobalAveragePooling1D layer. Remember to call it 'mean' using the parameter `name`
    branch.add(tf.keras.layers.GlobalAveragePooling1D(name='mean'))
    # Add the normalizing layer using the Lambda function. Remember to call it 'out' using the parameter `name`
    branch.add(tf.keras.layers.Lambda(lambda x: tf.math.l2_normalize(x), name='out'))

    # Define both inputs. Remember to call them 'input_1' and 'input_2' using the `name` parameter.
    # Be mindful of the data type and size
    input1 = tf.keras.layers.Input(shape=(None, ), dtype=tf.int64, name='input_1')
    input2 = tf.keras.layers.Input(shape=(None, ), dtype=tf.int64, name='input_2')
    # Define the output of each branch of your Siamese network. Remember that both branches have the same coefficients,
    # but they each receive different inputs.
    branch1 = branch(input1)
    branch2 = branch(input2)
    # Define the Concatenate layer. You should concatenate columns; you can fix this using the `axis` parameter.
    # This layer is applied over the outputs of each branch of the Siamese network
    conc = tf.keras.layers.Concatenate(axis=1, name='conc_1_2')((branch1, branch2))

    ### END CODE HERE ###

    return tf.keras.models.Model(inputs=[input1, input2], outputs=conc, name="SiameseModel")

Keep in mind that each question is a single string, so the shape should be:
shape=(1,)
Also, dtype=tf.int64 should be dtype=tf.string. Otherwise, you will have a problem when you run Exercise 03.
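As a minimal illustration of why this is the right shape and dtype (assuming TensorFlow 2.x): the branch begins with the TextVectorization layer, so the model's inputs are batches of raw strings, one string per example. A toy vectorizer, unrelated to the assignment's adapted one, makes the shapes visible:

```python
import tensorflow as tf

# Toy vectorizer: pads/truncates every question to 6 token ids
vectorizer = tf.keras.layers.TextVectorization(output_sequence_length=6)
vectorizer.adapt(["how are you", "what is your name"])

# One string per example -> shape=(1,), dtype=tf.string
inp = tf.keras.layers.Input(shape=(1,), dtype=tf.string, name="input_1")
model = tf.keras.Model(inp, vectorizer(inp))

# A batch of one question comes in as a (1, 1) string tensor
tokens = model.predict(tf.constant([["how are you"]]), verbose=0)
print(tokens.shape)  # one row of 6 token ids for the single input string
```

With an int64 input instead, the vectorizer never receives the strings it was adapted to, which is why the dtype matters as much as the shape.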


Thank you so much @ejint !! Now it works like a charm… One more related note: in order to pass w3_unittest.test_Siamese(Siamese) I also had to change round brackets to square ones, unlike C3W3_Siamese_Network.ipynb. The correct syntax seems to be: conc = tf.keras.layers.Concatenate(axis=1, name='conc_1_2')([branch1, branch2])
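For what it's worth, the list-vs-tuple point can be checked on plain eager tensors; this is a toy sketch outside the assignment, just to show the call convention:

```python
import tensorflow as tf

a = tf.constant([[1.0, 2.0]])
b = tf.constant([[3.0, 4.0]])

# Keras merge layers are called on a list of input tensors
conc = tf.keras.layers.Concatenate(axis=1, name="conc_1_2")([a, b])
print(conc.shape)  # two 2-wide rows joined column-wise into one 4-wide row
```

Concatenating on axis=1 joins the branch outputs column-wise, which is what the grader's expected output shape reflects.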

Hope this helps.

Hi, may I ask what the correct shape for tf.keras.layers.Input() was? I put (1,) in mine and it still doesn't work, showing the following error:

File /usr/local/lib/python3.8/dist-packages/keras/src/engine/input_spec.py:235, in assert_input_compatibility(input_spec, inputs, layer_name)
    233     ndim = shape.rank
    234     if ndim != spec.ndim:
--> 235         raise ValueError(
    236             f'Input {input_index} of layer "{layer_name}" '
    237             "is incompatible with the layer: "
    238             f"expected ndim={spec.ndim}, found ndim={ndim}. "
    239             f"Full shape received: {tuple(shape)}"
    240         )
    241     if spec.max_ndim is not None:
    242         ndim = x.shape.rank

ValueError: Input 0 of layer "mean" is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 128)
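That particular error comes from the shape contract of the pooling layer rather than from the Input line itself. A minimal sketch (assuming TensorFlow 2.x, with toy batch and sequence sizes) of why the 'mean' layer wants a 3-D input:

```python
import tensorflow as tf

# GlobalAveragePooling1D averages over the time axis, so it needs a
# 3-D (batch, time, features) input. An LSTM only produces that when
# return_sequences=True; with return_sequences=False it emits a 2-D
# (batch, features) tensor and 'mean' raises the ndim=3 vs ndim=2 error.
seq = tf.random.normal((2, 5, 128))                       # (batch, time, features)
lstm_out = tf.keras.layers.LSTM(128, return_sequences=True)(seq)
pooled = tf.keras.layers.GlobalAveragePooling1D(name="mean")(lstm_out)
print(pooled.shape)  # time axis averaged away, features kept
```

So a "(None, 128)" arriving at 'mean' suggests the layer before it is already collapsing the sequence dimension.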

Hi. In my code, following this thread, I wrote this and it worked:
input1 = tf.keras.layers.Input(shape=(1, ), dtype=tf.string, name='input_1')
I hope this helps.

Hi, just sharing a quick comment: all of the above helped me fix my own bug. One other hint might be to double-check that you are using GlobalAveragePooling1D and not the pooling layer from the earlier lab this week.
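To close out the thread, here is a self-contained sketch pulling the fixes above together (shape=(1,), dtype=tf.string, and a list for Concatenate). The function name, toy vocabulary, and depth here are illustrative, not the assignment's graded code or defaults:

```python
import tensorflow as tf

def siamese_sketch(text_vectorizer, vocab_size=100, d_feature=16):
    # Shared branch: vectorize -> embed -> LSTM (sequences) -> mean -> L2 norm
    branch = tf.keras.models.Sequential(name="sequential")
    branch.add(text_vectorizer)
    branch.add(tf.keras.layers.Embedding(vocab_size, d_feature, name="embedding"))
    branch.add(tf.keras.layers.LSTM(d_feature, return_sequences=True, name="LSTM"))
    branch.add(tf.keras.layers.GlobalAveragePooling1D(name="mean"))
    branch.add(tf.keras.layers.Lambda(lambda x: tf.math.l2_normalize(x), name="out"))

    # Inputs are batches of raw questions: one string per example
    input1 = tf.keras.layers.Input(shape=(1,), dtype=tf.string, name="input_1")
    input2 = tf.keras.layers.Input(shape=(1,), dtype=tf.string, name="input_2")

    # Same branch (shared weights), different inputs; concatenate columns as a list
    conc = tf.keras.layers.Concatenate(axis=1, name="conc_1_2")(
        [branch(input1), branch(input2)]
    )
    return tf.keras.models.Model(inputs=[input1, input2], outputs=conc)

vec = tf.keras.layers.TextVectorization(max_tokens=100, output_sequence_length=8)
vec.adapt(["how old are you", "what is your age"])
model = siamese_sketch(vec)
print(model.output_shape)  # two d_feature-wide branch outputs side by side
```

The output width is twice d_feature because the two branch embeddings sit side by side after the column-wise concatenation.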