C3W3 Exercise 3: errors when training the Siamese model, need help

I have a problem with the "train the Siamese model" part of C3W3 Exercise 3. When I train the model, I get the errors shown below.

Training starts but stops during epoch 1 of 2:
Epoch 1/2

TypeError Traceback (most recent call last)
Cell In[24], line 9
3 train_generator = train_dataset.shuffle(len(train_Q1),
4 seed=7,
5 reshuffle_each_iteration=True).batch(batch_size=batch_size)
6 val_generator = val_dataset.shuffle(len(val_Q1),
7 seed=7,
8 reshuffle_each_iteration=True).batch(batch_size=batch_size)
----> 9 model = train_model(Siamese, TripletLoss,text_vectorization,
10 train_generator,
11 val_generator,
12 train_steps=train_steps,)

Cell In[23], line 39, in train_model(Siamese, TripletLoss, text_vectorizer, train_dataset, val_dataset, d_feature, lr, train_steps)
26 model.compile(loss=TripletLossFn,
27 optimizer = tf.keras.optimizers.Adam(lr)
28 )
30 # Train the model
31 # model.fit(train_dataset = train_dataset,
32 # val_dataset = val_dataset
(…)
36 # images, labels = tuple(zip(*dataset))
37 # x, y = tuple(zip(*train_dataset))
---> 39 model.fit(train_dataset,
40 epochs = train_steps,
41 validation_data = val_dataset,
42 )
45 ### END CODE HERE ###
47 return model

File /usr/local/lib/python3.8/dist-packages/keras/src/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
67 filtered_tb = _process_traceback_frames(e.__traceback__)
68 # To get the full stack trace, call:
69 # tf.debugging.disable_traceback_filtering()
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb

File /tmp/autograph_generated_file0ez1bfo5.py:15, in outer_factory.<locals>.inner_factory.<locals>.tf__train_function(iterator)
13 try:
14 do_return = True
---> 15 retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
16 except:
17 do_return = False

File /tmp/autograph_generated_file_ih_9j6f.py:25, in outer_factory.<locals>.inner_factory.<locals>.tf__TripletLossFn(v1, v2, margin)
23 raise
24 return fscope_1.ret(retval__1, do_return_1)
---> 25 sim = ag__.converted_call(ag__.ld(tf).linalg.matmul, (ag__.converted_call(ag__.ld(norm), (ag__.ld(v2),), None, fscope), ag__.converted_call(ag__.ld(norm), (ag__.ld(v1),), None, fscope)), dict(transpose_b=True), fscope)
26 batch_size = ag__.ld(v1).shape[0]
27 sim_ap = ag__.converted_call(ag__.ld(tf).linalg.diag_part, (ag__.ld(sim),), None, fscope)

File /tmp/autograph_generated_file_ih_9j6f.py:20, in outer_factory.<locals>.inner_factory.<locals>.tf__TripletLossFn.<locals>.norm(x)
18 try:
19 do_return_1 = True
---> 20 retval__1 = ag__.converted_call(ag__.ld(tf).math.l2_normalize, (ag__.ld(x),), dict(axis=1), fscope_1)
21 except:
22 do_return_1 = False

TypeError: in user code:

File "/usr/local/lib/python3.8/dist-packages/keras/src/engine/training.py", line 1338, in train_function  *
    return step_function(self, iterator)
File "/tmp/ipykernel_14/3543266221.py", line 36, in norm  *
    return tf.math.l2_normalize(x, axis=1) # use tensorflow built in normalization

TypeError: Expected int32 passed to parameter 'y' of op 'Maximum', got 1e-12 of type 'float' instead. Error: Expected int32, but got 1e-12 of type 'float'.

Last time I suspected that the triplet loss function was the cause.
Following the hints from the previous lab, I calculated the similarity from v1 and v2 as:

def norm(x):
    return tf.math.l2_normalize(x, axis=1) # use tensorflow built in normalization

# v1 = tf.cast(v1, tf.float32)
# v2 = tf.cast(v1, tf.float32)

sim = tf.linalg.matmul(norm(v2), norm(v1),  transpose_b=True)

Any hints about this error? I need to train the model correctly for the code that follows to work.
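For reference, here is a minimal sketch of what the similarity line computes when both inputs are already float32. The tensor values are made up for illustration and are not from the assignment; only `norm`, `v1`, `v2` and `sim` mirror the names quoted above.

```python
import tensorflow as tf


def norm(x):
    # Scale each row to unit L2 length.
    return tf.math.l2_normalize(x, axis=1)


# Made-up float32 batches standing in for the two model outputs.
v1 = tf.constant([[3.0, 4.0], [1.0, 0.0]], dtype=tf.float32)
v2 = tf.constant([[3.0, 4.0], [0.0, 1.0]], dtype=tf.float32)

# Entry [i, j] is the cosine similarity between row i of v2 and row j of v1.
sim = tf.linalg.matmul(norm(v2), norm(v1), transpose_b=True)

# The diagonal holds the similarity of each matching pair.
sim_ap = tf.linalg.diag_part(sim)
print(sim.dtype, sim_ap.numpy())
```

With float inputs this runs without a type error, and the first pair (identical rows) has similarity close to 1 while the second pair (orthogonal rows) has similarity close to 0. The same call fails as in the traceback when the inputs are int32.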

Follow up: As suggested last time, I removed the norm(x) function and replaced the code
in the triplet loss function with:
sim = tf.linalg.matmul(v2, v1, transpose_b=True)

But the result for Exercise 3, training the Siamese model, stays the same: training stops in the 1st epoch.

The new error for the unnormalized version is:

TypeError: Input 'b' of 'MatMul' Op has type int32 that does not match type float32 of argument 'a'.

I think it is a simple variable type problem: the types of the two arguments a and b passed to tf.linalg.matmul(a, b, transpose_b=True) do not match. Any hints?

Hi @Zhiyi_Li2

It says:

  • 'a' is of type float32
  • 'b' is of type int32

TensorFlow cannot do the 'MatMul' Op because of that (different types).

In other words, the error tells you to find out why they do not match. It's hard to suggest anything specific, because almost everywhere in the code the types are handled for you (unless you changed the code in places that you were not supposed to change). In general, the advice would be to track down why they are of different types in the first place (are you doing a conversion anywhere?).
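One way to start tracking this down is to print the dtype of each tensor just before the failing call. The sketch below uses made-up stand-in tensors and a hypothetical helper named `dtype_report`; neither is part of the assignment code.

```python
import tensorflow as tf


def dtype_report(**tensors):
    """Return a {name: dtype name} mapping for the given tensors."""
    return {name: tf.convert_to_tensor(t).dtype.name for name, t in tensors.items()}


# Hypothetical stand-ins for the two arguments that reach tf.linalg.matmul.
a = tf.constant([[0.6, 0.8]], dtype=tf.float32)
b = tf.constant([[1, 2]], dtype=tf.int32)

report = dtype_report(a=a, b=b)
print(report)  # {'a': 'float32', 'b': 'int32'}
```

Once you know which tensor is int32, work backwards through the layers that produced it to see where the float output was lost.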

Cheers

  • tf.keras.layers.Lambda: Layer with no weights that applies the function f, which should be specified using a lambda syntax. You will use this layer to apply normalization with the function
    • tf.math.l2_normalize(x)

You have to use a Lambda layer here, as per the instructions! There is no need to cast v1 and v2 anywhere in the assignment.

Maybe it's better to reset the assignment and do it from scratch at this point. Of course, you can keep your solutions so far and use them again!
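As a rough illustration of the instruction quoted above (not the assignment solution), a Lambda layer wrapping `tf.math.l2_normalize` can be sketched like this. The input values and the `axis=1` choice are assumptions made for the example.

```python
import tensorflow as tf

# A weightless layer that L2-normalizes each row of its input.
normalize = tf.keras.layers.Lambda(lambda x: tf.math.l2_normalize(x, axis=1))

# Made-up float32 batch with two rows.
x = tf.constant([[3.0, 4.0], [0.0, 2.0]], dtype=tf.float32)
y = normalize(x)

# After normalization every row should have length close to 1.
row_norms = tf.norm(y, axis=1)
print(y.dtype, row_norms.numpy())
```

Because the layer receives float input and returns float output, vectors normalized this way can go straight into the loss without any cast.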


The training problem is solved, with help from arvyzukai. Thanks.
