NLP C3 W2 Assignment model training failure

Hi,

I am passing all the unit tests but getting a failure when training the model. Here is the provided code for model.fit():

tf.keras.utils.set_random_seed(33) ## Setting again a random seed to ensure reproducibility

BATCH_SIZE = 64

model.fit(train_dataset.batch(BATCH_SIZE),
          validation_data = val_dataset.batch(BATCH_SIZE),
          shuffle=True,
          epochs = 2)

The error:

Epoch 1/2
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[114], line 5
      1 tf.keras.utils.set_random_seed(33) ## Setting again a random seed to ensure reproducibility
      3 BATCH_SIZE = 64
----> 5 model.fit(train_dataset.batch(BATCH_SIZE),
      6           validation_data = val_dataset.batch(BATCH_SIZE),
      7           shuffle=True,
      8           epochs = 2)

File /usr/local/lib/python3.8/dist-packages/keras/src/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File /tmp/__autograph_generated_fileuzq8fx1k.py:15, in outer_factory.<locals>.inner_factory.<locals>.tf__train_function(iterator)
     13 try:
     14     do_return = True
---> 15     retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
     16 except:
     17     do_return = False

ValueError: in user code:

    File "/usr/local/lib/python3.8/dist-packages/keras/src/engine/training.py", line 1338, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.8/dist-packages/keras/src/engine/training.py", line 1322, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.8/dist-packages/keras/src/engine/training.py", line 1303, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.8/dist-packages/keras/src/engine/training.py", line 1085, in train_step
        return self.compute_metrics(x, y, y_pred, sample_weight)
    File "/usr/local/lib/python3.8/dist-packages/keras/src/engine/training.py", line 1179, in compute_metrics
        self.compiled_metrics.update_state(y, y_pred, sample_weight)
    File "/usr/local/lib/python3.8/dist-packages/keras/src/engine/compile_utils.py", line 605, in update_state
        metric_obj.update_state(y_t, y_p, sample_weight=mask)
    File "/usr/local/lib/python3.8/dist-packages/keras/src/utils/metrics_utils.py", line 77, in decorated
        update_op = update_state_fn(*args, **kwargs)
    File "/usr/local/lib/python3.8/dist-packages/keras/src/metrics/base_metric.py", line 140, in update_state_fn
        return ag_update_state(*args, **kwargs)
    File "/usr/local/lib/python3.8/dist-packages/keras/src/metrics/base_metric.py", line 728, in update_state  **
        return super().update_state(matches, sample_weight=sample_weight)
    File "/usr/local/lib/python3.8/dist-packages/keras/src/metrics/base_metric.py", line 504, in update_state
        ) = losses_utils.squeeze_or_expand_dimensions(
    File "/usr/local/lib/python3.8/dist-packages/keras/src/utils/losses_utils.py", line 224, in squeeze_or_expand_dimensions
        sample_weight = tf.squeeze(sample_weight, [-1])

    ValueError: Can not squeeze dim[1], expected a dimension of 1, got 104 for '{{node Squeeze}} = Squeeze[T=DT_FLOAT, squeeze_dims=[-1]](Cast_7)' with input shapes: [?,104].

BTW, I’m getting 0/10 points for the masked_accuracy test when I submit the notebook, so maybe that is what I need to fix? I’m passing all the given unit tests, though. Any guidance would be much appreciated!

Update: I was passing a wrong argument to tf.reduce_sum in masked_accuracy. Fixing it solved both the model training issue and the masked_accuracy points.
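
In case it helps anyone hitting the same ValueError: the metric has to return a scalar, so the final tf.reduce_sum calls must reduce over all elements rather than a single axis; a wrong axis argument leaves a non-scalar tensor behind, which leads to the shape-mismatch error during metric aggregation. Below is a minimal, generic sketch of a masked accuracy metric (not the assignment's exact code; it assumes padded label positions are marked with -1):

import tensorflow as tf

def masked_accuracy_sketch(y_true, y_pred):
    # y_true: integer labels, shape (batch, seq_len); padding positions hold -1
    # y_pred: model outputs, shape (batch, seq_len, num_classes)
    y_true = tf.cast(y_true, tf.float32)
    # 1.0 for real tokens, 0.0 for padding
    mask = tf.cast(tf.not_equal(y_true, -1.0), tf.float32)
    y_pred_class = tf.cast(tf.argmax(y_pred, axis=-1), tf.float32)
    # Count a match only where the token is not padding
    matches = tf.cast(tf.equal(y_true, y_pred_class), tf.float32) * mask
    # Reduce over *all* elements so the result is a scalar;
    # passing an axis here is what produces a non-scalar metric value
    return tf.reduce_sum(matches) / tf.reduce_sum(mask)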
