Course 5 Week 4: Named-Entity Recognition notebook accuracy metric may be wrong

I just discovered that the metric used in model.compile(…), namely metrics=['accuracy'], may not be right for this setup. The reason is that token inputs and labels are both padded, and a "dummy" target label of -100 is used wherever there was a [PAD] token. I believe the custom loss inside the model detects these -100 labels and excludes them from the cost computation (thus stopping any gradient backprop through them). But the metric is just the vanilla tf.keras accuracy metric, and I can find no indication that it is "smart" enough to ignore those -100 labels during its computation.
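Here is a minimal sketch of what I suspect happens (a toy batch I made up, not from the notebook): one real token that the model predicts correctly, plus one [PAD] position carrying the -100 dummy label. The vanilla metric still reports 50%, because -100 can never equal an argmax class index.

import tensorflow as tf

# Toy batch: one real token (label 2) and one padding position (label -100).
y_true = tf.constant([[2, -100]])
y_pred = tf.constant([[[0.1, 0.1, 0.8],    # argmax = 2, matches the real label
                       [0.8, 0.1, 0.1]]])  # argmax = 0, can never match -100

metric = tf.keras.metrics.SparseCategoricalAccuracy()
metric.update_state(y_true, y_pred)
print(metric.result().numpy())  # 0.5 -- the padded position is counted as wrong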

Note that while the sentences in the Resume dataset are all pretty long, so there may not be enough -100s for the problem to be noticeable, I did try this out on my own dataset of much shorter sentences and debugged the poor accuracy I was getting even on a very tiny training set. I have come to suspect the incorrect accuracy calculation is a big part of it.

I haven't completely debugged this, but I thought I would post it so that the mentors and course instructors can help determine whether this is indeed a mistake.

Thanks for raising this issue.

For those interested, here's a candidate replacement: a masked version of sparse categorical accuracy that ignores the -100 positions.

Use it like this:

masked_accuracy = SparseCategoricalAccuracyWithMasking(name='accuracy')

model.compile(optimizer=optimizer, loss=model.compute_loss, metrics=[masked_accuracy])

I haven't thoroughly tested this, so if anyone sees an issue, please let me know. It has worked for me so far.

import tensorflow as tf

class SparseCategoricalAccuracyWithMasking(tf.keras.metrics.Metric):
  """Sparse categorical accuracy that ignores positions whose label is -100."""

  def __init__(self, name='sparse_categorical_accuracy', **kwargs):
    super(SparseCategoricalAccuracyWithMasking, self).__init__(name=name, **kwargs)
    self.num_correct = self.add_weight(name='nc', initializer='zeros')
    self.num_sample = self.add_weight(name='ns', initializer='zeros')

  def update_state(self, y_true, y_pred, sample_weight=None):
    if sample_weight is not None:
      raise NotImplementedError('sample_weight is not handled in this implementation.')

    # 1.0 where the predicted class matches the label, 0.0 elsewhere
    values = tf.cast(
        tf.equal(y_true, tf.argmax(y_pred, axis=-1, output_type=y_true.dtype)),
        'float32')
    # 1.0 for real tokens, 0.0 for the -100 padding positions
    mask = tf.cast(tf.not_equal(y_true, -100), 'float32')
    values = tf.multiply(values, mask)  # zero out the padded positions

    self.num_correct.assign_add(tf.reduce_sum(values))
    self.num_sample.assign_add(tf.reduce_sum(mask))

  def result(self):
    # divide_no_nan returns 0 instead of NaN before any real token is seen
    return tf.math.divide_no_nan(self.num_correct, self.num_sample)

  def reset_state(self):
    self.num_correct.assign(0.0)
    self.num_sample.assign(0.0)
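As a quick sanity check on the same toy batch as in my earlier example, the masked metric reports 1.0, because the [PAD] position is excluded from both the numerator and the denominator:

masked = SparseCategoricalAccuracyWithMasking(name='accuracy')
masked.update_state(tf.constant([[2, -100]]),
                    tf.constant([[[0.1, 0.1, 0.8], [0.8, 0.1, 0.1]]]))
print(masked.result().numpy())  # 1.0 -- only the real token is counted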