Clarification on Rubber Ducky Class ID Assignment in Eager Few Shot Object Detection Colab

Referring to this lab notebook: https://www.coursera.org/learn/advanced-computer-vision-with-tensorflow/supplement/wbql7/eager-few-shot-object-detection

In the Eager Few Shot Object Detection Colab, the rubber ducky class is initially assigned ID 1 and then shifted to 0 using label_id_offset.

label_id_offset = 1
train_image_tensors = []
gt_classes_one_hot_tensors = []
gt_box_tensors = []
for (train_image_np, gt_box_np) in zip(
    train_images_np, gt_boxes):
  train_image_tensors.append(tf.expand_dims(tf.convert_to_tensor(
      train_image_np, dtype=tf.float32), axis=0))
  gt_box_tensors.append(tf.convert_to_tensor(gt_box_np, dtype=tf.float32))
  zero_indexed_groundtruth_classes = tf.convert_to_tensor(
      np.ones(shape=[gt_box_np.shape[0]], dtype=np.int32) - label_id_offset)
  gt_classes_one_hot_tensors.append(tf.one_hot(
      zero_indexed_groundtruth_classes, num_classes))
print('Done prepping data.')

My understanding is that most object detection models expect non-background class IDs to start from 0. Wouldn’t it be more straightforward to assign duck_class_id = 0 from the beginning, eliminating the need for the extra shift?

Is there a specific reason why the rubber ducky class is initially assigned ID 1 and then shifted to 0? Is it to follow a different convention, avoid confusion with the background class, or due to some other consideration?

Any insights you can provide on this would be greatly appreciated!

Hi @niramay,
I believe there is a convention that the background class is id 0, which is what this comment in the code is referring to:

# By convention, our non-background classes start counting at 1.  Given
# that we will be predicting just one class, we will therefore assign it a
# `class id` of 1.
duck_class_id = 1
num_classes = 1

I agree that for this exercise, it would have been simpler to just assign the duck class to id 0, since we aren’t using the background class (or any other classes), but I expect the author of the colab was trying to give a sense of what this might look like in a more realistic use case with additional classes.
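
To illustrate what that might look like, here is a minimal sketch, assuming a hypothetical second class that is not in the colab. It uses plain Python lists as a stand-in for `tf.one_hot`, so the names `gt_class_ids`, `zero_indexed`, `one_hot`, and `recovered` are mine, not the notebook's. The 1-based IDs leave 0 free for background, and `label_id_offset` converts between the two numberings:

```python
# Hypothetical two-class setup (an assumption, not from the colab).
# IDs start at 1 so that 0 stays reserved for the background class.
label_id_offset = 1
num_classes = 2
gt_class_ids = [1, 2, 1]  # e.g. duck, some other class, duck

# Shift to zero-indexed IDs, as the colab does before one-hot encoding.
zero_indexed = [class_id - label_id_offset for class_id in gt_class_ids]

# Plain-Python stand-in for tf.one_hot(zero_indexed, num_classes).
one_hot = [
    [1.0 if column == index else 0.0 for column in range(num_classes)]
    for index in zero_indexed
]

# Adding the offset back recovers the 1-based IDs, e.g. for a label map.
recovered = [row.index(1.0) + label_id_offset for row in one_hot]

print(zero_indexed)
print(one_hot)
print(recovered)
```

With `num_classes = 1` and every ID equal to 1, each one-hot row collapses to a single 1.0, which is the duck-only case in the colab.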

Okay, got it! Thanks for clarifying.