Referring to this lab notebook: https://www.coursera.org/learn/advanced-computer-vision-with-tensorflow/supplement/wbql7/eager-few-shot-object-detection

In the Eager Few Shot Object Detection Colab, the rubber ducky class is initially assigned ID 1 and then shifted to 0 using label_id_offset:
label_id_offset = 1
train_image_tensors = []
gt_classes_one_hot_tensors = []
gt_box_tensors = []
for (train_image_np, gt_box_np) in zip(
    train_images_np, gt_boxes):
  train_image_tensors.append(tf.expand_dims(tf.convert_to_tensor(
      train_image_np, dtype=tf.float32), axis=0))
  gt_box_tensors.append(tf.convert_to_tensor(gt_box_np, dtype=tf.float32))
  zero_indexed_groundtruth_classes = tf.convert_to_tensor(
      np.ones(shape=[gt_box_np.shape[0]], dtype=np.int32) - label_id_offset)
  gt_classes_one_hot_tensors.append(tf.one_hot(
      zero_indexed_groundtruth_classes, num_classes))
print('Done prepping data.')
My understanding is that most object detection models expect non-background class IDs to start from 0. Wouldn't it be more straightforward to assign duck_class_id = 0 from the beginning, eliminating the need for the extra shift?
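To make what I mean concrete, here is a rough sketch of the alternative I'm imagining (this is not code from the Colab); it assumes num_classes, train_images_np, and gt_boxes are already defined by the earlier notebook cells:

import numpy as np
import tensorflow as tf

duck_class_id = 0  # hypothetical: zero-indexed from the start, so no offset is needed

train_image_tensors = []
gt_classes_one_hot_tensors = []
gt_box_tensors = []
for (train_image_np, gt_box_np) in zip(train_images_np, gt_boxes):
  train_image_tensors.append(tf.expand_dims(tf.convert_to_tensor(
      train_image_np, dtype=tf.float32), axis=0))
  gt_box_tensors.append(tf.convert_to_tensor(gt_box_np, dtype=tf.float32))
  # Every box in this toy dataset is a rubber ducky, so the class vector is all duck_class_id.
  groundtruth_classes = tf.convert_to_tensor(
      np.full(shape=[gt_box_np.shape[0]], fill_value=duck_class_id, dtype=np.int32))
  gt_classes_one_hot_tensors.append(tf.one_hot(groundtruth_classes, num_classes))
print('Done prepping data.')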
Is there a specific reason why the rubber ducky class is initially assigned ID 1 and then shifted to 0? Is it to follow a different convention, to avoid confusion with the background class, or due to some other consideration?
Any insights you can provide on this would be greatly appreciated!