No worries.
Unfortunately, those links have more to do with a conceptual understanding of YOLO, which I feel I have. What I'm currently lacking is knowledge of how my intuition of the algorithm maps to the code, given that TF is in many ways still a black box to me.
I did ask ChatGPT how to ensure the grid cells of the filtered boxes are also preserved, and the code below is what it output. This makes much more sense to me, since it passes the absolute locations of the boxes. I just can't figure out why the current code in the assignment works without the step below that retrieves the grid cell offsets for each bounding box (grid_cell_offsets = tf.gather_nd(grid_offsets, grid_indices)).
import tensorflow as tf
# Suppose you have the following tensors:
scores = [...] # Tensor containing confidence scores for each bounding box
bounding_boxes = [...] # Tensor containing bounding box coordinates (x, y, width, height) relative to grid cells
grid_offsets = [...] # Tensor containing the grid cell offsets for each bounding box
# Filter out bounding boxes based on a confidence score threshold
threshold = 0.5
filtered_indices = tf.where(scores >= threshold)
filtered_scores = tf.gather(scores, filtered_indices)
filtered_boxes = tf.gather(bounding_boxes, filtered_indices)
filtered_offsets = tf.gather(grid_offsets, filtered_indices)
# Now, transform the filtered bounding boxes to absolute image coordinates
grid_cell_size = [...] # Size of each grid cell
grid_indices = tf.cast(filtered_boxes[:, :2], tf.int32) # Extract grid indices from bounding box coordinates
grid_cell_offsets = tf.gather_nd(grid_offsets, grid_indices) # Retrieve grid cell offsets for each bounding box
# Compute absolute bounding box coordinates
absolute_x = (grid_indices[:, 0] + filtered_boxes[:, 0]) * grid_cell_size + grid_cell_offsets[:, 0]
absolute_y = (grid_indices[:, 1] + filtered_boxes[:, 1]) * grid_cell_size + grid_cell_offsets[:, 1]
absolute_width = filtered_boxes[:, 2] * grid_cell_size
absolute_height = filtered_boxes[:, 3] * grid_cell_size
absolute_boxes = tf.stack([absolute_x, absolute_y, absolute_width, absolute_height], axis=-1)
# Apply non-maximum suppression (NMS)
selected_indices = tf.image.non_max_suppression(absolute_boxes, filtered_scores, max_output_size=100, iou_threshold=0.5)
# Retrieve selected bounding boxes and scores after NMS
selected_boxes = tf.gather(absolute_boxes, selected_indices)
selected_scores = tf.gather(filtered_scores, selected_indices)
# Now, you have the selected bounding boxes and scores after NMS
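To make the step I'm asking about concrete, here is a minimal plain-Python sketch of how I picture the conversion from cell-relative to absolute coordinates. The names (`to_absolute_corners`, `filter_and_convert`, `cell_size`) and the convention that x and y are the box center as a fraction of the cell while width and height are in cell units are my own assumptions for illustration, not the assignment's code. It returns corners as (y1, x1, y2, x2), since that is the order tf.image.non_max_suppression documents for its boxes argument.

```python
def to_absolute_corners(box, cell_col, cell_row, cell_size):
    """Convert one cell-relative (x, y, w, h) box to absolute (y1, x1, y2, x2).

    Assumed convention (hypothetical, for illustration only):
    x, y are the box center as a fraction of the cell; w, h are in cell units.
    """
    x, y, w, h = box
    center_x = (cell_col + x) * cell_size
    center_y = (cell_row + y) * cell_size
    width = w * cell_size
    height = h * cell_size
    return (
        center_y - height / 2,
        center_x - width / 2,
        center_y + height / 2,
        center_x + width / 2,
    )


def filter_and_convert(scores, boxes, cells, cell_size, threshold=0.5):
    """Keep boxes with score >= threshold, carrying each box's cell along."""
    kept = []
    for score, box, (col, row) in zip(scores, boxes, cells):
        if score >= threshold:
            corners = to_absolute_corners(box, col, row, cell_size)
            kept.append((score, corners))
    return kept
```

In this sketch the cell index has to travel with each box through the filter. If the boxes were instead converted to absolute coordinates before filtering, the filter would only need the scores and the boxes.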
Note that in the ChatGPT code above, classes aren't considered in the IoU step; it seems tf.image.non_max_suppression doesn't actually do this on its own.
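To spell out what I mean by considering classes, here is a rough plain-Python sketch of greedy NMS in which a box can only suppress boxes of the same class. It is written without TensorFlow so the logic is visible. The function names and the (y1, x1, y2, x2) box layout are my assumptions, and this is not meant to reproduce what tf.image.non_max_suppression does internally.

```python
def iou(a, b):
    """Intersection over union of two (y1, x1, y2, x2) boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def class_aware_nms(boxes, scores, classes, iou_threshold=0.5):
    """Greedy NMS where a box can only suppress boxes of the same class.

    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        suppressed = any(
            classes[i] == classes[j] and iou(boxes[i], boxes[j]) > iou_threshold
            for j in kept
        )
        if not suppressed:
            kept.append(i)
    return kept
```

With two heavily overlapping boxes of the same class, only the higher-scoring one survives, while an overlapping box of a different class is kept.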