Non_max_suppression() over all classes?

vuqpham · April 25, 2021, 3:41am

Question for the assignment: Autonomous_driving_application_Car_detection

If I understand correctly, the variables boxes, scores, and classes returned from yolo_filter_boxes are for all possible classes, i.e. boxes contains box coordinates of different classes, and scores contains scores of different classes, respectively. We use classes to know which class a box/score is for.

When we call tf.image.non_max_suppression(boxes, scores, …), its parameters do not include classes, so how can that function differentiate boxes of different classes? We should suppress boxes of the same class only, right?

Thanks,
Vu

reinoudbosch · April 25, 2021, 5:05pm

Hi vuqpham,

Thanks for your excellent question.

The call to tf.image.non_max_suppression takes max_boxes_tensor as a parameter. This tensor is passed to max_output_size in tf.image.non_max_suppression.

The documentation for tf.image.non_max_suppression (TensorFlow Core v2.4.1) states the following:

max_output_size: A scalar integer Tensor representing the maximum number of boxes to be selected by non-max suppression.

This is irrespective of class. So, the total number of box indexes returned for a picture by the call to tf.image.non_max_suppression is at most max_output_size, regardless of which class the boxes belong to. In the process, overlapping boxes are removed according to iou_threshold.

In other words, the call to tf.image.non_max_suppression does the following: it removes overlapping boxes and returns up to max_boxes box indexes, irrespective of which class they belong to. There can be multiple objects of the same class! The classes of the selected boxes are then looked up from the returned indexes through tf.gather.
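To make that behaviour concrete, here is a minimal pure-Python sketch of greedy, class-agnostic NMS. It is only a stand-in for illustration, not TensorFlow's implementation: the helper names iou and greedy_nms, the sample boxes, and the class ids are all made up, and boxes are assumed to be (y1, x1, y2, x2) tuples.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (y1, x1, y2, x2)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, max_output_size, iou_threshold=0.5):
    """Return indexes of kept boxes, highest score first. Classes are never consulted."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) == max_output_size:
            break
        # Keep box i only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical data: box 1 overlaps box 0 with IoU 0.7.
boxes = [(0, 0, 10, 10), (0, 0, 10, 7), (20, 20, 30, 30), (40, 40, 50, 50)]
scores = [0.9, 0.85, 0.6, 0.8]
classes = [2, 0, 2, 5]  # made-up class ids

selected = greedy_nms(boxes, scores, max_output_size=10)
# Equivalent of the tf.gather step: look up classes by the returned indexes.
selected_classes = [classes[i] for i in selected]
print(selected, selected_classes)
```

Note that fewer than max_output_size indexes come back here, because only three boxes survive suppression, and that two of the survivors share the same class id.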

I hope this clarifies things.

vuqpham · April 25, 2021, 10:13pm

Hi Reinoud,

Thank you very much for your explanation. That confirms my thought.

That brings up my next question: assume that anchor 1 of one cell has a box with prob 0.9 for a car, that anchor 2 of the same cell has another box with prob 0.85 for a passenger, and that these two boxes have an IoU of 0.7. This may be the case for the example in the lecture. Now, if we run non_max_suppression on all boxes of different classes together, as in the assignment, then with an iou_threshold below 0.7 the box in anchor 2 will be suppressed even though it is a valid one.

I understand that the assignment is not a real application; I just want to know whether, in reality, the correct solution would be to gather the boxes of each class together and run NMS on each of those sets of boxes separately. For the assignment, that would mean running it 80 times, once for each of the 80 classes. Is that right?

Again, thank you very much for your help.

Vu

reinoudbosch · April 25, 2021, 11:54pm

Dear Vu,

Thanks for your reply.

Yes, an actual implementation of YOLO predicts bounding boxes across all classes, so in the case of the assignment it would require doing what you suggest. Here’s a good overview of current systems with links to important articles that describe approaches taken:

https://towardsdatascience.com/yolo-v4-or-yolo-v5-or-pp-yolo-dad8e40f7109
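The per-class approach discussed above can be sketched roughly as follows. This is a pure-Python illustration under stated assumptions, not the assignment's code: iou, greedy_nms, and per_class_nms are hypothetical helpers, boxes are assumed to be (y1, x1, y2, x2) tuples, and the two boxes are chosen so that their IoU is 0.7, as in the car/passenger example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (y1, x1, y2, x2)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, max_output_size, iou_threshold=0.5):
    """Class-agnostic greedy NMS; returns kept indexes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) == max_output_size:
            break
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def per_class_nms(boxes, scores, classes, max_output_size, iou_threshold=0.5):
    """Run greedy NMS separately on the boxes of each class that is present."""
    keep = []
    for c in sorted(set(classes)):
        idx = [i for i, k in enumerate(classes) if k == c]
        kept = greedy_nms([boxes[i] for i in idx], [scores[i] for i in idx],
                          max_output_size, iou_threshold)
        keep.extend(idx[j] for j in kept)  # map back to original indexes
    # Merge the per-class survivors and cap the total, highest score first.
    keep.sort(key=lambda i: scores[i], reverse=True)
    return keep[:max_output_size]


# The scenario from the question: two boxes in one cell with IoU 0.7.
boxes = [(0, 0, 10, 10), (0, 0, 10, 7)]
scores = [0.9, 0.85]
classes = ["car", "passenger"]

agnostic = greedy_nms(boxes, scores, max_output_size=10)
per_class = per_class_nms(boxes, scores, classes, max_output_size=10)
print(agnostic, per_class)
```

With a threshold of 0.5, the class-agnostic call drops the passenger box, while the per-class version keeps both. The loop only visits classes that actually occur among the filtered boxes, so it need not run 80 times. TensorFlow also provides tf.image.combined_non_max_suppression, which performs NMS per class, though its input shapes differ from those used in the assignment.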

vuqpham · April 26, 2021, 12:22am

Thank you very much Reinoud.
