W3_A1 YOLO boxes = tf.boolean_mask(boxes, filtering_mask)

Please help me identify what went wrong with boxes here.

def yolo_filter_boxes(boxes, box_confidence, box_class_probs, threshold = .6):
    """Filters YOLO boxes by thresholding on object and class confidence.
    
    Arguments:
        boxes -- tensor of shape (19, 19, 5, 4)
        box_confidence -- tensor of shape (19, 19, 5, 1)
        box_class_probs -- tensor of shape (19, 19, 5, 80)
        threshold -- real value, if [ highest class probability score < threshold],
                     then get rid of the corresponding box

    Returns:
        scores -- tensor of shape (None,), containing the class probability score for selected boxes
        boxes -- tensor of shape (None, 4), containing (b_x, b_y, b_h, b_w) coordinates of selected boxes
        classes -- tensor of shape (None,), containing the index of the class detected by the selected boxes

    Note: "None" is here because you don't know the exact number of selected boxes, as it depends on the threshold. 
    For example, the actual output size of scores would be (10,) if there are 10 boxes.
    """
    
    ### START CODE HERE
# mentor edit: code removed
    ### END CODE HERE
    
    return scores, boxes, classes

ValueError Traceback (most recent call last)
in
----> 1 out_scores, out_boxes, out_classes = predict("test.jpg")

in predict(image_file)
20 yolo_outputs = yolo_head(yolo_model_outputs, anchors, len(class_names))
21
---> 22 out_scores, out_boxes, out_classes = yolo_eval(yolo_outputs, [image.size[1], image.size[0]], 10, 0.3, 0.5)
23
24 # Print predictions info

in yolo_eval(yolo_outputs, image_shape, max_boxes, score_threshold, iou_threshold)
31
32 # Use one of the functions you've implemented to perform Score-filtering with a threshold of score_threshold (≈1 line)
---> 33 scores, boxes, classes = yolo_filter_boxes(boxes, box_confidence, box_class_probs, threshold = score_threshold)
34
35 # Scale boxes back to original image shape.

in yolo_filter_boxes(boxes, box_confidence, box_class_probs, threshold)
40 ## (≈ 3 lines)
41 scores = tf.boolean_mask(box_class_scores, filtering_mask)
---> 42 boxes = tf.boolean_mask(boxes, filtering_mask)
43 classes = tf.boolean_mask(box_classes, filtering_mask)
44 ### END CODE HERE

/opt/conda/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)
199 """Call target, and fall back on dispatchers if there is a TypeError."""
200 try:
--> 201 return target(*args, **kwargs)
202 except (TypeError, ValueError):
203 # Note: convert_to_eager_tensor currently raises a ValueError, not a

/opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py in boolean_mask_v2(tensor, mask, axis, name)
1801 ```
1802 """
-> 1803 return boolean_mask(tensor, mask, name, axis)
1804
1805

/opt/conda/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)
199 """Call target, and fall back on dispatchers if there is a TypeError."""
200 try:
--> 201 return target(*args, **kwargs)
202 except (TypeError, ValueError):
203 # Note: convert_to_eager_tensor currently raises a ValueError, not a

/opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py in boolean_mask(tensor, mask, name, axis)
1728 if axis_value is not None:
1729 axis = axis_value
-> 1730 shape_tensor[axis:axis + ndims_mask].assert_is_compatible_with(shape_mask)
1731
1732 leading_size = gen_math_ops.prod(shape(tensor)[axis:axis + ndims_mask], [0])

/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/tensor_shape.py in assert_is_compatible_with(self, other)
1132 """
1133 if not self.is_compatible_with(other):
-> 1134 raise ValueError("Shapes %s and %s are incompatible" % (self, other))
1135
1136 def most_specific_compatible_shape(self, other):

ValueError: Shapes (1, 19, 19, 5) and (1, 19, 19, 80) are incompatible

Have you passed the tests for yolo_eval and all other methods before you run the code on test.jpg?

I added some print statements to my code to show the shape and type of everything. Here’s what I get from that test case:

boxes.shape (19, 19, 5, 4)
boxes.dtype <dtype: 'float32'>
box_scores.shape (19, 19, 5, 80)
box_scores.dtype <dtype: 'float32'>
box_classes.shape (19, 19, 5)
box_classes.dtype <dtype: 'int64'>
box_class_scores.shape (19, 19, 5)
box_class_scores.dtype <dtype: 'float32'>
filtering_mask.shape (19, 19, 5)
filtering_mask.dtype <dtype: 'bool'>
scores[2] = 9.270486
boxes[2] = [ 4.6399336  3.2303846  4.431282  -2.202031 ]
classes[2] = 8
scores.shape = (1789,)
boxes.shape = (1789, 4)
classes.shape = (1789,)
 All tests passed!

You’ve written some dimensions in the comments, but it’s worth adding the print statements to make sure that what you think is happening is what is actually happening.
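
A minimal sketch of that kind of check, using NumPy arrays as stand-ins for the notebook's tensors. The `describe` helper is hypothetical (it is not part of the assignment); it only assumes the object has `.shape` and `.dtype` attributes.

```python
import numpy as np

def describe(name, x):
    """Print and return a one-line summary of an array's shape and dtype."""
    line = f"{name}.shape {tuple(x.shape)}  {name}.dtype {x.dtype}"
    print(line)
    return line

# Stand-in arrays with the shapes listed in the docstring (assumed, not real data).
boxes = np.zeros((19, 19, 5, 4), dtype=np.float32)
filtering_mask = np.zeros((19, 19, 5), dtype=bool)

describe("boxes", boxes)
describe("filtering_mask", filtering_mask)
```

Calling it right before each masking step makes a shape or dtype mismatch visible before the error is raised.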


Yes, and I got 100/100 from the grader, but I still could not run the final block of code:

out_scores, out_boxes, out_classes = predict("test.jpg")

So is there a mistake in my code? The cell that defines the function runs without error, but the function fails when it is called later from the final ungraded cells.


Please show us the actual exception trace that you are getting when you run that code.

I am very confused. Please see my notebook, thanks!

{moderator edit - notebook attachment removed}

Hello,

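The output of yolo_model is a (m, 19, 19, 5, 85) tensor, so inside predict every tensor carries a leading batch dimension. You have hardcoded the axis parameter, which is not recommended: with the extra dimension, axis 3 is the anchor-box axis rather than the class axis, which is consistent with the (1, 19, 19, 80) mask shape in your error. It is better to always use -1 when you are working with the last dimension. Then your code will work for batches as well. I will also report this to the course staff, since the instructions could be clearer.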
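
Here is a rough sketch of the difference, using NumPy as a stand-in for the TensorFlow calls (`max` in place of `tf.math.reduce_max`, boolean indexing in place of `tf.boolean_mask`). The random data and the 0.99 threshold are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Batched stand-ins, as seen inside predict(): a leading batch dimension of 1.
box_scores = rng.random((1, 19, 19, 5, 80)).astype(np.float32)
boxes = rng.random((1, 19, 19, 5, 4)).astype(np.float32)

# Hardcoded axis=3 now reduces over the 5 anchor boxes, not the 80 classes.
hardcoded = box_scores.max(axis=3)    # shape (1, 19, 19, 80)
# axis=-1 always reduces over the class dimension, batched or not.
last_axis = box_scores.max(axis=-1)   # shape (1, 19, 19, 5)

# Without the batch dimension, axis=3 happens to be the last axis,
# which is why the unit tests can pass with the hardcoded value.
unbatched = box_scores[0].max(axis=3)  # shape (19, 19, 5)

# A mask built from the correct reduction lines up with the leading dims of boxes.
mask = last_axis >= 0.99
selected = boxes[mask]                 # shape (N, 4)

# A mask built from the hardcoded reduction does not line up with boxes.
try:
    boxes[hardcoded >= 0.99]
    mismatch_raised = False
except IndexError:
    mismatch_raised = True

print(hardcoded.shape, last_axis.shape, unbatched.shape, selected.shape, mismatch_raised)
```

NumPy raises an IndexError where TensorFlow raises the ValueError shown in the traceback, but the cause is the same shape mismatch.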


This is super helpful Paul. Mine is falling apart a bit.

box_scores shape (19, 19, 5, 80) (19, 19, 5, 80)
box_scores type <dtype: 'float32'>
box_classes shape (19, 19, 5) (19, 19, 5)
box_classes type <dtype: 'int64'>
box_class_scores shape (19, 19, 5) (19, 19)
box_class_scores type <dtype: 'int64'>

I used tf.math.reduce_max(box_classes, axis = -1), and setting the keepdims parameter to True gives me a shape of (19, 19, 1). More concerning is that the dtype is int64 rather than float. Any thoughts?

I think the issue is that you should not apply reduce_max to box_classes. Those are just labels (index values) identifying which type of object is in the bounding box, which is why the result is int64 and has lost a dimension. You apply reduce_max to the scores in order to find the score that corresponds to the class that was selected by the previous argmax call.
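
To illustrate the distinction, here is a small sketch in NumPy rather than TensorFlow (`np.argmax` and `np.max` standing in for `tf.math.argmax` and `tf.math.reduce_max`), with random stand-in data:

```python
import numpy as np

rng = np.random.default_rng(1)
box_scores = rng.random((19, 19, 5, 80)).astype(np.float32)

# argmax over the class axis gives integer labels, one per box.
box_classes = np.argmax(box_scores, axis=-1)     # shape (19, 19, 5), integer dtype
# max over the same axis of the *scores* gives the matching float score.
box_class_scores = np.max(box_scores, axis=-1)   # shape (19, 19, 5), float32

# Reducing the labels instead drops the anchor axis and keeps the integer dtype,
# matching the (19, 19) / int64 symptom reported above.
wrong = np.max(box_classes, axis=-1)             # shape (19, 19), integer dtype

# Sanity check: each score is the entry of box_scores at the selected class index.
picked = np.take_along_axis(box_scores, box_classes[..., None], axis=-1)[..., 0]

print(box_classes.shape, box_classes.dtype)
print(box_class_scores.shape, box_class_scores.dtype)
print(wrong.shape, wrong.dtype)
```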

Yes, that was it. Thanks!!!
