Course 3 - Week 2 - Zombie assignment

At Exercise 6.1, when running vars(tmp_box_predictor_checkpoint), I get the following output:

{'_save_counter': None,
 '_save_assign_op': None,
 '_self_setattr_tracking': True,
 '_self_unconditional_checkpoint_dependencies': [TrackableReference(name=_base_tower_layers_for_heads, ref={'box_encodings': ListWrapper([]), 'class_predictions_with_background': ListWrapper([])}),
  TrackableReference(name=_box_prediction_head, ref=<object_detection.predictors.heads.keras_box_head.WeightSharedConvolutionalBoxHead object at 0x7f528025c3d0>)],
 '_self_unconditional_dependency_names': {'_base_tower_layers_for_heads': {'box_encodings': ListWrapper([]),
   'class_predictions_with_background': ListWrapper([])},
  '_box_prediction_head': <object_detection.predictors.heads.keras_box_head.WeightSharedConvolutionalBoxHead at 0x7f528025c3d0>},
 '_self_unconditional_deferred_dependencies': {},
 '_self_update_uid': -1,
 '_self_name_based_restores': set(),
 '_self_saveable_object_factories': {},
 '_base_tower_layers_for_heads': {'box_encodings': ListWrapper([]),
  'class_predictions_with_background': ListWrapper([])},
 '_box_prediction_head': <object_detection.predictors.heads.keras_box_head.WeightSharedConvolutionalBoxHead at 0x7f528025c3d0>,
 '_saver': < at 0x7f5288fc7590>,
 '_attached_dependencies': None}

instead of getting

'_base_tower_layers_for_heads': DictWrapper({'box_encodings': ListWrapper([]), 'class_predictions_with_background': ListWrapper([])}),
'_box_prediction_head': <object_detection.predictors.heads.keras_box_head.WeightSharedConvolutionalBoxHead at 0x7fefac014710>,

which I believe is the reason why I can’t compute the loss in Exercise 9:

# Calculate the loss after you've provided the ground truth 
losses_dict = detection_model.loss(prediction_dict, true_shape_tensor)

and I get the following error message:

ValueError                                Traceback (most recent call last)
<ipython-input-113-36f83e7403b4> in <module>
      1 # Calculate the loss after you've provided the ground truth
----> 2 losses_dict = detection_model.loss(prediction_dict, true_shape_tensor)
      4 # View the loss dictionary
      5 losses_dict = detection_model.loss(prediction_dict, true_shape_tensor)

4 frames
/usr/local/lib/python3.7/dist-packages/object_detection/utils/ in assert_shape_equal(shape_a, shape_b)
    319       all(isinstance(dim, int) for dim in shape_b)):
    320     if shape_a != shape_b:
--> 321       raise ValueError('Unequal shapes {}, {}'.format(shape_a, shape_b))
    322     else: return tf.no_op()
    323   else:

ValueError: Unequal shapes [2], [91]

Does anyone have any idea what I am doing wrong?

Thanks a lot

Please note that there are no programming assignments in DLS Course 3, so I’m guessing you filed this under a different category than you intended. You can use the little “edit pencil” on the title to move it to the actual specialization in question, which will improve the chances that someone with knowledge of that specialization will notice and respond.


First, your Exercise 6.1 output is fine. Your list of variables does include the two mentioned (_base_tower_layers_for_heads and _box_prediction_head), although the data shown in the Expected Output seems to be out of date now. _box_prediction_head is the same, but _base_tower_layers_for_heads is stored slightly differently now; you can see that the main data is the same - the dictionary {'box_encodings': ListWrapper([]), 'class_predictions_with_background': ListWrapper([])}. I tried running my old code for this exercise and my variables now look like yours, but it still runs fine through Exercise 9.1. I’ll post a request to the developers to update the “Expected Output” for this part so others aren’t thrown off by it.

Your problem in Exercise 9.1 is a mystery. A couple of things I noticed:

  • The shapes in the error, [2] and [91], look familiar: 2 = num_classes + 1 for our new Zombie model (1 class), and 91 = num_classes + 1 for the original RetinaNet model (90 COCO classes) we started from
  • You are not the first to see this problem; see this post: ValueError: Unequal shapes [2], [91]

I suspect the +1 is for an extra class for “background” used internally by the model as hinted by this comment in section 9.1:

        3) class_predictions_with_background: 3-D float tensor of shape
          [batch_size, num_anchors, num_classes+1] containing class predictions

It seems like somehow you have something that is using the new # of classes, but something else left behind that is still using the old # of classes and they are clashing when you call detection_model.loss(), but I don’t have a great guess where the issue is.
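The clash can be sketched with a toy version of the shape check from the traceback above. This is my simplified reconstruction of what that assert_shape_equal helper does, not the real object_detection code, but it reproduces the exact error text:

```python
def assert_shape_equal(shape_a, shape_b):
    # Simplified: the real helper also handles tensors and returns tf.no_op()
    # when the shapes match; here we only mirror the failing branch.
    if shape_a != shape_b:
        raise ValueError('Unequal shapes {}, {}'.format(shape_a, shape_b))

zombie_classes = 1   # the new model detects 1 class
coco_classes = 90    # the original RetinaNet checkpoint was trained on 90 classes

try:
    # [num_classes + 1] vs [num_classes + 1]: 2 vs 91, exactly the error above
    assert_shape_equal([zombie_classes + 1], [coco_classes + 1])
except ValueError as err:
    print(err)  # Unequal shapes [2], [91]
```

So the two sides of the error really are the two "number of classes plus background" values colliding inside the loss computation.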

My strongest suspicion is that your code is right, but that there is some cell that you accidentally didn’t run which led to the old # of classes being left in place somewhere. This would be consistent with the lack of info about a solution in the other posting about this problem, as well as the comment from the mentor that the student’s code worked fine when the mentor tried it. So… my first suggestion is to try loading your assignment again and carefully re-run the cells in order (skipping the “Option 1” cells in Exercise 2) and see if that works for you.

If it still doesn’t work, you can try printing prediction_dict['class_predictions_with_background'].shape before calling detection_model.loss() in Exercise 9 to make sure the last dimension is 2 (num_classes + 1). Then poke around to see if you can find anywhere else in your code where the old class prediction layers or the old num_classes might have been left in the model somehow.
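That check would look roughly like this. The shape tuple below is a stand-in with made-up batch and anchor counts, since the real tensor comes out of detection_model.predict() in the notebook:

```python
num_classes = 1  # just the zombie class

# Stand-in for prediction_dict['class_predictions_with_background'].shape;
# in the notebook this comes from detection_model.predict() and should be
# (batch_size, num_anchors, num_classes + 1). The 4 and 51150 here are
# hypothetical values, only the last dimension matters for this check.
class_predictions_shape = (4, 51150, num_classes + 1)

last_dim = class_predictions_shape[-1]
print(last_dim)  # should print 2; if it prints 91, the old 90-class head is still in the model
assert last_dim == num_classes + 1
```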

I’ll also ask on the other thread to see if anyone else ever figured this out.

If you’re still stuck after checking these things, feel free to DM me a copy of your .ipynb and I’ll see what I can find.

Dear Wendy,

Thanks a lot for your answer. Based on your feedback, I decided to start from scratch: I simply copied my code to a new Colab notebook and made sure to run every cell. And voilà, now it works.

I was focusing too much on the 6.1 output instead of trying to understand the error message in 9.1.

Thanks again for your help



I am facing the same issue. However, I think that the variable “gt_classes_one_hot_tensors” is what’s causing the error.

Here we have the expected error:

and here is where the “ValueError: Unequal shapes [2], [91]” appears

Between them, the only cell is the following:

Just because there is only one cell between the first cell you list and the one with the error, that doesn’t necessarily mean that the cell in the middle is causing the problem.

Since the first cell is demonstrating that you get an error calling detection_model.loss() before it is initialized, it’s possible that the middle cell just fixes that first issue and that lets you run further, which brings you to the next error, which could have been there all along.

But, I agree that it looks like the error you’re seeing in the “Calculate the loss after you’ve provided the ground truth” cell is related to the ground truth values in that middle cell. At least in the stack trace you pasted here, you aren’t seeing the “ValueError: Unequal shapes [2], [91]” error, but instead a “Groundtruth tensor boxes has not been provided” error. If that’s still the error you are seeing, then going back to check what you have for gt_box_tensors sounds like a good place to start debugging. :nerd_face:
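A quick sanity check along those lines, with toy Python lists standing in for the real tensors (gt_box_tensors and gt_classes_one_hot_tensors are built in earlier cells of the notebook; the values here are made up):

```python
num_classes = 1  # just the zombie class

# Toy stand-ins: one image containing one zombie box. In the notebook these
# are tf tensors with shapes (num_boxes, 4) and (num_boxes, num_classes).
gt_box_tensors = [[[0.1, 0.2, 0.5, 0.6]]]
gt_classes_one_hot_tensors = [[[1.0]]]

# Before calling detection_model.provide_groundtruth(...), every image needs
# both pieces, and the one-hot width must match the NEW num_classes, not the
# original model's 90 - otherwise the loss shapes won't line up.
for boxes, classes in zip(gt_box_tensors, gt_classes_one_hot_tensors):
    assert len(boxes) == len(classes), "one class row per box"
    assert len(classes[0]) == num_classes, "one-hot width must match num_classes"
print("ground truth looks consistent")
```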

Thank you for your reply.
I totally agree. However, whenever I debug, I run into the “ValueError: Unequal shapes [2], [91]” error more often than the “Groundtruth” one.

I will continue to debug, and hopefully solve it.

Hi Aurélien,

I am facing this issue too, and copying the code to a new Colab notebook didn’t solve it. Didn’t you change something?

Hi @avielschory226 ,

As mentioned in my previous answer: no, I literally created a new Colab, ran every cell one by one, and it worked. Are you sure it’s the same issue?

Same error at Exercise 9.1.

Finally I found the cause: I didn’t change the default num_classes…
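For anyone landing here later, that failure mode can be sketched in miniature. The class and field below are illustrative stand-ins, not the real object_detection API, but they show why skipping the override cell leaves the old value in place:

```python
class ToyModelConfig:
    """Stand-in for the pipeline config loaded from the RetinaNet .config file."""
    def __init__(self):
        self.num_classes = 90  # COCO default baked into the checkpoint's config

config = ToyModelConfig()

# The cell that must actually run: override the default BEFORE building the model.
config.num_classes = 1  # just the zombie class

logits_per_anchor = config.num_classes + 1  # +1 for the internal background class
print(logits_per_anchor)  # 2 -- skip the override and you get 91 instead
```

If that override never executes, the model is built with 90 + 1 = 91 class logits per anchor while the ground truth carries 1 + 1 = 2, which is exactly the "Unequal shapes [2], [91]" clash.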
