I’ve been working on the Zombie Detector assignment, and every cell’s output matches the expected output until I get to the training loop. Here’s what happens when I run it:
Start fine-tuning!
batch 0 of 100, loss=1.1923661
batch 10 of 100, loss=6234.3574
batch 20 of 100, loss=23406.596
batch 30 of 100, loss=29222.49
batch 40 of 100, loss=31061.418
batch 50 of 100, loss=31496.303
batch 60 of 100, loss=31441.637
batch 70 of 100, loss=31216.28
batch 80 of 100, loss=30931.404
batch 90 of 100, loss=30625.77
Done fine-tuning!
What could be causing this? Also: is the training step function supposed to have the model.provide_groundtruth call inside the “with tf.GradientTape() as tape:” block, as it is in the tutorial Colab notebook (it isn’t mentioned in the instructions)? With that call inside the tape, I get the exploding loss values above. Without it (everything else exactly as the instructions specify), I get the following error instead (a sketch of my training step function is included after the traceback):
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-202-37e437a735b4> in <cell line: 3>()
15
16 # Training step (forward pass + backwards pass)
---> 17 total_loss = train_step_fn(image_tensors,
18 gt_boxes_list,
19 gt_classes_list,
1 frames
/usr/local/lib/python3.9/dist-packages/tensorflow/python/framework/func_graph.py in autograph_handler(*args, **kwargs)
1145 except Exception as e: # pylint:disable=broad-except
1146 if hasattr(e, "ag_error_metadata"):
-> 1147 raise e.ag_error_metadata.to_exception(e)
1148 else:
1149 raise
ValueError: in user code:
File "<ipython-input-163-f871aa37b683>", line 45, in train_step_fn *
losses_dict = model.loss(prediction_dict, true_shape_tensor)
File "/usr/local/lib/python3.9/dist-packages/object_detection/meta_architectures/ssd_meta_arch.py", line 876, in loss *
location_losses = self._localization_loss(
File "/usr/local/lib/python3.9/dist-packages/object_detection/core/losses.py", line 78, in __call__ *
target_tensor = tf.where(tf.is_nan(target_tensor),
ValueError: Shapes must be equal rank, but are 3 and 1 for '{{node Loss/Loss/Select}} = Select[T=DT_FLOAT](Loss/Loss/IsNan, concat_1, Loss/stack_2)' with input shapes: [0], [5,51150,4], [0].
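For reference, here is roughly what my training step function looks like. This is my own reconstruction following the structure of the few-shot tutorial, not the assignment’s exact code: the get_train_step_fn wrapper, the variable names (to_fine_tune, detection_model), and the hard-coded 640x640 input shape are my choices and may differ from the notebook. This version has provide_groundtruth inside the tape, which is what produces the exploding loss; deleting that call entirely produces the ValueError above.

    import tensorflow as tf

    def get_train_step_fn(detection_model, optimizer, to_fine_tune, batch_size):
        """Builds the per-batch training step (my reconstruction of the tutorial's)."""

        @tf.function
        def train_step_fn(image_tensors, gt_boxes_list, gt_classes_list):
            # One true-shape row per image in the batch; 640x640x3 is an
            # assumption based on the SSD checkpoint I'm fine-tuning.
            shapes = tf.constant(batch_size * [[640, 640, 3]], dtype=tf.int32)
            with tf.GradientTape() as tape:
                # This is the call I'm unsure about: the tutorial Colab appears
                # to have it inside the tape, the instructions don't mention it.
                detection_model.provide_groundtruth(
                    groundtruth_boxes_list=gt_boxes_list,
                    groundtruth_classes_list=gt_classes_list)
                # Preprocess each image and stack them into one batch tensor.
                preprocessed_images = tf.concat(
                    [detection_model.preprocess(t)[0] for t in image_tensors],
                    axis=0)
                prediction_dict = detection_model.predict(preprocessed_images, shapes)
                losses_dict = detection_model.loss(prediction_dict, shapes)
                total_loss = (losses_dict['Loss/localization_loss'] +
                              losses_dict['Loss/classification_loss'])
            # Backward pass: only the selected fine-tune variables are updated.
            gradients = tape.gradient(total_loss, to_fine_tune)
            optimizer.apply_gradients(zip(gradients, to_fine_tune))
            return total_loss

        return train_step_fn

The outer loop just calls train_step_fn(image_tensors, gt_boxes_list, gt_classes_list) once per batch and prints the loss every 10 batches, which is what produced the log at the top of this post.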
I’ve checked everything else, and it matches what the instructions specify. What could be wrong here? Thank you.