Really poor loss - help with debugging

Hello, all. I passed all the tests from exercises 1 to 9. When I built the training step for exercise 10, it runs fine, but the loss is terribly large. I reviewed my previous exercises and am having difficulty figuring out my mistake. May I ask for some suggestions? Where should I look, and what might I be doing wrong? Any help is appreciated. Thanks.

Start fine-tuning!
batch 0 of 100, loss=9732423.0
batch 10 of 100, loss=9670611.0
batch 20 of 100, loss=9608616.0
batch 30 of 100, loss=10025403.0
batch 40 of 100, loss=9175880.0
batch 50 of 100, loss=9878275.0
batch 60 of 100, loss=9364504.0
batch 70 of 100, loss=9010551.0
batch 80 of 100, loss=9743804.0
batch 90 of 100, loss=9183414.0
Done fine-tuning!

@qixuanhou
Since you passed all the tests through exercise 9, based on the information you have provided I can only suggest checking your train_step_fn in exercise 10.
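
For reference, here is a minimal sketch of what the training step usually looks like in this assignment, following the eager few-shot training tutorial from the TF Object Detection API that the exercise is based on. Names such as detection_model, optimizer, to_fine_tune and batch_size are assumptions about your notebook and may differ, so treat this as a checklist rather than a drop-in solution.

import tensorflow as tf

# Assumed to already exist in the notebook: detection_model (built from the
# pipeline config with the restored checkpoint), optimizer, batch_size, and
# to_fine_tune (the list of head variables selected for fine-tuning).

@tf.function
def train_step_fn(image_tensors, groundtruth_boxes_list, groundtruth_classes_list):
    shapes = tf.constant(batch_size * [[640, 640, 3]], dtype=tf.int32)

    # Ground truth must be provided before calling detection_model.loss().
    detection_model.provide_groundtruth(
        groundtruth_boxes_list=groundtruth_boxes_list,
        groundtruth_classes_list=groundtruth_classes_list)

    with tf.GradientTape() as tape:
        # Preprocess each image and stack them into one batch.
        preprocessed_images = tf.concat(
            [detection_model.preprocess(image_tensor)[0]
             for image_tensor in image_tensors], axis=0)
        prediction_dict = detection_model.predict(preprocessed_images, shapes)

        # Total loss is the sum of the localization and classification losses.
        losses_dict = detection_model.loss(prediction_dict, shapes)
        total_loss = (losses_dict['Loss/localization_loss'] +
                      losses_dict['Loss/classification_loss'])

    # Compute and apply gradients only for the variables being fine-tuned.
    gradients = tape.gradient(total_loss, to_fine_tune)
    optimizer.apply_gradients(zip(gradients, to_fine_tune))
    return total_loss

Slips that tend to produce a huge loss without raising any error include skipping preprocess() on the images, passing raw class ids instead of one-hot ground-truth class tensors, and summing the wrong entries of losses_dict.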

Thanks for your reply. For exercise 6.1, the expected output has "0x7fefac014710" at the end. My output has the same wording, but the hex is different from "0x7fefac014710". Do you think that's an issue?

'_base_tower_layers_for_heads': DictWrapper({'box_encodings': ListWrapper([]), 'class_predictions_with_background': ListWrapper([])}),'_box_prediction_head': <object_detection.predictors.heads.keras_box_head.WeightSharedConvolutionalBoxHead at 0x7fefac014710>, ... 

Getting a different hex value is not an issue; it is just the memory address of the Python object, which changes from run to run.

I had the same issue of a high loss value that barely decreased. Although I'm sure you have solved it by now, for anyone reading this forum at a later date: consider reviewing Exercises 6.2 & 6.3, where the checkpoint is restored.
No error is raised by those cells, but incorrect code there will still affect the output of Exercise 10.
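
For comparison, this is roughly the restore pattern those cells are built on, taken from the eager few-shot training tutorial in the TF Object Detection API. Variable names such as detection_model and checkpoint_path are assumptions about the notebook; the important part is which pieces get wrapped in the "fake" checkpoints.

import tensorflow as tf

# Assumed to exist already: detection_model (freshly built from the pipeline
# config) and checkpoint_path (the downloaded pretrained checkpoint).

# Restore the box regression head but NOT the classification head: the new
# class has to be learned from scratch, so _prediction_heads is left out.
fake_box_predictor = tf.train.Checkpoint(
    _base_tower_layers_for_heads=detection_model._box_predictor._base_tower_layers_for_heads,
    _box_prediction_head=detection_model._box_predictor._box_prediction_head)
fake_model = tf.train.Checkpoint(
    _feature_extractor=detection_model._feature_extractor,
    _box_predictor=fake_box_predictor)
ckpt = tf.train.Checkpoint(model=fake_model)
ckpt.restore(checkpoint_path).expect_partial()

# Run a dummy image through the model so the restored variables are created.
image, shapes = detection_model.preprocess(tf.zeros([1, 640, 640, 3]))
prediction_dict = detection_model.predict(image, shapes)
_ = detection_model.postprocess(prediction_dict, shapes)

Restoring the wrong pieces here will usually not raise an error, but the model then starts from badly initialized weights, which can show up exactly as a loss that stays huge and barely decreases.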

It is also possible, if you annotated the bounding boxes yourself (option 1, Exercise 2), that the boxes are not annotated well.
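
If you want a quick sanity check on your own annotations, something along these lines can help (the helper name and variables are made up for illustration). The Object Detection API expects ground-truth boxes as [ymin, xmin, ymax, xmax] normalized to [0, 1], so pixel coordinates or a swapped coordinate order will quietly inflate the loss.

import numpy as np

def check_gt_boxes(gt_boxes_np):
    # Hypothetical helper: gt_boxes_np is an array of shape [num_boxes, 4]
    # holding [ymin, xmin, ymax, xmax] for one image.
    boxes = np.asarray(gt_boxes_np, dtype=np.float32)
    if boxes.min() < 0.0 or boxes.max() > 1.0:
        print('Values outside [0, 1]: these look like pixel coordinates; '
              'divide y by the image height and x by the width.')
    if np.any(boxes[:, 2] <= boxes[:, 0]) or np.any(boxes[:, 3] <= boxes[:, 1]):
        print('Some boxes have max <= min: check the coordinate order.')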
