Week2_assignment_Zombie Detection: '_box_predictor' not as expected and training does not work

Problem:
In Exercise 10 (Define the training step), the loss barely decays, as shown below:
Start fine-tuning!
batch 0 of 100, loss=1.8186622
batch 10 of 100, loss=1.7925384
batch 20 of 100, loss=1.7437365
batch 30 of 100, loss=1.6895539
batch 40 of 100, loss=1.636056

I went through my code several times and could only find the difference below
between my result and the expected values for '_box_predictor' in Exercise 6.2.
Could this be the cause of the problem in Exercise 10?
I also read some other posts about this assignment but found no clues. Please suggest what I should check next. Thanks.

My values:
'_box_predictor': <tensorflow.python.checkpoint.checkpoint.Checkpoint at 0x7bed25c5b6d0>,
'_feature_extractor': <object_detection.models.ssd_resnet_v1_fpn_keras_feature_extractor.SSDResNet50V1FpnKerasFeatureExtractor at 0x7bed25e1bbe0>,

Expected:
'_box_predictor': <tensorflow.python.training.tracking.util.Checkpoint at 0x7fefac044a20>,
'_feature_extractor': <object_detection.models.ssd_resnet_v1_fpn_keras_feature_extractor.SSDResNet50V1FpnKerasFeatureExtractor at 0x7fefac0240b8>,

Hi @tazhu

If the comment below doesn't help you resolve your issue, then let me know.

Regards
DP

Hi, DP

Thanks for your reply.

# define the path to the .config file for ssd resnet 50 v1 640x640

For Exercise 5.1, I changed the code

FROM
pipeline_config = 'models/research/object_detection/configs/tf2/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.config'

TO
pipeline_config = '/content/models/research/object_detection/configs/tf2/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.config'

The result for Exercise 10 is more or less the same; the loss still does not decay as expected:
Start fine-tuning!
batch 0 of 300, loss=1.8185756
batch 10 of 300, loss=1.7925313
batch 20 of 300, loss=1.7438704
batch 30 of 300, loss=1.6898314
batch 40 of 300, loss=1.6364615
batch 50 of 300, loss=1.5859942
batch 60 of 300, loss=1.5396957
batch 70 of 300, loss=1.4990684
batch 80 of 300, loss=1.4658161
batch 90 of 300, loss=1.4410862
batch 100 of 300, loss=1.4245821

The result for '_box_predictor' in Exercise 6.2 did not change either:
'_box_predictor': <tensorflow.python.checkpoint.checkpoint.Checkpoint at 0x7bed25c5b6d0>,

I will try to send my code to you in a private message.
Please point me to the next step.

Thanks again

Kindly review your code against the pinned comment.

Also use a search engine; there are many similar threads related to your issue.

Thanks for your response.

It seems there was a critical issue in my code for Exercise 6.3. I changed

FROM

# Define a checkpoint that sets model to the temporary model checkpoint
checkpoint = tf.train.Checkpoint(model=tmp_box_predictor_checkpoint)

TO

# Define a checkpoint that sets model to the temporary model checkpoint
checkpoint = tf.train.Checkpoint(model=tmp_model_checkpoint)

Now I get the result below for the loss. It seems to be working.

Start fine-tuning!
batch 0 of 100, loss=1.16671
batch 10 of 100, loss=26.676056
batch 20 of 100, loss=22.802942
batch 30 of 100, loss=5.4472632
batch 40 of 100, loss=0.33777377
batch 50 of 100, loss=0.10817821
batch 60 of 100, loss=0.03005678
batch 70 of 100, loss=0.004617055
batch 80 of 100, loss=0.00070770754
batch 90 of 100, loss=0.0004739061
Done fine-tuning!
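
For anyone hitting the same symptom, one plausible explanation (not confirmed anywhere in this thread) is that `tf.train.Checkpoint` restores variables by matching the nested attribute names from the root object, so a checkpoint wrapper that skips one level of nesting matches nothing and the weights keep their initial values. The sketch below is a minimal illustration with toy modules. `ToyPredictor`, `ToyModel`, `save_toy_checkpoint`, and `restore_head` are hypothetical names made up for this example and are not the assignment code.

```python
import os
import tempfile

import tensorflow as tf


class ToyPredictor(tf.Module):
    """Hypothetical stand-in for a box predictor with a single weight."""

    def __init__(self, value):
        super().__init__()
        self.head = tf.Variable(value, dtype=tf.float32)


class ToyModel(tf.Module):
    """Hypothetical stand-in for a detection model."""

    def __init__(self, value):
        super().__init__()
        self._box_predictor = ToyPredictor(value)


def save_toy_checkpoint(directory):
    """Save a toy model (weight 1.0) under the top-level key `model`."""
    source = ToyModel(1.0)
    return tf.train.Checkpoint(model=source).save(os.path.join(directory, "ckpt"))


def restore_head(path, mirror_full_hierarchy):
    """Restore into a fresh toy model (weight 0.0) and return the weight."""
    target = ToyModel(0.0)
    predictor_ckpt = tf.train.Checkpoint(head=target._box_predictor.head)
    if mirror_full_hierarchy:
        # model -> _box_predictor -> head: the same nesting that was saved.
        model_ckpt = tf.train.Checkpoint(_box_predictor=predictor_ckpt)
        checkpoint = tf.train.Checkpoint(model=model_ckpt)
    else:
        # model -> head: one level is missing, so no saved path matches.
        checkpoint = tf.train.Checkpoint(model=predictor_ckpt)
    # expect_partial() silences warnings about unmatched values, which is
    # also why a mismatch like this can go unnoticed.
    checkpoint.restore(path).expect_partial()
    return float(target._box_predictor.head.numpy())


with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt_path = save_toy_checkpoint(tmp_dir)
    matched = restore_head(ckpt_path, mirror_full_hierarchy=True)
    skipped = restore_head(ckpt_path, mirror_full_hierarchy=False)

print("full hierarchy restored weight:", matched)
print("missing level restored weight:", skipped)
```

In the toy case the weight is only restored when the wrapper mirrors the full saved hierarchy. As for the differing module path in the '_box_predictor' output (`tensorflow.python.checkpoint.checkpoint` versus `tensorflow.python.training.tracking.util`), that most likely just reflects a newer TensorFlow version, since both are `tf.train.Checkpoint` objects.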

Kindly remove any code or assignment links shared in this post. Sharing them is against community guidelines.