Programming Assignment: Zombie Detector Training not working

Hi, I am having trouble with the Programming Assignment: Zombie Detector.

I am getting the following after training.

Start fine-tuning!
batch 0 of 100, loss=1.827307
batch 10 of 100, loss=1.1855664
batch 20 of 100, loss=0.8876642
batch 30 of 100, loss=0.99491
batch 40 of 100, loss=0.95344615
batch 50 of 100, loss=0.864509
batch 60 of 100, loss=1.0729249
batch 70 of 100, loss=1.0671098
batch 80 of 100, loss=0.96668077
batch 90 of 100, loss=0.8572345
Done fine-tuning!
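
To make the trend in the log above easier to judge, here is a minimal sketch that parses the printed lines and compares the mean loss of the first and second half of the run. It uses only the Python standard library; the names `LOG`, `LINE_RE`, `parse_losses` and `mean` are made up for this illustration and are not part of the assignment.

```python
import re

# The training output pasted above, copied verbatim.
LOG = """\
batch 0 of 100, loss=1.827307
batch 10 of 100, loss=1.1855664
batch 20 of 100, loss=0.8876642
batch 30 of 100, loss=0.99491
batch 40 of 100, loss=0.95344615
batch 50 of 100, loss=0.864509
batch 60 of 100, loss=1.0729249
batch 70 of 100, loss=1.0671098
batch 80 of 100, loss=0.96668077
batch 90 of 100, loss=0.8572345
"""

# Matches lines such as "batch 10 of 100, loss=1.1855664".
LINE_RE = re.compile(r"batch (\d+) of (\d+), loss=([0-9.eE+-]+)")


def parse_losses(log_text):
    """Return a list of (batch_index, loss) pairs found in the log text."""
    return [(int(m.group(1)), float(m.group(3))) for m in LINE_RE.finditer(log_text)]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


pairs = parse_losses(LOG)
losses = [loss for _, loss in pairs]
half = len(losses) // 2

print("first-half mean loss: ", round(mean(losses[:half]), 4))
print("second-half mean loss:", round(mean(losses[half:]), 4))
print("lowest loss seen:     ", min(losses))
```

In this log the loss drops over the first 20 batches and then stays between roughly 0.86 and 1.07, which is why the run looks stalled to me.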

I tried verifying that I was loading my checkpoints correctly, and it seems that I am. I am not sure what else to check. How do I get help with this?

I am also intermittently getting ValueErrors like the one below, but I assume it is some kind of Colab issue, because I can rerun the cells and they work.

ValueError: Received incompatible tensor with shape (1, 1, 2048, 512) when attempting to restore variable with shape (3, 3, 256, 256) and name conv4_block5_2_conv/kernel:0.

Are you passing the assignment, or are you stuck? This is a complex assignment. Can you check the related posts? There are others on this page of the forum.

I am stuck. I have looked through other posts. The only thing I have really found is that I am most likely not loading the checkpoint correctly, but there is not much help on how to load it correctly.

The assignment has no checks that the checkpoint is actually loaded. I am following this notebook to try to figure out what is wrong, but everything appears correct: models/research/object_detection/colab_tutorials/eager_few_shot_od_training_tf2_colab.ipynb at master · tensorflow/models · GitHub

I believe that I am not loading the checkpoint correctly, because I keep getting errors in section 6.3. I don’t know how to figure out where the issue is, though, since everything matches the notebook referenced above.

Can you share the error image you are getting?

I believe I have fixed the problem. I think my issue was that the paths to the model checkpoints were not set correctly.
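
For anyone who hits the same thing, here is a minimal sketch of the kind of path check that would have caught it sooner. It assumes the usual TensorFlow checkpoint layout, where a checkpoint prefix such as `ckpt-0` is stored as a `ckpt-0.index` file plus one or more `ckpt-0.data-*` shard files. The function name `checkpoint_prefix_exists` and the prefix used in the demo are hypothetical and do not come from the assignment.

```python
import glob
import os
import tempfile


def checkpoint_prefix_exists(prefix):
    """Return True if the .index file and at least one .data-* shard exist for prefix."""
    has_index = os.path.isfile(prefix + ".index")
    # glob.escape guards against special characters in the directory name.
    has_data = len(glob.glob(glob.escape(prefix) + ".data-*")) > 0
    return has_index and has_data


# Demo in a throwaway directory, so no real checkpoint is needed.
with tempfile.TemporaryDirectory() as tmp:
    prefix = os.path.join(tmp, "ckpt-0")  # hypothetical checkpoint prefix
    for suffix in (".index", ".data-00000-of-00001"):
        open(prefix + suffix, "w").close()  # create empty placeholder files

    print(checkpoint_prefix_exists(prefix))                        # prefix that exists
    print(checkpoint_prefix_exists(os.path.join(tmp, "ckpt-1")))   # prefix that does not
```

Calling this on the checkpoint path right before the restore step only confirms that the files are where the path says they are. It does not prove that the weights were actually loaded into the model.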
