Huge underfitting on week 2 assignment

Dear colleagues,

I am seeing severe underfitting when training the zombie detection model. Here is my loss decay with a learning rate of 0.01:

Start fine-tuning!
batch 0 of 100, loss=1.8433523
batch 10 of 100, loss=1.8154047
batch 20 of 100, loss=1.7628807
batch 30 of 100, loss=1.7041743
batch 40 of 100, loss=1.646369
batch 50 of 100, loss=1.5922079
batch 60 of 100, loss=1.5430906
batch 70 of 100, loss=1.5004694
batch 80 of 100, loss=1.4661503
batch 90 of 100, loss=1.4412313
Done fine-tuning!

By drastically increasing the learning rate to 1 and the number of batches to around 30k, I could get losses around 0.005, but that was still not enough to pass the assignment. Then my available time on Colab ran out. I am really stuck now. Can anyone please help me?
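For intuition on how the learning rate drives that slow loss decay, here is a minimal gradient-descent sketch on a toy quadratic. This is plain Python, not the assignment's TensorFlow model; the function and the rates are made up purely for illustration:

```python
def gradient_descent(lr, steps, w0=0.0):
    """Minimize f(w) = (w - 3)**2 with plain gradient descent."""
    w = w0
    for _ in range(steps):
        grad = 2 * (w - 3)   # df/dw
        w = w - lr * grad
    return w

# A tiny learning rate barely moves the weight (loss decays very slowly,
# which looks like underfitting):
w_small = gradient_descent(lr=0.001, steps=100)
# A moderate learning rate converges to the minimum at w = 3:
w_good = gradient_descent(lr=0.1, steps=100)
# A learning rate >= 1 overshoots: each step flips w past the minimum.
w_big = gradient_descent(lr=1.1, steps=100)
```

With the update w ← w − lr·2(w−3), the distance to the minimum is multiplied by |1 − 2·lr| each step, so any lr ≥ 1 makes the iterate oscillate or diverge instead of settling. That is one reason a learning rate of 1 is too aggressive even if the printed loss briefly looks small.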


This is a complex assignment with many steps; most likely something in an earlier step is not set up as it is supposed to be. Search the forum first to see if you can find anything helpful about this assignment. Otherwise, check all the previous steps carefully.

@gent.spah @Luis_Filipe

I am encountering the exact same problem. It seems the algorithm does not learn when the learning rate is set to a reasonable value, as suggested by the assignment. If I run the experiment of increasing it dramatically to 1, I make the same observations as @Luis_Filipe .

I have gone through the assignment from top to bottom for weeks and I simply cannot figure out where the error, if any, could be. Did you spot it @Luis_Filipe ?

A learning rate of 1 is actually too big. I would suggest paying special attention to the restoration of the model checkpoint and the related settings: how the checkpoint is restored and how the output layers are isolated.
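To illustrate why the checkpoint-restoration step matters, here is a framework-free sketch of the idea of a partial restore. Plain dictionaries stand in for the real tf.train.Checkpoint objects, and all layer names and values are made up for illustration:

```python
# A pre-trained checkpoint: weights for every layer of the saved model.
checkpoint = {
    "feature_extractor": [0.5, -0.2, 0.9],
    "box_head": [0.1, 0.4],
    "class_head": [0.7, -0.3],  # trained for the *original* classes
}

# Freshly built model: every layer starts at a new initialization.
model = {
    "feature_extractor": [0.0, 0.0, 0.0],
    "box_head": [0.0, 0.0],
    "class_head": [9.9, 9.9],  # new head for the zombie class
}

# Partial restore: copy only the layers we want to reuse, and leave the
# new classification head at its fresh initialization so it can be trained.
layers_to_restore = ["feature_extractor", "box_head"]
for name in layers_to_restore:
    model[name] = checkpoint[name]
```

Restoring the wrong subset of layers (for example overwriting the new head, or forgetting to restore the feature extractor) leaves the network in a state where the loss barely moves, which matches the symptoms reported above.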

I have also seen some learners have problems with the train_step_fn() function due to not following the instructions closely.
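For reference, a train step typically follows the same four-stage pattern in any framework: forward pass, loss, gradient, update. Here is that skeleton in plain Python with a hand-coded gradient for a 1-D linear model; the assignment's train_step_fn uses tf.GradientTape on the detection model, so this is only a structural sketch with made-up data:

```python
def make_train_step_fn(lr):
    """Return a train-step closure: forward pass, loss, gradient, update."""
    def train_step_fn(w, x, y):
        # 1) Forward pass: prediction of a 1-D linear model y_hat = w * x.
        y_hat = [w * xi for xi in x]
        # 2) Loss: mean squared error.
        loss = sum((p - t) ** 2 for p, t in zip(y_hat, y)) / len(x)
        # 3) Gradient of the loss w.r.t. w (computed by hand here;
        #    tf.GradientTape automates this step in the assignment).
        grad = sum(2 * (p - t) * xi for p, t, xi in zip(y_hat, y, x)) / len(x)
        # 4) Apply the update.
        w = w - lr * grad
        return w, loss
    return train_step_fn

# Toy data where the true weight is 2.0.
x, y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
step = make_train_step_fn(lr=0.05)
w = 0.0
for _ in range(200):
    w, loss = step(w, x, y)
```

Skipping or reordering any of these stages, or taking the gradient with respect to the wrong variables, tends to produce exactly the flat, barely decreasing loss described in this thread.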

I would suggest having a look at these.

Have you fixed the problem?

I had similar symptoms, with the loss at about 1.3 to 1.8 and not converging. It turned out to be a misspelling.

I got it working by starting over and re-entering all the exercise code, this time copy-pasting (e.g. from the lecture notes and hints) wherever possible to avoid misspelling any long variable names.

Then I used a Colab diff between the bad version and the good version to understand what had happened, and sure enough I had a misspelling:

Good (from the lecture notes): (screenshot of the corrected line, not preserved here)
I made the edit, ran Runtime -> Run All, and the loss converged as expected.