Some questions about the Week 2 assignment

Hi,
I have some questions about this assignment.
(1) For the training result, I got loss=0.0005388551 in the first 100 batches, but when I ran the test in the Exercise, I got False for all of [0, 5, 10]. I trained for one more run (an extra 100 batches) and got loss=0.00035108725, but still all False. Since the loss is small, I think it is probably overfitting. Do we need to augment the data or improve accuracy? Yet when I submitted my result, it passed. I wonder why that is.
(2) All 5 training cases contain a zombie, so the classifier labels are all 1. How does the model learn to classify a 0 (no-zombie) case?
(3) The classifier's binary cross-entropy loss and the box regression MSE loss are different measurements. How do we know they are on the same scale? Should weights be added to balance them?
(4) If I want to save the model and its weights, how can I do it? I tried model.save() and model.save_weights(), but it returns "'SSDMetaArch' object has no attribute 'save_weights'". Even model.summary() didn't work.
(5) In the "Make a Prediction" section, the output of preprocess() has to be combined with tf.concat, but in the Exercise code it can be fed directly into predict(). Why?


Hello,

You should ponder these questions yourself too, and it may help to go through the week's materials once again, but here is a brief response to each of your points.

  1. If you passed the assignment test, you are meeting the passing criteria. If the classifier occasionally fails to recognize a test image, that just means it is not 100% accurate. It might be overfitting; you can check the loss and accuracy on a validation/test set to confirm.

  2. The classifier learns the features that are present when a zombie is in the image; when there is no zombie, those features are absent, so it cannot find them and therefore outputs a negative.

  3. As far as I remember, classification and box regression are two different branches of the network, so they are two different outputs. Each is scaled by its respective loss function inside the training loop, so there is no need to worry further about it.
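     Conceptually, balancing the two losses does come down to a weighted sum. Here is a minimal pure-Python sketch (the function names and weights are illustrative, not the Object Detection API's internals, which apply its own configured loss weights):

     ```python
     import math

     def binary_cross_entropy(y_true, y_pred, eps=1e-7):
         """Mean binary cross-entropy over a batch of scalar predictions."""
         total = 0.0
         for t, p in zip(y_true, y_pred):
             p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
             total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
         return total / len(y_true)

     def mse(y_true, y_pred):
         """Mean squared error over flattened box coordinates."""
         return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

     def total_loss(cls_true, cls_pred, box_true, box_pred,
                    cls_weight=1.0, box_weight=1.0):
         """Weighted sum of the classification and localization losses."""
         return (cls_weight * binary_cross_entropy(cls_true, cls_pred)
                 + box_weight * mse(box_true, box_pred))
     ```

     If one loss dominates during training, you can raise the weight of the other; the framework exposes equivalent knobs in its loss configuration.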

  4. Look up the SSDMetaArch model on GitHub and find the relevant save methods/functions from its source.
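     Since SSDMetaArch does not expose the Keras save_weights() API, one approach that works with any trackable TensorFlow object is tf.train.Checkpoint, which is also how the Object Detection API itself handles checkpoints. A self-contained sketch (a tiny tf.Module stands in for your detection_model so the snippet runs on its own):

     ```python
     import os
     import tempfile
     import tensorflow as tf

     # Stand-in for the SSDMetaArch instance you built in the notebook;
     # a tf.Module with one variable plays that role here.
     class Dummy(tf.Module):
         def __init__(self):
             self.w = tf.Variable(1.0)

     detection_model = Dummy()

     # tf.train.Checkpoint tracks the object's variables without needing
     # the Keras save_weights() method.
     ckpt = tf.train.Checkpoint(model=detection_model)
     save_path = ckpt.save(os.path.join(tempfile.mkdtemp(), 'ckpt'))

     # Change the weight, then restore the saved value.
     detection_model.w.assign(5.0)
     ckpt.restore(save_path)
     print(float(detection_model.w))  # back to 1.0
     ```

     To reload later, rebuild the model the same way, wrap it in a new tf.train.Checkpoint, and call restore() with the saved path.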

  5. It is probably because the model you are building on already has a preprocessing function that also does the concat internally.
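     To see why the concat is needed at all: preprocessing one image yields a tensor with a singleton batch dimension, and tf.concat merges several of those into one batch for predict(). A runnable sketch with a hypothetical stand-in for the model's preprocess() (the real one also returns the true image shapes):

     ```python
     import tensorflow as tf

     # Hypothetical stand-in for model.preprocess(): normalize one image
     # and return a batched tensor of shape (1, H, W, 3).
     def preprocess(image):
         image = tf.cast(image, tf.float32) / 255.0
         return tf.expand_dims(image, axis=0)

     images = [tf.zeros((32, 32, 3), dtype=tf.uint8) for _ in range(4)]
     preprocessed = [preprocess(im) for im in images]

     # tf.concat merges the singleton batch dimensions into one batch of 4,
     # which is the shape predict() expects for several images at once.
     batch = tf.concat(preprocessed, axis=0)
     print(batch.shape)  # (4, 32, 32, 3)
     ```

     If the Exercise's preprocess() already returns a full batch, feeding its output straight into predict() works without the extra concat.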