In the Week 1 “Image Classification and Object Localization” lab, the code as given does not find bounding boxes.
Below is some output from the model training. As you can see, classification_accuracy improves until Epoch 8, when val_classification_accuracy collapses back to about 10% (the training accuracy follows in Epoch 9). The training bounding_box_mse INCREASES at every epoch except the last.
If I look at the actual bounding_box values the model predicts, they are all large negative numbers, e.g.
[-28.9, -36.1, -25.4, -28.6]
whereas real bounding boxes have values between 0 and 1, e.g.
[0.53, 0.67, 0.72, 0.84]
I wondered whether the problem was that the bounding box output layer needs a sigmoid activation to keep its values between 0 and 1, but changing it to
bounding_box_regression_output = tf.keras.layers.Dense(units = 4, activation='sigmoid', name = 'bounding_box')(inputs)
means that the model now sets all the bounding boxes to [1, 1, 1, 1].
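To see why an output of exactly [1, 1, 1, 1] looks suspicious to me, here is a small sketch of how sigmoid behaves at extreme inputs. It is plain Python rather than the lab's code; the `sigmoid` and `sigmoid_grad` helpers are my own, and the inputs 0.0 and 30.0 are purely illustrative (only -28.9 comes from the prediction quoted above).

```python
import math

def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)

# -28.9 is one of the raw values quoted above; 0.0 and 30.0 are illustrative.
for x in (-28.9, 0.0, 30.0):
    print(f"x={x:6.1f}  sigmoid={sigmoid(x):.6f}  grad={sigmoid_grad(x):.2e}")
```

A sigmoid output that rounds to 1 means the input to the activation is large and positive, and the gradient there is almost zero. So my guess is that the layer is still producing values far from zero, just with the opposite sign, and once it saturates it has little chance of recovering.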
There is another minor error: the code as originally written had units='4', but units must be an integer, not a string, so the code didn't run until I changed it. So maybe this is due to a TensorFlow version change, which has changed units from str to int and has also changed the default value of something that now needs to be stated explicitly?
Epoch 1/10
937/937 ━━━━━━━━━━━━━━━━━━━━ 281s 295ms/step - bounding_box_loss: 2.6578 - bounding_box_mse: 11.3629 - classification_accuracy: 0.1051 - classification_loss: 0.0900 - loss: 2.7478 - val_bounding_box_loss: 2.5046 - val_bounding_box_mse: 216.6360 - val_classification_accuracy: 0.1833 - val_classification_loss: 0.0868 - val_loss: 2.5914
Epoch 2/10
937/937 ━━━━━━━━━━━━━━━━━━━━ 272s 290ms/step - bounding_box_loss: 2.5251 - bounding_box_mse: 295.2394 - classification_accuracy: 0.2509 - classification_loss: 0.0819 - loss: 2.6070 - val_bounding_box_loss: 2.5351 - val_bounding_box_mse: 666.4528 - val_classification_accuracy: 0.4712 - val_classification_loss: 0.0639 - val_loss: 2.5989
Epoch 3/10
937/937 ━━━━━━━━━━━━━━━━━━━━ 271s 290ms/step - bounding_box_loss: 2.5235 - bounding_box_mse: 907.3889 - classification_accuracy: 0.5426 - classification_loss: 0.0583 - loss: 2.5818 - val_bounding_box_loss: 2.5287 - val_bounding_box_mse: 2344.5562 - val_classification_accuracy: 0.6423 - val_classification_loss: 0.0472 - val_loss: 2.5760
Epoch 4/10
937/937 ━━━━━━━━━━━━━━━━━━━━ 281s 300ms/step - bounding_box_loss: 2.5311 - bounding_box_mse: 2069.8621 - classification_accuracy: 0.7006 - classification_loss: 0.0411 - loss: 2.5722 - val_bounding_box_loss: 2.5203 - val_bounding_box_mse: 2233.9570 - val_classification_accuracy: 0.8561 - val_classification_loss: 0.0221 - val_loss: 2.5424
Epoch 5/10
937/937 ━━━━━━━━━━━━━━━━━━━━ 270s 288ms/step - bounding_box_loss: 2.5198 - bounding_box_mse: 2399.3384 - classification_accuracy: 0.8457 - classification_loss: 0.0230 - loss: 2.5429 - val_bounding_box_loss: 2.5179 - val_bounding_box_mse: 2855.4448 - val_classification_accuracy: 0.9006 - val_classification_loss: 0.0153 - val_loss: 2.5332
Epoch 6/10
937/937 ━━━━━━━━━━━━━━━━━━━━ 271s 289ms/step - bounding_box_loss: 2.5241 - bounding_box_mse: 2879.4514 - classification_accuracy: 0.8893 - classification_loss: 0.0168 - loss: 2.5409 - val_bounding_box_loss: 2.5349 - val_bounding_box_mse: 3007.2397 - val_classification_accuracy: 0.9252 - val_classification_loss: 0.0116 - val_loss: 2.5465
Epoch 7/10
937/937 ━━━━━━━━━━━━━━━━━━━━ 271s 289ms/step - bounding_box_loss: 2.5245 - bounding_box_mse: 2894.3176 - classification_accuracy: 0.9136 - classification_loss: 0.0131 - loss: 2.5377 - val_bounding_box_loss: 2.5332 - val_bounding_box_mse: 3362.8108 - val_classification_accuracy: 0.9301 - val_classification_loss: 0.0106 - val_loss: 2.5438
Epoch 8/10
937/937 ━━━━━━━━━━━━━━━━━━━━ 273s 292ms/step - bounding_box_loss: 2.5291 - bounding_box_mse: 3301.3079 - classification_accuracy: 0.9193 - classification_loss: 0.0123 - loss: 2.5414 - val_bounding_box_loss: 2.6231 - val_bounding_box_mse: 22641.6094 - val_classification_accuracy: 0.1009 - val_classification_loss: 0.1798 - val_loss: 2.8030
Epoch 9/10
937/937 ━━━━━━━━━━━━━━━━━━━━ 281s 300ms/step - bounding_box_loss: 2.5757 - bounding_box_mse: 17880.8242 - classification_accuracy: 0.1003 - classification_loss: 0.1799 - loss: 2.7557 - val_bounding_box_loss: 2.5235 - val_bounding_box_mse: 13066.3057 - val_classification_accuracy: 0.1009 - val_classification_loss: 0.1798 - val_loss: 2.7033
Epoch 10/10
937/937 ━━━━━━━━━━━━━━━━━━━━ 277s 296ms/step - bounding_box_loss: 2.5259 - bounding_box_mse: 12436.9219 - classification_accuracy: 0.0983 - classification_loss: 0.1803 - loss: 2.7063 - val_bounding_box_loss: 2.5161 - val_bounding_box_mse: 13073.5303 - val_classification_accuracy: 0.1009 - val_classification_loss: 0.1798 - val_loss: 2.6959
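One thing I noticed while checking the numbers in this log, offered as a guess rather than a diagnosis: the classification_loss values look like what a mean squared error would give on a 10-class one-hot target, not what a cross-entropy would give. The sketch below is plain Python, not the lab's code. The helper names are mine, and it assumes 10 classes and a model that either predicts a uniform distribution (consistent with roughly 10% accuracy in Epoch 1) or puts all its probability on one class (consistent with the collapse from Epoch 8 on).

```python
import math

NUM_CLASSES = 10  # assumption: ten-class classification head

def mse_uniform_prediction(num_classes: int) -> float:
    """MSE between a one-hot target and a uniform softmax output."""
    p = 1.0 / num_classes
    # One entry is off by (1 - p); the other (num_classes - 1) are off by p.
    return ((1.0 - p) ** 2 + (num_classes - 1) * p ** 2) / num_classes

def cross_entropy_uniform_prediction(num_classes: int) -> float:
    """Categorical cross-entropy for the same uniform output."""
    return math.log(num_classes)

def mse_single_class_prediction(num_classes: int, accuracy: float) -> float:
    """MSE when the model always outputs the same one-hot vector.

    Correct samples contribute 0; wrong ones have two entries off by 1.
    """
    return (1.0 - accuracy) * 2.0 / num_classes

print(f"MSE, uniform output:        {mse_uniform_prediction(NUM_CLASSES):.4f}")
print(f"Cross-entropy, uniform:     {cross_entropy_uniform_prediction(NUM_CLASSES):.4f}")
print(f"MSE, one class, acc=0.1009: {mse_single_class_prediction(NUM_CLASSES, 0.1009):.4f}")
```

These come out at about 0.0900, 2.3026 and 0.1798. The log shows a classification_loss of 0.0900 in Epoch 1 and a val_classification_loss of 0.1798 from Epoch 8 onward, so the classification output looks as if it is being scored with MSE. If so, the two losses may be attached to the wrong outputs, which would fit the idea that a TensorFlow version change altered some default, but I have not confirmed this against the lab's compile call.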