The YOLO loss function is complex. It combines losses derived from the bounding box coordinates, the classification, and the objectness confidence (object vs. no-object per grid cell). All those components get weighted and summed into a single loss value used by the optimizer, which unfortunately is the only value training iterations report by default.
In order to better understand what was contributing to that loss value, I used TensorBoard to collect and report on the various components. First, here are the results:
The classification loss is 0! OK, I only have 1 type of object labelled in this data (cars). Softmax can’t screw that up even if it wanted to.
predicted_class_probs = K.softmax(predicted[..., 5:]) #predicted class(es)
truth_class_probs = K.softmax(truth[0:num_images,:,:,:,5:]) #GT class(es)
classification_loss = classification_weights * K.square(truth_class_probs - predicted_class_probs) #vectorized
classification_loss_sum = K.sum(classification_loss) #single val for this training batch
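That zero is a mathematical certainty, not luck. A quick NumPy sketch (a stand-in for the K.softmax/K.square calls above, not the actual training code) shows that softmax over a length-1 class axis is always 1.0, so truth and prediction can never disagree:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# one class -> the class axis has length 1, so softmax is e/e = 1.0 everywhere
pred_logits = np.random.randn(2, 3, 3, 1)   # batch, grid, grid, 1 class (cars)
truth_logits = np.zeros((2, 3, 3, 1))

pred_probs = softmax(pred_logits)
truth_probs = softmax(truth_logits)

classification_loss = np.square(truth_probs - pred_probs).sum()
print(classification_loss)  # 0.0
```

With two or more classes the logits would differ after softmax and this term would start contributing.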
Objects Confidence Loss
No Objects Confidence Loss
Total Confidence Loss
no_objects_loss = no_object_weights * K.square(sigmoid0 - predicted_presence) #target confidence 0 where no object exists
objects_loss = has_object_weights * K.square(sigmoid1 - predicted_presence) #target confidence 1 where an object exists
confidence_loss = objects_loss + no_objects_loss #vectorized
confidence_loss_sum = K.sum(confidence_loss) #single val for this training batch
</confidence_loss>
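To get a feel for why the no-objects term tends to dominate this sum, here is a toy NumPy version of the same computation. The 0/1 targets stand in for sigmoid0/sigmoid1 above, and the 0.5 no-object weight echoes the YOLO paper's λ_noobj; the grid size and predicted confidence are made up for illustration:

```python
import numpy as np

# toy 4x4 grid, one anchor: only one cell actually contains an object
object_mask = np.zeros((4, 4))
object_mask[1, 2] = 1.0
predicted_presence = np.full((4, 4), 0.3)      # model's confidence per cell

no_object_weights = 0.5 * (1.0 - object_mask)  # penalize only the 15 empty cells
has_object_weights = 1.0 * object_mask         # penalize only the object cell

no_objects_loss = no_object_weights * np.square(0.0 - predicted_presence)
objects_loss = has_object_weights * np.square(1.0 - predicted_presence)

confidence_loss_sum = (objects_loss + no_objects_loss).sum()
```

Even down-weighted by 0.5, the 15 empty cells together (0.675) outweigh the single object cell (0.49), which is why the no-objects curve moves so visibly during training.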
Coordinates Loss
coordinates_loss = coordinates_weights * K.square(truth_boxes - predicted_boxes) #vectorized
coordinates_loss_sum = K.sum(coordinates_loss) #single val for this training batch
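The same masking idea applies here: coordinates_weights should zero out cells with no object, so only the responsible cell's (x, y, w, h) error counts. A hedged NumPy sketch (the 5.0 scale mirrors the YOLO paper's λ_coord; the grid, boxes, and mask are invented for illustration):

```python
import numpy as np

# toy 2x2 grid with (x, y, w, h) per cell; only one cell owns an object
truth_boxes = np.zeros((2, 2, 4))
truth_boxes[0, 1] = [0.5, 0.5, 0.2, 0.3]   # ground-truth box for that cell
predicted_boxes = np.full((2, 2, 4), 0.4)  # model's raw box predictions

object_mask = np.zeros((2, 2, 1))
object_mask[0, 1] = 1.0
coordinates_weights = 5.0 * object_mask    # only the object cell contributes

coordinates_loss = coordinates_weights * np.square(truth_boxes - predicted_boxes)
coordinates_loss_sum = coordinates_loss.sum()
```

If this mask were missing (or broken), every empty cell's meaningless box prediction would flood the sum, which is one thing worth checking when the coordinates curve refuses to move.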
Total loss
total_loss = 0.5 * (confidence_loss_sum + classification_loss_sum + coordinates_loss_sum) #total loss per batch. TF rolls this up per epoch automagically
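The rollup itself is trivial arithmetic, which is exactly the problem: three very different signals collapse into one number. With hypothetical per-batch sums (these values are made up purely to show the blending):

```python
# hypothetical per-batch component sums, just to illustrate the rollup
confidence_loss_sum = 1.25
classification_loss_sum = 0.0   # single-class dataset, always zero here
coordinates_loss_sum = 2.25

total_loss = 0.5 * (confidence_loss_sum + classification_loss_sum + coordinates_loss_sum)
print(total_loss)  # 1.75
```

A falling total can hide a flat coordinates term behind an improving confidence term, which is the motivation for logging each component separately below.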
The TensorBoard code to produce these is fairly straightforward.
%load_ext tensorboard
import datetime
import tensorflow as tf

#TensorBoard housekeeping
log_dir = './logs/' + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
file_writer = tf.summary.create_file_writer(log_dir + '/metrics')
file_writer.set_as_default()
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
#train model
history = model.fit(x_train,
                    y_train,
                    batch_size=TRAINING_BATCH_SIZE,
                    epochs=20,
                    callbacks=[CustomTrainingCallbacks(),
                               tensorboard_callback])
…
#inside the loss function
tf.summary.scalar('confidence_loss_sum', data=confidence_loss_sum, step=self.step)
tf.summary.scalar('classification_loss_sum', data=classification_loss_sum, step=self.step)
tf.summary.scalar('coordinates_loss_sum', data=coordinates_loss_sum, step=self.step)
tf.summary.scalar('no_objects_loss_sum', data=no_objects_loss_sum, step=self.step)
tf.summary.scalar('objects_loss_sum', data=objects_loss_sum, step=self.step)
I’ll use this visualization to examine why confidence and no_objects seem to be responding to training while coordinates and objects loss are not. Is there a bug in the loss function? Can I influence it with different weights or hyperparameters? Does it behave differently on different training sets? And so on.