The YOLO loss function is complex. It combines losses derived from the bounding box coordinates, the classification, and the objectness confidence (object vs. no-object per grid cell). All those components get weighted and summed into a single loss value used by the optimizer, which unfortunately is the only value training iterations report by default.
In order to better understand what was contributing to that loss value, I used TensorBoard to collect and report on the various components. First, here are the results:
The classification loss is 0! OK, I only have 1 type of object labelled in this data (cars). Softmax can’t screw that up even if it wanted to.
predicted_class_probs = K.softmax(predicted[..., 5:]) #predicted class(es)
truth_class_probs = K.softmax(truth[0:num_images,:,:,:,5:]) #GT class(es)
classification_loss = classification_weights * K.square(truth_class_probs - predicted_class_probs) #vectorized
classification_loss_sum = K.sum(classification_loss) #single val for this training batch
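That zero is a mathematical certainty, not luck. A quick NumPy sketch (a stand-in for the K.softmax/K.square calls above, not the actual training code) shows that softmax over a length-1 class axis is always 1.0, so truth and prediction can never disagree:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# one class -> the class axis has length 1, so softmax is e/e = 1.0 everywhere
pred_logits = np.random.randn(2, 3, 3, 1)   # batch, grid, grid, 1 class (cars)
truth_logits = np.zeros((2, 3, 3, 1))

pred_probs = softmax(pred_logits)
truth_probs = softmax(truth_logits)

classification_loss = np.square(truth_probs - pred_probs).sum()
print(classification_loss)  # 0.0
```

With two or more classes the logits would differ after softmax and this term would start contributing.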
Objects Confidence Loss
No Objects Confidence Loss
Total Confidence Loss
no_objects_loss = no_object_weights * K.square(sigmoid0 - predicted_presence) #target confidence 0 where no object exists
objects_loss = has_object_weights * K.square(sigmoid1 - predicted_presence) #target confidence 1 where an object exists
confidence_loss = objects_loss + no_objects_loss #vectorized
confidence_loss_sum = K.sum(confidence_loss) #single val for this training batch
</confidence_loss>
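To get a feel for why the no-objects term tends to dominate this sum, here is a toy NumPy version of the same computation. The 0/1 targets stand in for sigmoid0/sigmoid1 above, and the 0.5 no-object weight echoes the YOLO paper's λ_noobj; the grid size and predicted confidence are made up for illustration:

```python
import numpy as np

# toy 4x4 grid, one anchor: only one cell actually contains an object
object_mask = np.zeros((4, 4))
object_mask[1, 2] = 1.0
predicted_presence = np.full((4, 4), 0.3)      # model's confidence per cell

no_object_weights = 0.5 * (1.0 - object_mask)  # penalize only the 15 empty cells
has_object_weights = 1.0 * object_mask         # penalize only the object cell

no_objects_loss = no_object_weights * np.square(0.0 - predicted_presence)
objects_loss = has_object_weights * np.square(1.0 - predicted_presence)

confidence_loss_sum = (objects_loss + no_objects_loss).sum()
```

Even down-weighted by 0.5, the 15 empty cells together (0.675) outweigh the single object cell (0.49), which is why the no-objects curve moves so visibly during training.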
Coordinates Loss
coordinates_loss = coordinates_weights * K.square(truth_boxes - predicted_boxes) #vectorized
coordinates_loss_sum = K.sum(coordinates_loss) #single val for this training batch
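The same masking idea applies here: coordinates_weights should zero out cells with no object, so only the responsible cell's (x, y, w, h) error counts. A hedged NumPy sketch (the 5.0 scale mirrors the YOLO paper's λ_coord; the grid, boxes, and mask are invented for illustration):

```python
import numpy as np

# toy 2x2 grid with (x, y, w, h) per cell; only one cell owns an object
truth_boxes = np.zeros((2, 2, 4))
truth_boxes[0, 1] = [0.5, 0.5, 0.2, 0.3]   # ground-truth box for that cell
predicted_boxes = np.full((2, 2, 4), 0.4)  # model's raw box predictions

object_mask = np.zeros((2, 2, 1))
object_mask[0, 1] = 1.0
coordinates_weights = 5.0 * object_mask    # only the object cell contributes

coordinates_loss = coordinates_weights * np.square(truth_boxes - predicted_boxes)
coordinates_loss_sum = coordinates_loss.sum()
```

If this mask were missing (or broken), every empty cell's meaningless box prediction would flood the sum, which is one thing worth checking when the coordinates curve refuses to move.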
Total loss
total_loss = 0.5 * (confidence_loss_sum + classification_loss_sum + coordinates_loss_sum) #total loss per batch. TF rolls this up per epoch automagically
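The rollup itself is trivial arithmetic, which is exactly the problem: three very different signals collapse into one number. With hypothetical per-batch sums (these values are made up purely to show the blending):

```python
# hypothetical per-batch component sums, just to illustrate the rollup
confidence_loss_sum = 1.25
classification_loss_sum = 0.0   # single-class dataset, always zero here
coordinates_loss_sum = 2.25

total_loss = 0.5 * (confidence_loss_sum + classification_loss_sum + coordinates_loss_sum)
print(total_loss)  # 1.75
```

A falling total can hide a flat coordinates term behind an improving confidence term, which is the motivation for logging each component separately below.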
The TensorBoard code to produce these is fairly straightforward.
%load_ext tensorboard
import datetime
import tensorflow as tf

#TensorBoard housekeeping
log_dir = './logs/' + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
file_writer = tf.summary.create_file_writer(log_dir + '/metrics')
file_writer.set_as_default()
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
#train model
history = model.fit(x_train,
                    y_train,
                    batch_size=TRAINING_BATCH_SIZE,
                    epochs=20,
                    callbacks=[CustomTrainingCallbacks(),
                               tensorboard_callback])
…
#inside the loss function
tf.summary.scalar('confidence_loss_sum', data=confidence_loss_sum, step=self.step)
tf.summary.scalar('classification_loss_sum', data=classification_loss_sum, step=self.step)
tf.summary.scalar('coordinates_loss_sum', data=coordinates_loss_sum, step=self.step)
tf.summary.scalar('no_objects_loss_sum', data=no_objects_loss_sum, step=self.step)
tf.summary.scalar('objects_loss_sum', data=objects_loss_sum, step=self.step)
I’ll use this visualization to examine why confidence and no_objects seem to be responding to training while coordinates and objects loss are not. Is there a bug in the loss function? Can I influence it with different weights or hyperparameters? Does it behave differently on different training sets? And so on.