I’ve spent the last week building out my own implementation of a YOLO v2 CNN to run against the Berkeley Driving Dataset I’ve mentioned in other threads. I’ll share some lessons learned in a bit. It seems to be running now, and doing something, though exactly what I’m not sure yet. I instrumented the code with TensorBoard, and it’s producing this chart of loss per epoch. The numbers feel really large, but I can’t find anything to compare against. Have you trained your own from scratch, that is, not starting with pre-trained weights of any sort? What did you see? Thoughts and suggestions welcome.
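The instrumentation itself is nothing fancy, just a scalar written per epoch, roughly along these lines (a minimal sketch with placeholder loss values, not my actual training loop):

```python
import tensorflow as tf

# Minimal sketch of logging a per-epoch loss scalar to TensorBoard.
# The loss values below are placeholders, not results from my run.
writer = tf.summary.create_file_writer("logs/yolo_v2_scratch")

placeholder_epoch_losses = [5200.0, 3100.0, 2400.0]  # stand-ins for real epoch losses
for epoch, loss in enumerate(placeholder_epoch_losses):
    with writer.as_default():
        tf.summary.scalar("loss/epoch", loss, step=epoch)
```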
Happy with the general shape of the curve, but the last ~15 epochs are pretty flat.
What I realized after looking at this for a few days is that the loss numbers for any CNN training are going to be proportional to the number of predictions being made. My network outputs 19*19*8*6 = 17,328 values for each image. With 72 images in a batch, that’s about 1.25 million predictions. It doesn’t take much error on each one to produce some nominally large losses.
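To put rough numbers on that, here’s a quick back-of-envelope sketch; the 0.1 average error is just an illustrative assumption, not something I measured:

```python
# Back-of-envelope: how a tiny per-prediction error becomes a big batch loss.
# Grid/anchor/value counts match my network; the 0.1 error is an assumption.
grid_h, grid_w, anchors, vals_per_anchor = 19, 19, 8, 6
batch_size = 72

preds_per_image = grid_h * grid_w * anchors * vals_per_anchor   # 17,328
preds_per_batch = preds_per_image * batch_size                  # 1,247,616 (~1.25M)

avg_error = 0.1                              # hypothetical per-prediction error
sum_squared_error = preds_per_batch * avg_error ** 2

print(preds_per_image, preds_per_batch, sum_squared_error)      # 17328 1247616 ~12476
```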
sometimes things just go very, very wrong…
Very inspiring @ai_curious! I haven’t had time to do my own implementation of YOLO yet, but it is definitely on my bucket list.
Batch and epoch losses are proportional to the number of training examples in each, as well as to the scale of the values being used. If you use softmax for classification, your classification loss will always be of order of magnitude 1. Similarly, if you are using a sigmoid activation on the object / no object prediction, those losses will be of order of magnitude 1. Coordinate losses are not as straightforward, though. They depend on whether you are predicting location and size directly or constraining them with an activation such as \sigma, as well as on whether you are rescaling to image-relative coordinates. I also have anecdotal evidence that the loss differs across training examples depending on whether my driving images are daytime and clear weather, nighttime, or raining.
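To illustrate the difference in scale, here is a small toy example (random numbers and an arbitrary class count, not my actual loss function or data) showing how the objectness, class, and coordinate terms land on very different scales depending on the parameterization:

```python
import numpy as np

# Toy comparison of loss-component scales for a YOLO-style 19x19x8 head.
# Everything here is synthetic; it only demonstrates relative magnitudes.
rng = np.random.default_rng(0)
grid, anchors, classes = 19, 8, 3

# Fake network outputs and targets for one image
obj_logit   = rng.normal(size=(grid, grid, anchors))                 # object / no object logit
cls_prob    = rng.dirichlet(np.ones(classes), size=(grid, grid, anchors))
xywh_raw    = rng.normal(size=(grid, grid, anchors, 4))              # unconstrained coords
obj_target  = (rng.random((grid, grid, anchors)) < 0.02).astype(float)
cls_target  = np.eye(classes)[rng.integers(classes, size=(grid, grid, anchors))]
xywh_target = rng.random((grid, grid, anchors, 4))                   # 0..1 image-relative

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Objectness: sigmoid + binary cross-entropy -> each term is order 1
obj_loss = -(obj_target * np.log(sigmoid(obj_logit)) +
             (1 - obj_target) * np.log(1 - sigmoid(obj_logit)))

# Classification: softmax probabilities + cross-entropy -> also order 1
cls_loss = -(cls_target * np.log(cls_prob + 1e-9)).sum(axis=-1)

# Coordinates: scale depends entirely on the parameterization.
# Constraining with a sigmoid keeps each error <= 1; raw values do not.
coord_constrained = ((sigmoid(xywh_raw) - xywh_target) ** 2).sum(axis=-1)
coord_raw         = ((xywh_raw - xywh_target) ** 2).sum(axis=-1)

for name, loss in [("objectness", obj_loss), ("class", cls_loss),
                   ("coords (sigmoid)", coord_constrained), ("coords (raw)", coord_raw)]:
    print(f"{name:18s} sum={loss.sum():10.1f}  mean={loss.mean():6.3f}")
```

Summing versus averaging over those 19*19*8 cells is also a choice, and it alone moves the reported number by three or four orders of magnitude.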
My takeaway is that loss values per batch or epoch are not comparable across projects, because they depend on exactly how you are handling the data and the outputs during training. HTH