How to construct a training set for yolo

Fatcar2002 · July 10, 2022, 3:17am

Hi
The training set does not appear to be involved in the exercise.
I’m curious what yolo’s training set looks like?
Do I need to do the labeling manually like u-net?

Such as training wine glasses, smoke shaped borders. What tools do I need to use, do I manually make the borders first? And then go to transfer learning?

thanks

anon57530071 · July 10, 2022, 4:15am

Training YOLO is not included in our exercise, since it is typically time-consuming.

I’m curious what yolo’s training set looks like?

YOLO has several versions, and they take slightly different inputs. Basically, though, there is a YOLO format that consists of a “class index” and “bounding box information”. This is stored in a text file with the same name as the image file. In parallel, we create a list of classes such as “car”, “bicycle”, … that the “class index” refers to.
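
To make this concrete: in the commonly used Darknet-style YOLO format, each line of the text file is `class_index x_center y_center width height`, with the four box values normalized to the 0..1 range by the image width and height. Here is a minimal sketch that reads such a file back into pixel boxes. The class list, the file name and the label values are all made up for illustration, and some YOLO versions use a different layout.

```python
# Hypothetical class list; the position in this list is the "class index".
classes = ["car", "bicycle", "wine_glass"]

# Hypothetical content of "image_0001.txt", which pairs with "image_0001.jpg".
label_text = """\
2 0.312500 0.625000 0.312500 0.416667
0 0.500000 0.500000 0.250000 0.500000
"""


def parse_yolo_labels(text, img_w, img_h, class_names):
    """Turn YOLO label lines into (class_name, xmin, ymin, xmax, ymax) in pixels."""
    boxes = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        idx, xc, yc, w, h = line.split()
        # Undo the normalization: centers and sizes are fractions of the image.
        xc, w = float(xc) * img_w, float(w) * img_w
        yc, h = float(yc) * img_h, float(h) * img_h
        boxes.append((
            class_names[int(idx)],
            round(xc - w / 2), round(yc - h / 2),
            round(xc + w / 2), round(yc + h / 2),
        ))
    return boxes


# Assumes a 640x480 image.
boxes = parse_yolo_labels(label_text, 640, 480, classes)
for box in boxes:
    print(box)
```

An image with no objects usually just gets an empty label file.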

Do I need to do the labeling manually like u-net?

Basically, we draw a bounding box on the image and annotate it, i.e., put the “class index” in front of the box information.
labelImg supports the YOLO format directly. Microsoft VoTT (Visual Object Tagging Tool) does not support the YOLO format, but it generates Pascal VOC (XML). This can be converted into the YOLO format, and some YOLO derivatives support Pascal VOC directly.
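
The Pascal VOC to YOLO conversion can be sketched as below. This is only an illustration, not the converter of any particular tool: it assumes the usual VOC layout (`size`, `object`, `name`, `bndbox` with pixel corners) and the Darknet-style output line `class_index x_center y_center width height` with normalized values. The sample XML and class list are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical Pascal VOC annotation for one 640x480 image.
sample_xml = """
<annotation>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>300</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>
"""


def voc_to_yolo_lines(xml_text, class_names):
    """Convert one VOC XML annotation into a list of YOLO label lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines = []
    for obj in root.findall("object"):
        name = obj.find("name").text
        if name not in class_names:
            continue  # ignore classes we are not training on
        bnd = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(bnd.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # VOC stores pixel corners; YOLO wants a normalized center and size.
        xc = (xmin + xmax) / 2 / img_w
        yc = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{class_names.index(name)} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines


yolo_lines = voc_to_yolo_lines(sample_xml, ["car", "bicycle"])
print("\n".join(yolo_lines))
```

In practice you would loop over all the XML files and write each result to a `.txt` file with the same base name as the image.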

Such as training wine glasses, smoke shaped borders. What tools do I need to use, do I manually make the borders first?

So, the first step is to create annotation files using the tools above.
Then, you need to decide which version of YOLO to use. Don’t use V2; there are lots of complaints about it from members of this community. The original YOLO is written in C. The original author stopped enhancing it, but several researchers and developers have continued the work. V3 has been ported to Keras, and V5 is a PyTorch version.

And then go to transfer learning?

Yes, you should start with transfer learning, since training a model takes a significant amount of time. Basically, YOLO consists of the following 3 parts.

YOLO backbone
YOLO neck
YOLO head

The YOLO backbone is a relatively large convolutional network that extracts the features used to detect objects. The original backbone network is called “Darknet”. Some recent research projects replace it with MobileNet or other networks.
The YOLO neck is a so-called “Feature Pyramid Network”, which extracts objects (boxes) from different layers. By default, it has 3 layers: outputs are taken from different layers and merged with the upscaled output from the last layer. It is itself another set of convolutional layers.
The YOLO head selects anchor boxes with non-max suppression and finalizes the confidence level and class.
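
The non-max suppression step mentioned above can be sketched in plain Python as follows. This is a generic greedy NMS over boxes given as (xmin, ymin, xmax, ymax), not the code of any specific YOLO implementation. The boxes, scores and the 0.5 threshold are made-up values.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping detections of one object, plus a separate one.
boxes = [(100, 100, 200, 200), (105, 105, 205, 205), (300, 300, 400, 400)]
scores = [0.9, 0.8, 0.7]
keep = non_max_suppression(boxes, scores)
print(keep)
```

Real implementations normally run this per class and after dropping low-confidence boxes.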

There are several options for transfer learning, depending on which of the parts above you want to retrain. The important thing is that the pretrained model is trained for 80 classes. If you do not change the number of classes, you have multiple options, such as ‘load weights for the backbone network only’ or ‘fine-tune after loading all weights’. If you do want to change the number of classes, you can still load the “weights” for the YOLO backbone and neck.
Of course, there is also the option to train from scratch. (I think it is quite difficult to get that to converge.)
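
As a rough, framework-free sketch of why the head is the part that cannot be reused when the class count changes: in a YOLOv3-style head, each output scale predicts, per anchor, 4 box values, 1 objectness score and one score per class. The layer names (`backbone.`, `neck.`, `head.`) and the weights dictionary below are hypothetical, and real Keras or PyTorch ports organize their weights differently.

```python
def head_channels(num_classes, anchors_per_scale=3):
    """Output channels of a YOLOv3-style head: anchors * (4 box + 1 objectness + classes)."""
    return anchors_per_scale * (5 + num_classes)


def select_pretrained(pretrained, num_classes, pretrained_classes=80):
    """Pick which pretrained weights can be loaded for the new task.

    Same class count: everything can be loaded and fine-tuned.
    Different class count: the head shape no longer matches, so load only
    the backbone and neck and train a new head.
    """
    if num_classes == pretrained_classes:
        return dict(pretrained)
    return {k: v for k, v in pretrained.items() if not k.startswith("head.")}


# Hypothetical checkpoint: layer name -> weights (strings stand in for arrays).
pretrained = {"backbone.conv1": "w0", "neck.conv1": "w1", "head.out": "w2"}

# Example: 2 custom classes (say "wine_glass" and "smoke").
loaded = select_pretrained(pretrained, num_classes=2)
print(head_channels(80), head_channels(2), sorted(loaded))
```

With the loaded layers frozen at first, only the new head has to be trained, which is usually much faster than training everything.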

I actually ported V3 to my latest TensorFlow/Keras environment, and it works fine. The algorithm itself is not so complex, but training takes time. If you just want to try it out, the Keras version or the PyTorch version should be handy. Do not go with V2.

Hope this helps.

Fatcar2002 · July 10, 2022, 5:06am

This answer is fantastic!
thanks
