Week #2: YOLO
Hello everyone
Wishing you a happy December and happy holidays,
I have a question about the training dataset for YOLO algorithm:
In the case of a classification problem, the dataset contains images and a y label for each image.
So, for YOLO we have two problems, classification and localization (regression), right?
For classifying objects (C1, C2, …) and predicting the bounding box for each object (Px, Py, Ph, Pw).
My first question is that I don’t understand what the dataset looks like for the YOLO algorithm.
I mean, does the training dataset contain classification/localization labels for each cell in the grid, or does it just contain the y vector for each object and its localization info? And if the latter, how can (Px, Py) be compared with the predicted Px, Py, given that all cells share the same X and Y range of values?
My second question depends on the answer to the first: if the training dataset does contain a y vector for each cell in the grid, does that mean the test set must use the same grid size as the training set?
The dataset includes a txt file for each image (though this is implementation dependent). Each line of the file represents one object in that image. The first number is the index of the object’s class. The next two numbers are the center of the rectangle surrounding the object, and the last two numbers are the width and height of that rectangle. All of these values are normalized to between 0 and 1: x and the box width are divided by the width of the whole image, and y and the box height are divided by the height of the whole image.
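As a concrete illustration, here is a minimal sketch of parsing one such label file. The function name and the example values are hypothetical; the line format (class index, normalized center x/y, normalized width/height) is as described above.

```python
# Hypothetical sketch: parsing the contents of one YOLO-format label file.
# Each line: class_index x_center y_center width height, all in [0, 1].

def parse_label_file(text):
    """Parse YOLO label text into a list of object dicts."""
    objects = []
    for line in text.strip().splitlines():
        cls, x, y, w, h = line.split()
        objects.append({
            "class": int(cls),
            "x": float(x),  # box center x / image width
            "y": float(y),  # box center y / image height
            "w": float(w),  # box width / image width
            "h": float(h),  # box height / image height
        })
    return objects

# Example file contents: two objects of (made-up) classes 1 and 0.
example = "1 0.5 0.5 0.4 0.6\n0 0.25 0.75 0.1 0.2"
print(parse_label_file(example))
```

Note that nothing in the file mentions grids or cells; it is just per-object boxes in image-relative coordinates.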
Because of the normalization, the images in the training and test sets need not be the same size. Even within a training set, images do not have to be the same size.
One point to clarify is that the raw data is generally unaware of the computer vision task and algorithm that will consume it. So, typically, your raw data is just a collection of images and associated label files, as described above. The labels contain the location and class of the objects. Typically the raw data contains no information about ‘grids’, which are algorithm-specific.
You must preprocess the raw data to prepare it for use in YOLO. That process is described in the linked (and other existing) threads. Basically, it means mapping the raw data onto a multi-dimensional structure with the same shape as the output of the YOLO neural network; this then becomes the ground-truth matrix used in the cost function. Hope this helps.
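To make that mapping concrete, here is a minimal sketch of turning the per-object labels into a grid-shaped ground-truth array. The grid size S, the number of classes, and the per-cell layout (objectness, box, one-hot class) are assumptions for illustration; real implementations vary (e.g. anchor boxes). It also shows the answer to the question about (Px, Py): the box center is converted into an offset within its cell, which is why every cell can use the same value range.

```python
import numpy as np

# Hypothetical sketch: mapping normalized object labels onto an S x S grid,
# producing a ground-truth array shaped like a (simplified) YOLO output.
# Assumed per-cell layout: [objectness, x, y, w, h, one-hot classes...].

def build_ground_truth(objects, S=7, num_classes=3):
    y_true = np.zeros((S, S, 5 + num_classes))
    for obj in objects:
        # Which cell does the box center fall into?
        col = min(int(obj["x"] * S), S - 1)
        row = min(int(obj["y"] * S), S - 1)
        # x, y become offsets *within* that cell, in [0, 1) -- so the
        # network's per-cell (Px, Py) predictions are compared against
        # cell-relative targets, not image-relative ones.
        cell_x = obj["x"] * S - col
        cell_y = obj["y"] * S - row
        y_true[row, col, 0] = 1.0                   # objectness
        y_true[row, col, 1:5] = [cell_x, cell_y, obj["w"], obj["h"]]
        y_true[row, col, 5 + obj["class"]] = 1.0    # one-hot class
    return y_true

objs = [{"class": 1, "x": 0.5, "y": 0.5, "w": 0.4, "h": 0.6}]
gt = build_ground_truth(objs)
print(gt.shape)     # (7, 7, 8)
print(gt[3, 3, 0])  # 1.0 -- the cell containing the box center is responsible
```

Since the grid is imposed during this preprocessing step (and by the network's output shape), not by the raw data, the same grid applies to any image size once coordinates are normalized.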