Hello
I have a question regarding the third lab, “Image Classification and Object Localization”. What if an image can contain one or more objects? The issue I ran into was when building the class matrices: say I have two categories, so my class matrix would be [[0], [0, 1], [1, 1], …], with 0 for cats and 1 for dogs. But the rows are of unequal length, because one image might contain one cat while another contains two or four, etc., since the number of objects per image is not fixed. The same goes for the bounding boxes.
So, how should I organize the labels and the bounding box matrices to feed to the model?
It is important to recognize that the shape of the labels y you provide during training must match the shape of the output \hat{y} the model produces. If your model can only output a single object prediction, there is no point in feeding it labels for multiple objects.
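One common way to make the shapes match when the object count varies is to pad every image's labels to a fixed maximum and carry a mask so the loss can ignore the empty slots. A minimal sketch in NumPy, with an assumed MAX_OBJECTS cap and box format that are not part of the lab:

```python
import numpy as np

# Pad variable-length per-image labels to a fixed MAX_OBJECTS so every y
# has the same shape as the model's output head. MAX_OBJECTS is an assumed
# upper bound on objects per image.
MAX_OBJECTS = 5
NUM_COORDS = 4           # (xmin, ymin, xmax, ymax), normalized to [0, 1]

def pad_labels(class_list, box_list):
    """class_list: e.g. [0, 1]; box_list: list of 4-element boxes."""
    n = len(class_list)
    classes = np.full(MAX_OBJECTS, -1, dtype=np.int32)      # -1 marks "no object"
    boxes = np.zeros((MAX_OBJECTS, NUM_COORDS), dtype=np.float32)
    mask = np.zeros(MAX_OBJECTS, dtype=np.float32)          # 1 where a real object exists
    classes[:n] = class_list
    boxes[:n] = box_list
    mask[:n] = 1.0
    return classes, boxes, mask

# Usage: an image with one cat (class 0) and one dog (class 1)
classes, boxes, mask = pad_labels([0, 1], [[0.1, 0.2, 0.4, 0.5],
                                           [0.5, 0.1, 0.9, 0.7]])
# The mask lets the loss ignore the padded (empty) slots.
```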
Multi-object prediction requires either a single-object model run multiple times on different subregions of the input image, or a model that outputs multiple objects from a single pass over the entire image. YOLO, which gets its name from only looking at the input image once (“You Only Look Once”), uses the latter approach. As with all things in engineering, there is no free lunch: each approach has benefits and costs, and which one you pick is driven by the business outcome, i.e. whether you optimize for throughput, accuracy, memory footprint, training complexity, etc.
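If you go the YOLO-style route, the usual trick is to encode however many boxes an image has into a fixed-size grid tensor, so y always has the same shape regardless of the object count. A rough sketch, with an assumed grid size and class count that are illustrative rather than taken from the lab:

```python
import numpy as np

# YOLO-style target: a fixed S x S grid where each cell holds objectness,
# a box, and class scores. Grid size and class count here are assumptions.
S, C = 7, 2   # 7x7 grid, 2 classes (cat=0, dog=1)

def encode_yolo_targets(boxes, classes):
    """boxes: (N, 4) normalized (xmin, ymin, xmax, ymax); classes: (N,) ints."""
    target = np.zeros((S, S, 5 + C), dtype=np.float32)
    for (xmin, ymin, xmax, ymax), cls in zip(boxes, classes):
        cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2        # box center
        col = min(int(cx * S), S - 1)                        # which grid cell
        row = min(int(cy * S), S - 1)
        target[row, col, 0] = 1.0                            # objectness
        target[row, col, 1:5] = [cx * S - col, cy * S - row,
                                 xmax - xmin, ymax - ymin]   # offset + size
        target[row, col, 5 + cls] = 1.0                      # one-hot class
    return target   # fixed shape regardless of how many objects the image has
```

Either way, the key point is that the target tensor has a fixed shape: the variable-length list of objects gets encoded into it rather than fed to the model directly.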