Detecting Multiple Objects using YOLO - Grid Cells plus Anchor Boxes

ai_curious · August 2, 2021, 1:33am

Here is what this looks like in terms of the YOLO v2 model itself. I built the CNN using the 608x608 Berkeley Driving Data image used in the previous thread, a 19x19 grid shape, 8 dimension clusters/anchor boxes, and 1 class (cars only for now). That gives an output of S*S*B*(1+4+1) = 19*19*8*6 values, where each anchor box carries 1 object-presence score, 4 bounding box coordinates, and 1 class score.
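The arithmetic behind that output shape can be sketched in a few lines of Python. The variable names are mine, and the split of the six values per anchor (object-presence score, four box coordinates, one class score) is an assumption for illustration; the actual ordering depends on how the training labels are encoded.

```python
# Output shape arithmetic for the configuration described above.
S = 19   # grid cells per side
B = 8    # anchor boxes (dimension clusters) per grid cell
C = 1    # number of classes (cars only)

# Assumed layout per anchor: 1 object-presence score + 4 box coordinates + C class scores
values_per_anchor = 1 + 4 + C

output_shape = (S, S, B, values_per_anchor)
total_values = S * S * B * values_per_anchor

print(output_shape)   # (19, 19, 8, 6)
print(total_values)   # 17328
```

Adding more classes later only changes C, so the last dimension grows while the grid and anchor dimensions stay the same.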

You can see the 608x608x3 input shape in the input layer and the 19x19x8x6 shape in the output layer. The Conv2D, BatchNorm, MaxPool, LeakyReLU, etc. layers, as well as the filter counts, strides, and padding, are taken directly from the YOLO v2 paper, including the skip connection between conv2d_13 and conv2d_20 (not shown in this excerpt).
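As a rough check on how a 608x608 input ends up on a 19x19 grid, here is a small sketch that traces only the spatial size. It assumes five 2x2 max-pool layers with stride 2, as in the Darknet-19 backbone from the YOLO v2 paper, and convolutions padded so that they leave the spatial size unchanged. The helper name is hypothetical, and the layer count is not read off the model summary in this post.

```python
def spatial_size_after_pools(input_size, num_pools, pool_stride=2):
    """Return the feature map side length after repeated strided pooling."""
    size = input_size
    for _ in range(num_pools):
        size //= pool_stride   # each pool divides the side length by the stride
    return size

# Assumption: five stride-2 max-pool layers, for a total downsampling factor of 32
grid = spatial_size_after_pools(608, num_pools=5)
print(grid)   # 19

# Assumption: a final 1x1 convolution with B * (1 + 4 + C) filters, so its
# (19, 19, 48) output can be reshaped to (19, 19, 8, 6)
B, C = 8, 1
final_filters = B * (1 + 4 + C)
print(final_filters)   # 48
```

Under these assumptions, any input whose side is a multiple of 32 maps cleanly onto a grid; 608 / 32 gives the 19x19 grid used here.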
