In this video, Prof. Ng says that we are going to pre-define two different shapes, called anchor boxes. How are we going to do that?
Does that mean we fix the values, or the aspect ratio, of bh and bw, so that each grid cell has two bounding boxes shaped like the anchor boxes shown in the image? And does the model then predict objects based on those pre-defined values of bh and bw?
As far as defining anchor boxes is concerned, Andrew mentions towards the end of this lecture that we can use a technique like K-means to find common shapes, or hand-pick 5-10 shapes that span the variety of shapes we want to detect.
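To make the K-means idea concrete, here is a minimal sketch of clustering ground-truth box sizes into anchor shapes. It is an illustration under assumptions, not the course's code: the toy `boxes` data and the names `iou_wh` and `kmeans_anchors` are made up here, and using IoU (rather than Euclidean distance) to assign boxes to clusters is a choice borrowed from the YOLOv2 paper.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between (w, h) pairs, treating all boxes as if they shared one center."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
             np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_a = anchors[:, 0] * anchors[:, 1]
    return inter / (area_b[:, None] + area_a[None, :] - inter)

def kmeans_anchors(boxes, k, n_iter=100, seed=0):
    """Cluster box shapes; each cluster mean becomes one anchor (w, h)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct boxes picked at random.
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign every box to the anchor it overlaps most.
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)
        # Move each anchor to the mean shape of its boxes (keep it if it got none).
        new = np.array([boxes[assign == j].mean(axis=0) if np.any(assign == j)
                        else anchors[j] for j in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors

# Hypothetical (width, height) pairs: three tall boxes and three wide ones.
boxes = np.array([[1.0, 3.0], [1.2, 3.2], [0.9, 2.8],
                  [3.0, 1.0], [3.3, 1.1], [2.9, 0.9]])
anchors = kmeans_anchors(boxes, k=2)
print(anchors)  # one tall shape and one wide shape
```

On a real dataset you would feed in the widths and heights of all labelled boxes and ask for 5-10 clusters.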
While the shapes of the bounding boxes do influence the output, especially when deciding the center of the object, it’d be helpful if you could rephrase your question after watching this lecture, which explains how to interpret a single prediction.
I understand that we may use k-means or manually pick a few shapes to use as anchor boxes. But my question was not about how to select anchor boxes. My question is: once we have selected the anchor boxes, say 3 of them (one tall, one wide, and one square), how do I use them in my model? How and where do I assign the numerical values of these anchor boxes in the model?
Please work on Course 4, Week 3, Assignment 1 (autonomous driving car detection), where you’ll use YOLO; working through it should answer the question you’ve asked.
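As a rough preview of where the anchor numbers enter, here is a minimal sketch assuming a YOLOv2-style formulation. The anchors are fixed constants, not learned weights. They are used in two places: to turn the network's raw outputs into a box (the anchor width and height are scaled by the predicted offsets), and during training to decide which anchor slot of a grid cell a ground-truth box is assigned to. The anchor values and the names `ANCHORS`, `decode_box`, and `best_anchor` are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical anchors as (width, height) in grid-cell units: tall, wide, square.
ANCHORS = np.array([[1.0, 3.0], [3.0, 1.0], [2.0, 2.0]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def decode_box(t, anchor, cell_xy):
    """Turn one raw prediction t = (tx, ty, tw, th) into (bx, by, bw, bh)."""
    tx, ty, tw, th = t
    pw, ph = anchor    # the pre-defined anchor shape
    cx, cy = cell_xy   # top-left corner of the grid cell
    bx = cx + sigmoid(tx)   # center stays inside the cell
    by = cy + sigmoid(ty)
    bw = pw * np.exp(tw)    # anchor width, rescaled by the network
    bh = ph * np.exp(th)    # anchor height, rescaled by the network
    return np.array([bx, by, bw, bh])

def best_anchor(box_wh, anchors=ANCHORS):
    """Index of the anchor with the highest IoU against a ground-truth (w, h)."""
    inter = (np.minimum(box_wh[0], anchors[:, 0]) *
             np.minimum(box_wh[1], anchors[:, 1]))
    union = box_wh[0] * box_wh[1] + anchors[:, 0] * anchors[:, 1] - inter
    return int(np.argmax(inter / union))

# A raw prediction of all zeros reproduces the anchor shape, centered in the cell.
print(decode_box(np.zeros(4), ANCHORS[0], (2, 3)))
# A tall ground-truth box is matched to the tall anchor (index 0).
print(best_anchor([0.9, 2.7]))
```

So bh and bw are not fixed to the anchor values. The model predicts adjustments relative to each anchor, and each of the anchor slots in a grid cell's output vector corresponds to one of these shapes.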
You might find answers or ideas for refining your questions in previous threads on the topic…
Let us know what you find out.