For case 1), my theory is that you are really talking about bounding boxes, as opposed to anchor boxes. The bounding boxes are part of the YOLO output and they do include both size and position. The anchor boxes are essentially input to the computation and are used as a way to both a) make the algorithm more efficient and b) organize the output. They are essentially just shapes (a width and a height, so you can loosely think of them as “aspect ratios”) and are not tied to a particular location. There are many excellent posts on the forum from ai_curious which add a lot of context to the material in the course. Here’s a good one to start from on the topic of Anchor Boxes.
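To make the distinction concrete, here is a minimal sketch, assuming the YOLOv2-style parameterization in which the network predicts offsets relative to a grid cell and an anchor shape. The anchor values, the `decode_box` helper and its argument names are all hypothetical and for illustration only; the course notebook may organize this differently.

```python
import math

# Hypothetical anchor shapes: (width, height) in grid-cell units.
# Note that they carry no position at all.
ANCHORS = [(1.0, 2.0), (2.0, 1.0)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def decode_box(raw, anchor, cell_col, cell_row):
    """Turn raw network outputs into a bounding box (assumed YOLOv2-style).

    raw:      (t_x, t_y, t_w, t_h) predicted for one anchor in one grid cell
    anchor:   (p_w, p_h), the anchor shape used as a prior on the box size
    cell_col: column index of the grid cell
    cell_row: row index of the grid cell
    Returns (b_x, b_y, b_w, b_h): box center and size in grid-cell units.
    """
    t_x, t_y, t_w, t_h = raw
    p_w, p_h = anchor
    # Position comes from the grid cell plus a predicted offset inside it.
    b_x = cell_col + sigmoid(t_x)
    b_y = cell_row + sigmoid(t_y)
    # Size comes from scaling the anchor shape.
    b_w = p_w * math.exp(t_w)
    b_h = p_h * math.exp(t_h)
    return b_x, b_y, b_w, b_h
```

With all raw outputs at zero, the decoded box sits in the middle of its grid cell and has exactly the anchor's shape, which shows the anchor acting as a starting shape while the position comes from elsewhere.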
For item 3), you have two rectangles and you compute the intersection of them. Now think about the upper left corners of both rectangles. If a point is to be in both rectangles (the definition of intersection), then it must lie at or below and at or to the right of both of those upper left corners, right? That means its x coordinate must be at least as far right, and its y coordinate at least as close to the bottom, as the corresponding coordinates of both upper left corners (remember that in image coordinates y increases downwards). So the upper left corner of the intersection is given by the maximum of the two x values and the maximum of the two y values. Similarly for the reasoning about the lower right corners: all points in the intersection must be at or above and at or to the left of both lower right corners, so that corner is given by the minimum of the two x values and the minimum of the two y values.
Of course the intersection may be trivial (empty, if the rectangles don't overlap at all), but I believe that the descriptions in the text in the notebook are correct.
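The corner reasoning above can be sketched in a few lines of Python. This is only an illustration, assuming boxes are given as `(x1, y1, x2, y2)` with the upper left corner first and y increasing downwards; the function names are my own and not necessarily the ones used in the notebook.

```python
def intersection_area(box1, box2):
    """Area of the overlap of two boxes given as (x1, y1, x2, y2)."""
    # Upper left corner of the intersection: max of the upper left corners.
    xi1 = max(box1[0], box2[0])
    yi1 = max(box1[1], box2[1])
    # Lower right corner of the intersection: min of the lower right corners.
    xi2 = min(box1[2], box2[2])
    yi2 = min(box1[3], box2[3])
    # If the boxes don't overlap, a difference goes negative, so clamp
    # at 0 to get an empty intersection instead of a bogus positive area.
    width = max(xi2 - xi1, 0)
    height = max(yi2 - yi1, 0)
    return width * height


def iou(box1, box2):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter = intersection_area(box1, box2)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - inter
    return inter / union if union > 0 else 0.0
```

For example, `iou((2, 1, 4, 3), (1, 2, 3, 4))` has an intersection with corners (2, 2) and (3, 3), so the overlap area is 1 and the union is 4 + 4 - 1 = 7. The clamp at 0 is what handles the "trivial" case: two negative differences would otherwise multiply to a positive area.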
For point 2), I think that is reasonable terminology. An object is an element of a particular class, so you can refer to it either way. “Object” and “Class” are just two ways to say the same thing when you are just talking about the algorithm in words. Of course when you get to writing the code, then you need to be perfectly unambiguous.