What is the m in the YOLO algorithm's input shape (m, 608, 608, 3)?
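m is the number of images, as in the members of the collection of training images. The remaining dimensions describe each individual image: 608 by 608 pixels with 3 color channels.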
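To make the shape concrete, here is a minimal sketch in plain Python. The batch size of 16 and the helper name `values_per_image` are made up for illustration; only the 608, 608, 3 part comes from the question.

```python
# Hypothetical batch of 16 images; m = 16 is an assumption for illustration.
batch_shape = (16, 608, 608, 3)

# Unpack the shape: m counts images, the rest describes one image.
m, height, width, channels = batch_shape

def values_per_image(shape):
    """Number of pixel values in a single image, ignoring the leading m."""
    _, h, w, c = shape
    return h * w * c

print(m)                              # number of images in the batch
print(values_per_image(batch_shape))  # does not depend on m
```

Changing m changes how many images are processed together, not what any single image looks like.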
m is not related to the subject of your thread: it has nothing to do with grid cells, grid cell size, or object size. The first step in answering the question about ‘looking into grids’ is to understand that YOLO doesn’t look into grids. It looks at the entire image at once. It then makes a set of predictions, including where object centers are and what shapes the objects have. Each predicted center falls in exactly one grid cell. The predicted shape can be smaller than, the same size as, or larger than one grid cell.
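As a rough illustration of that last point, the sketch below maps a predicted center to its grid cell and expresses a box size in units of grid cells. It assumes a 19 x 19 grid over the 608 x 608 image (32 pixels per cell), which is not stated in this thread, and the function names and the example box are hypothetical.

```python
IMAGE_SIZE = 608                    # pixels per side, from (m, 608, 608, 3)
GRID_SIZE = 19                      # assumed cells per side; not from this thread
CELL_SIZE = IMAGE_SIZE / GRID_SIZE  # 32.0 pixels per cell under that assumption

def cell_containing(center_x, center_y):
    """Return (row, col) of the one grid cell that contains a center, in pixels."""
    # Clamp so a center exactly on the right or bottom edge stays in the last cell.
    col = min(int(center_x // CELL_SIZE), GRID_SIZE - 1)
    row = min(int(center_y // CELL_SIZE), GRID_SIZE - 1)
    return row, col

def size_in_cells(box_width, box_height):
    """Express a box's width and height in units of grid cells."""
    return box_width / CELL_SIZE, box_height / CELL_SIZE

# Hypothetical predicted box: center (300, 100), 200 pixels wide, 20 pixels tall.
print(cell_containing(300.0, 100.0))  # exactly one (row, col) pair
print(size_in_cells(200.0, 20.0))     # wider than one cell, shorter than one cell
```

The center always lands in a single cell, while the box itself may span several cells or only a fraction of one.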
There are some other posts that cover the mechanisms for doing this. You might find them using search. For example, here is one.