Sliding window detection uses multiple window sizes, and Andrew Ng has pointed out that the object in each training-set image should fit within the frame.
This means the cropped image is a different size for each window size.
My question is: “During forward propagation, will every window be resized to the same size, so that it matches the input size of our CNN model?” Like the following:
One more question about this: “Does this imply that the training set can contain images of many different sizes, but that before they are fed to the model they always pass through a resize() function to match the model's input size?” Am I right?
So, for the truck image, when it gets resized, its appearance will be compressed vertically and then passed to the model.
Yes, your thinking is correct. Just so that we are on the same page: when you implement sliding windows in the conventional way, as discussed in the lecture entitled “Object Detection”, the multiple windows are present only at inference time. At training time, we have a simple ConvNet to train, which always expects its inputs to have a uniform size.
So, during training, each cropped image is resized to the size expected by the ConvNet. The same holds at inference time: each window crop is resized to that input size before the forward pass.
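
To make this concrete, here is a minimal sketch in plain Python. It is not the course's implementation: the helper names `resize_nearest` and `sliding_window_crops`, the toy image, the window sizes, and the stride are all made up for illustration, and a real pipeline would most likely call an image library's resize instead of this nearest-neighbor stand-in. It only shows that windows of different sizes all end up at one fixed input size before they would be handed to the ConvNet.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D image (list of rows) to out_h x out_w by nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def sliding_window_crops(image, window_sizes, stride, input_size):
    """Yield (top, left, win_h, win_w, resized_crop) for every window position.

    Every crop is resized to input_size, the (hypothetical) fixed input
    size of the ConvNet, regardless of the window size it came from.
    """
    img_h, img_w = len(image), len(image[0])
    in_h, in_w = input_size
    for win_h, win_w in window_sizes:
        for top in range(0, img_h - win_h + 1, stride):
            for left in range(0, img_w - win_w + 1, stride):
                # Cut the window out of the full image.
                crop = [row[left:left + win_w] for row in image[top:top + win_h]]
                yield top, left, win_h, win_w, resize_nearest(crop, in_h, in_w)


# Toy 8x12 grayscale "image" whose pixel value encodes its position.
image = [[r * 12 + c for c in range(12)] for r in range(8)]

# Two assumed window sizes (height, width); the model input is assumed to be 4x4.
crops = list(
    sliding_window_crops(image, window_sizes=[(4, 4), (8, 6)], stride=2, input_size=(4, 4))
)

# Collect the distinct shapes of all resized crops.
shapes = {(len(crop), len(crop[0])) for *_, crop in crops}
print(len(crops), shapes)
```

Note that the 8x6 window does not keep its aspect ratio when it is squeezed into 4x4, which is the same kind of distortion you describe for the truck image.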