Convolutional sliding windows

I think you need to investigate why the output shape of efficientnetb1 differs from that of model1 in this way.

I think the model only takes images with a stride of (32, 32), so I don’t think I’ll be able to do this with a built-in state-of-the-art model. Do you know how I can get access to the layers of a model so I can make some tweaks while still maintaining good accuracy? When I print a summary of a model that uses EfficientNet, it shows EfficientNet as a single layer without listing its contents.
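For the summary issue specifically: when you wrap a Keras application inside another model, the whole sub-model appears as one layer, but you can still drill into it with `get_layer` and iterate over its internal layers. A minimal sketch (the wrapper model here is a hypothetical stand-in for your model; `weights=None` just avoids downloading pretrained weights for the demo):

```python
from tensorflow import keras

# EfficientNetB1 backbone wrapped inside a larger model, as in the course setup.
backbone = keras.applications.EfficientNetB1(
    include_top=False, weights=None, input_shape=(240, 240, 3)
)
model = keras.Sequential([
    backbone,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(10),
])

model.summary()  # shows "efficientnetb1" as a single layer

# Retrieve the nested model and inspect its internal layers directly:
inner = model.get_layer("efficientnetb1")
print(len(inner.layers))          # EfficientNetB1 has hundreds of layers
for layer in inner.layers[:5]:    # peek at the first few
    print(layer.name)
```

Calling `inner.summary()` will likewise print the full per-layer breakdown that the outer summary hides.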

On efficientnetb1’s doc page, if you click “View source on GitHub” and back-trace from there, you will get to the function that I think creates the network. Perhaps you can make a copy of it and modify it.
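To illustrate the "copy and modify" idea without reproducing Keras's actual builder, here is a hedged sketch using a tiny stand-in builder function (`tiny_backbone` is hypothetical, not part of Keras): the point is that once you own a copy of the function that constructs the graph, you can change a stride to get a finer output grid for convolutional sliding windows.

```python
import numpy as np
from tensorflow import keras

# Hypothetical miniature builder, standing in for a copied EfficientNet builder.
# In your copy of the real source, you would change the stride of the block
# whose downsampling you want to remove.
def tiny_backbone(input_shape=(None, None, 3), first_stride=2):
    inputs = keras.Input(shape=input_shape)
    x = keras.layers.Conv2D(32, 3, strides=first_stride, padding="same",
                            activation="relu")(inputs)
    x = keras.layers.Conv2D(64, 3, strides=2, padding="same",
                            activation="relu")(x)
    return keras.Model(inputs, x)

default = tiny_backbone()                 # overall stride 4
tweaked = tiny_backbone(first_stride=1)   # overall stride 2: denser windows

img = np.zeros((1, 64, 64, 3), dtype="float32")
print(default(img).shape)  # coarser spatial grid
print(tweaked(img).shape)  # finer spatial grid from the modified copy
```

Because both layers use `padding="same"`, the output grid size is just the input size divided by the product of the strides, which is exactly the quantity you would be tuning in a modified EfficientNet copy.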