Logic bug: convolutional_block() ignores a large fraction of its input?

Thank you very much for following up on this and resolving the questions!

As you say, since the downsampling happens in only 3 out of the 50 layers, that loss of information apparently doesn’t spoil the results. If we were in an experimental mood, it might be interesting to try resetting the stride to 1 in those three cases and then following those layers with an average or max pooling layer, to see whether that makes a perceptible difference in the performance of the resulting models. That approach would increase the computational expense a bit, but it would lose less information. A sketch of what that change might look like is below.
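
Just to make the idea concrete, here’s a minimal sketch in tf.keras (with made-up helper names, and leaving out the BatchNorm, ReLU, and shortcut path of the real block) of the two alternatives: the strided 1×1 convolution that does the downsampling now, versus a stride-1 convolution followed by average pooling:

```python
from tensorflow.keras import layers

# Hypothetical helpers, for illustration only -- not the notebook's
# actual convolutional_block().

def downsample_with_strided_conv(X, filters, s=2):
    # Current approach: a 1x1 convolution with stride s reads only one pixel
    # out of every s x s patch, so (s*s - 1)/(s*s) of the input is never seen.
    return layers.Conv2D(filters, kernel_size=1, strides=(s, s), padding='valid')(X)

def downsample_with_pooling(X, filters, s=2):
    # Experimental alternative: keep stride 1 so the 1x1 convolution sees
    # every pixel, then shrink the spatial dimensions with average pooling.
    X = layers.Conv2D(filters, kernel_size=1, strides=(1, 1), padding='valid')(X)
    return layers.AveragePooling2D(pool_size=(s, s), strides=(s, s))(X)
```

Swapping in `MaxPooling2D` for the average pooling would give the max-pooling variant of the same experiment.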

Now that you’ve found the source of another implementation of Residual Nets, there’s another really interesting technical question that came up in the last couple of weeks about how our implementation here in the notebook works: how it handles the “training” argument for the BatchNorm layers. Here’s a thread about that issue, in case it catches your interest! :nerd_face:
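
For a bit of context on what that thread discusses, here’s a hedged sketch (tf.keras syntax, hypothetical helper name, not the notebook’s actual code) of the pattern in question: the BatchNorm layers are called with an explicit `training` flag, which controls whether they normalize with the current batch’s statistics or with the stored moving averages.

```python
from tensorflow.keras import layers

def conv_bn_relu(X, filters, training=True):
    # Hypothetical helper, for illustration only.
    X = layers.Conv2D(filters, kernel_size=3, padding='same')(X)
    # training=True: normalize with the current batch's statistics and update
    # the moving averages; training=False: use the stored moving averages,
    # i.e. inference behavior.
    X = layers.BatchNormalization(axis=3)(X, training=training)
    return layers.Activation('relu')(X)
```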

Thanks again!
Regards,
Paul