Question about ResNet-50 model

Jamal022 · August 27, 2023, 10:10am

Hey @usman.n, thanks for your post.

Okay, let’s break your question into points.

First, yes, you’re correct: this convolutional block has a stride of 1, so it preserves the spatial dimensions just like an identity block does. It is similar, but not exactly the same. The difference lies in the transformations applied within the convolutional block. Its convolutions, batch normalization, and activation functions let the network learn more complex, non-linear transformations of the input, which can improve its representational power. A convolutional block with a stride of 1 therefore introduces more parameters and complexity than an identity block, and that extra capacity can help the network learn more intricate features and patterns. Identity blocks help maintain information flow and alleviate the vanishing gradient problem, but they might not provide as much capacity for feature extraction.

So that’s why we use the convolutional block instead of the identity block here.
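To put a rough number on the "more parameters" point, here is a small sketch that counts parameters by hand. It assumes the usual layout where both block types share the same three-layer main path (1x1, f x f, 1x1, each followed by batch normalization) and the convolutional block additionally has a 1x1 convolution plus batch normalization on the shortcut. The sizes (64 input channels, f = 3, filters 64/64/256) and all function names are illustrative assumptions, not taken from the question.

```python
def conv2d_params(kernel_size, in_channels, out_channels, use_bias=True):
    """Number of weights in a square 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if use_bias else 0)


def batchnorm_params(channels):
    """Per channel: gamma, beta, moving mean, moving variance."""
    return 4 * channels


def main_path_params(in_channels, f, filters):
    """Main path assumed for both block types: 1x1 -> f x f -> 1x1, each with batch norm."""
    f1, f2, f3 = filters
    return (
        conv2d_params(1, in_channels, f1) + batchnorm_params(f1)
        + conv2d_params(f, f1, f2) + batchnorm_params(f2)
        + conv2d_params(1, f2, f3) + batchnorm_params(f3)
    )


def shortcut_params(in_channels, out_channels):
    """Extra 1x1 convolution + batch norm on the shortcut of a convolutional block."""
    return conv2d_params(1, in_channels, out_channels) + batchnorm_params(out_channels)


# Assumed example sizes (hypothetical, chosen only for illustration).
in_channels, f, filters = 64, 3, (64, 64, 256)

main = main_path_params(in_channels, f, filters)
extra = shortcut_params(in_channels, filters[2])

print("main path parameters:     ", main)
print("extra shortcut parameters:", extra)
```

Under these assumptions the shortcut adds a noticeable chunk of parameters on top of the main path, and that chunk is what an identity block does not have.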

Now for the next part of your question, which is about “the use of resizing”.

The resizing operation you mentioned refers to applying a 1x1 convolutional layer (a CONV2D layer with a 1x1 filter) to the shortcut path (the skip connection). It transforms the input to the block so that its dimensions match those of the main path’s output.

Even when the stride is 1, the number of filters in the last layer of the convolutional block might differ from the number of channels in the input. The 1x1 convolutional layer in the shortcut path adjusts the number of channels of the shortcut so that it can be added element-wise to the output of the main path. This step keeps the skip connection and the main path compatible, so their feature maps can be added.
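Here is a minimal, dependency-free sketch of that channel matching. It is not the assignment's code: the tensors are small nested lists with made-up sizes (4x4 spatial, 3 input channels, 8 output channels), the weights are random, and the helper names are hypothetical. In Keras the same role would be played by a `Conv2D` layer with a kernel size of 1 on the shortcut.

```python
import random


def conv1x1(x, kernel):
    """1x1 convolution with stride 1 and no bias.

    x: feature map as nested lists, shape (H, W, C_in).
    kernel: weights, shape (C_in, C_out).
    Each pixel's channel vector is mapped independently, so H and W are unchanged.
    """
    c_in, c_out = len(kernel), len(kernel[0])
    return [
        [[sum(pixel[i] * kernel[i][o] for i in range(c_in)) for o in range(c_out)]
         for pixel in row]
        for row in x
    ]


def shape(t):
    """Shape (H, W, C) of a nested-list feature map."""
    return (len(t), len(t[0]), len(t[0][0]))


def add(a, b):
    """Element-wise addition; only defined when both shapes match."""
    if shape(a) != shape(b):
        raise ValueError(f"cannot add shapes {shape(a)} and {shape(b)}")
    return [[[p + q for p, q in zip(pa, pb)] for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]


def random_tensor(h, w, c, rng):
    return [[[rng.uniform(-1, 1) for _ in range(c)] for _ in range(w)] for _ in range(h)]


rng = random.Random(0)
block_input = random_tensor(4, 4, 3, rng)  # stand-in for the block input, 3 channels
main_output = random_tensor(4, 4, 8, rng)  # stand-in for the main path output, 8 channels

# A plain (identity) shortcut cannot be added: the channel counts differ.
try:
    add(main_output, block_input)
except ValueError as err:
    print("plain shortcut fails:", err)

# A 1x1 convolution on the shortcut changes 3 channels to 8; H and W stay the same.
kernel = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(3)]
shortcut = conv1x1(block_input, kernel)
result = add(main_output, shortcut)
print(shape(block_input), "->", shape(shortcut), "| sum has shape", shape(result))
```

The spatial size never changes here because the stride is 1. Only the channel dimension is adjusted, which is exactly the case you asked about.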

Hope it’s clear now, and feel free to ask if you need more clarification.
Regards,
Jamal
