Hey @s_baghel_in,
I think the confusion comes from how the convolution works. I would suggest reviewing the lectures once more, since I don't see any issue here. Below is a short explanation that might help you.
We will consider the same example that you are talking about, so:
Dimensions of A_prev = (2, 5, 7, 4)
Dimensions of A_prev_pad = (2, 7, 9, 4)
Dimensions of a_prev_pad = (7, 9, 4)
Dimensions of W = (3, 3, 4, 8) or (f, f, n_C_prev, n_C)
Dimensions of output = (3, 4, 8)
Using the generic formula for the output dimensions, which is:
Output size = floor((n + 2p - f) / s) + 1, where n = input size, p = padding, f = filter size, s = stride
We can verify the output dimensions. Consider the height first: with n = 5, p = 1, f = 3, s = 2, we get floor((5 + 2 - 3) / 2) + 1 = 3. Then consider the width: with n = 7, p = 1, f = 3, s = 2, we get floor((7 + 2 - 3) / 2) + 1 = 4. Here, remember that a_prev_pad already includes the padding, so n + 2p = 7 for the height and n + 2p = 9 for the width, as I have stated above.
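If it helps, the formula can be checked with a few lines of Python. This is only a minimal sketch; the helper name `conv_output_size` is my own and not something from the assignment.

```python
def conv_output_size(n, p, f, s):
    """Output size along one spatial dimension: floor((n + 2p - f) / s) + 1."""
    # Integer floor division implements the floor() in the formula.
    return (n + 2 * p - f) // s + 1


# Values from the example above: pad = 1, filter size = 3, stride = 2.
n_H = conv_output_size(n=5, p=1, f=3, s=2)  # height
n_W = conv_output_size(n=7, p=1, f=3, s=2)  # width
print(n_H, n_W)  # 3 4
```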
So, the height of the output is 3 and the width of the output is 4. Also, since we have 8 kernels, the output will have 8 channels. Therefore, the dimensions of the output will be (3, 4, 8). Here, I have assumed that by output you mean the output corresponding to a single example. If we have multiple examples, 2 in this case, the dimensions of the output will be (2, 3, 4, 8).
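To see all of these shapes together, here is a rough NumPy sketch of a naive forward convolution. It is not the assignment's reference solution: the function name `conv_forward_sketch` is hypothetical, the bias term is left out, and the inputs are random values, since only the shapes matter here.

```python
import numpy as np


def conv_forward_sketch(A_prev, W, pad, stride):
    """Naive convolution over a batch (no bias), written only to inspect shapes."""
    m, n_H_prev, n_W_prev, n_C_prev = A_prev.shape
    f, _, _, n_C = W.shape

    # Output size = floor((n + 2p - f) / s) + 1 for height and width.
    n_H = (n_H_prev + 2 * pad - f) // stride + 1
    n_W = (n_W_prev + 2 * pad - f) // stride + 1

    # Zero-pad only the height and width axes, not the batch or channel axes.
    A_prev_pad = np.pad(A_prev, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="constant")

    Z = np.zeros((m, n_H, n_W, n_C))
    for i in range(m):
        a_prev_pad = A_prev_pad[i]  # one padded example
        for h in range(n_H):
            for w in range(n_W):
                vert_start, horiz_start = h * stride, w * stride
                a_slice = a_prev_pad[vert_start:vert_start + f, horiz_start:horiz_start + f, :]
                for c in range(n_C):
                    # Element-wise product with one kernel, summed over all entries.
                    Z[i, h, w, c] = np.sum(a_slice * W[:, :, :, c])
    return A_prev_pad, Z


rng = np.random.default_rng(0)
A_prev = rng.standard_normal((2, 5, 7, 4))
W = rng.standard_normal((3, 3, 4, 8))
A_prev_pad, Z = conv_forward_sketch(A_prev, W, pad=1, stride=2)

print(A_prev_pad.shape)     # (2, 7, 9, 4)
print(A_prev_pad[0].shape)  # (7, 9, 4)
print(Z.shape)              # (2, 3, 4, 8)
print(Z[0].shape)           # (3, 4, 8)
```

The last two prints are the batch output and the output for a single example, which are the two cases I described above.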
I hope this helps you.
Regards,
Elemento