Hey @s_baghel_in,

It looks like the confusion is about how the convolution works. I would suggest reviewing the lectures once more, since I don’t see any issue here. Here is a short explanation that might help.

Let's use the same example that you mentioned:

Dimensions of A_prev = (2, 5, 7, 4)

Dimensions of A_prev_pad = (2, 7, 9, 4)

Dimensions of a_prev_pad = (7, 9, 4)

Dimensions of W = (3, 3, 4, 8) or (f, f, n_C_prev, n_C)

Dimensions of output = (3, 4, 8)

Using the generic formula for the output dimensions, which is:

$$\text{Output size} = \left\lfloor \frac{n + 2p - f}{s} \right\rfloor + 1$$

where $n$ = input size, $p$ = padding, $f$ = filter size, and $s$ = stride.

We can verify the output shape. Consider the height first: with `n = 5, p = 1, f = 3, s = 2`, we get `floor((5 + 2 - 3) / 2) + 1 = 3`. Then consider the width: with `n = 7, p = 1, f = 3, s = 2`, we get `floor((7 + 2 - 3) / 2) + 1 = 4`. Remember that `a_prev_pad` already includes the padding, so its height is `n + 2p = 7` and its width is `n + 2p = 9`, as listed above.
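If it helps, the formula can be checked with a few lines of Python. This is only a minimal sketch: the helper name `conv_output_size` is mine and is not part of the assignment, and integer division `//` plays the role of `floor` here because all the values are non-negative integers.

```python
def conv_output_size(n, p, f, s):
    """Output size along one dimension: floor((n + 2p - f) / s) + 1.

    n = input size, p = padding, f = filter size, s = stride.
    (Hypothetical helper for illustration, not from the assignment.)
    """
    return (n + 2 * p - f) // s + 1


# Values from the example above: pad = 1, filter size = 3, stride = 2
n_H = conv_output_size(n=5, p=1, f=3, s=2)  # height
n_W = conv_output_size(n=7, p=1, f=3, s=2)  # width
print(n_H, n_W)  # prints: 3 4
```

Note that `n` is the unpadded size here. If you start from the padded size instead (7 or 9), use `p = 0`, otherwise the padding gets counted twice.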

So, the height of the output is 3 and the width is 4. Since we have 8 kernels, the output has 8 channels. Therefore, the dimensions of the output are `(3, 4, 8)`. Here, I have assumed that by `output` you mean the output for a single example. With multiple examples, 2 in this case, the dimensions of the `output` will be `(2, 3, 4, 8)`.
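To tie all the shapes together, here is a rough sketch that derives them from the shapes of `A_prev` and `W` alone. The function names `padded_shape` and `conv_output_shape` are made up for this post (they are not from the assignment), and it only does the shape bookkeeping, not the actual convolution.

```python
def padded_shape(a_prev_shape, pad):
    """Shape of A_prev after zero-padding height and width by `pad` on each side."""
    m, n_H_prev, n_W_prev, n_C_prev = a_prev_shape
    return (m, n_H_prev + 2 * pad, n_W_prev + 2 * pad, n_C_prev)


def conv_output_shape(a_prev_shape, w_shape, pad, stride):
    """Shape of the conv output (m, n_H, n_W, n_C) for the whole batch."""
    m, n_H_prev, n_W_prev, n_C_prev = a_prev_shape
    f_h, f_w, w_C_prev, n_C = w_shape
    # The filter depth must match the number of input channels
    if w_C_prev != n_C_prev:
        raise ValueError("W's n_C_prev does not match A_prev's channels")
    n_H = (n_H_prev + 2 * pad - f_h) // stride + 1
    n_W = (n_W_prev + 2 * pad - f_w) // stride + 1
    return (m, n_H, n_W, n_C)


A_prev_shape = (2, 5, 7, 4)  # (m, n_H_prev, n_W_prev, n_C_prev)
W_shape = (3, 3, 4, 8)       # (f, f, n_C_prev, n_C)

print(padded_shape(A_prev_shape, pad=1))                          # (2, 7, 9, 4)
print(conv_output_shape(A_prev_shape, W_shape, pad=1, stride=2))  # (2, 3, 4, 8)
```

Dropping the first entry of each tuple gives the single-example shapes, `(7, 9, 4)` for `a_prev_pad` and `(3, 4, 8)` for the output.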

I hope this helps you.

Regards,

Elemento