For the course "CNN Example"


As the picture shows, Andrew said there are six filters for the transformation between image1 (32 * 32 * 3) and image2 (28 * 28 * 6).
So I think the shape of each filter should be 5 * 5 * 3, because the image has RGB channels.
Then, following the same reasoning, the shape of each filter between image2 and image3 (14 * 14 * 6) should be 2 * 2 * 6, with image2's number of channels as the value of the third dimension.

Is that right?

For a given image, f = 5 means we are using 5 filters (not necessarily of 5 by 5 dimensions). The filter size is our choice: 3 by 3, 5 by 5, or 7 by 7 (usually an odd number).

Regarding the 3rd dimension, you are right that it is 3 for the first step and 6 for the second.

Updated: Here, f = 5 means a 5 by 5 filter and Prof. Andrew explicitly mentioned the number of filters is 6.
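
To make the shape arithmetic concrete, here is a minimal sketch of the standard output-size formula, floor((n + 2p - f) / s) + 1. The function name `conv_output_shape` is made up for illustration, and the stride of 2 for the step from image2 to image3 is an assumption inferred from the 28 to 14 size change, not something stated in this thread.

```python
def conv_output_shape(height, width, f, num_filters, stride=1, pad=0):
    """Return (out_height, out_width, out_channels) for f-by-f filters.

    The number of output channels equals the number of filters,
    not the filter size f.
    """
    out_h = (height + 2 * pad - f) // stride + 1
    out_w = (width + 2 * pad - f) // stride + 1
    return out_h, out_w, num_filters


# image1 (32 * 32 * 3) -> image2: six filters, each 5 * 5 * 3
image2 = conv_output_shape(32, 32, f=5, num_filters=6)
print(image2)  # (28, 28, 6)

# image2 -> image3: 2 * 2 window, assumed stride 2, channel count unchanged
image3 = conv_output_shape(28, 28, f=2, num_filters=6, stride=2)
print(image3)  # (14, 14, 6)
```

So f only controls the height and width of the output; the 6 in 28 * 28 * 6 comes from the number of filters.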

Thanks for your reply, but I still don't understand. If "f = 5" means that the number of filters between image1 and image2 is five, then each filter outputs one channel of the image, so five filters output five channels in total. The shape of image2 would then be 28 * 28 * 5, which conflicts with 28 * 28 * 6. Why?

In the video, does f represent the number of filters or the filter dimensions? Can you share a link to the video?

The image is correct, f = 5 means a 5 by 5 filter and Prof. Andrew explicitly mentioned the number of filters is 6. Watch the video again.
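
If it helps, the shapes can be checked with a small NumPy experiment. This is only a naive sketch (stride 1, no padding, no bias or activation), with random values standing in for a real image and learned filters; `conv_forward` is a hypothetical helper name, not something from the course code.

```python
import numpy as np


def conv_forward(image, filters):
    """Naive convolution with stride 1 and no padding.

    image:   array of shape (H, W, C)
    filters: array of shape (f, f, C, n_filters)
    returns: array of shape (H - f + 1, W - f + 1, n_filters)
    """
    H, W, C = image.shape
    f, _, C_f, n_filters = filters.shape
    assert C == C_f, "filter depth must match the image's channel count"

    out = np.zeros((H - f + 1, W - f + 1, n_filters))
    for k in range(n_filters):          # each filter produces one output channel
        for i in range(H - f + 1):
            for j in range(W - f + 1):
                patch = image[i:i + f, j:j + f, :]
                out[i, j, k] = np.sum(patch * filters[:, :, :, k])
    return out


rng = np.random.default_rng(0)
image1 = rng.standard_normal((32, 32, 3))    # 32 * 32 * 3 input
filters = rng.standard_normal((5, 5, 3, 6))  # six filters, each 5 * 5 * 3
image2 = conv_forward(image1, filters)
print(image2.shape)  # (28, 28, 6)
```

Each 5 * 5 * 3 filter collapses all three input channels into a single 28 * 28 map, and stacking the six maps gives 28 * 28 * 6.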

OK, I got it, thanks for your reply.