As the picture shows, Andrew said there were six filters for the transformation between image1 (32 * 32 * 3) and image2 (28 * 28 * 6).
So, I think the shape of each filter should be 5 * 5 * 3, because the image has its RGB channels.
Then, just as I proposed before, the shape of each filter between image2 and image3 (14 * 14 * 6) should be 2 * 2 * 6, with image2’s number of channels as the value of the third dimension.
Is that right?
In a given image, f = 5 means we are using 5 filters (not necessarily of 5 by 5 dimensions). The filter size is our choice: 3 by 3, 5 by 5, or 7 by 7 (usually an odd number).
Regarding the 3rd dimension, you are right that it is 3 and 6.
Updated: Here, f = 5 means a 5 by 5 filter and Prof. Andrew explicitly mentioned the number of filters is 6.
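To make the shapes concrete, here is a small sketch of the size arithmetic. It assumes stride 1 and no padding for the convolution, and a 2 by 2 window with stride 2 for the image2 to image3 step, since those values reproduce the sizes in the picture. The function names are made up for illustration.

```python
def conv_output_shape(input_shape, f, n_filters, stride=1, pad=0):
    """Return (output_shape, filter_shape) for a convolution layer.

    Each filter spans all input channels, so its shape is (f, f, channels).
    Each filter produces one output channel, so output depth = n_filters.
    """
    h, w, c = input_shape
    out_h = (h + 2 * pad - f) // stride + 1
    out_w = (w + 2 * pad - f) // stride + 1
    return (out_h, out_w, n_filters), (f, f, c)


def pool_output_shape(input_shape, f, stride):
    """Return the output shape of a pooling layer.

    Assumption: the window slides over each channel separately,
    so the number of channels is unchanged.
    """
    h, w, c = input_shape
    return ((h - f) // stride + 1, (w - f) // stride + 1, c)


image2_shape, filter_shape = conv_output_shape((32, 32, 3), f=5, n_filters=6)
image3_shape = pool_output_shape(image2_shape, f=2, stride=2)

print(filter_shape)  # (5, 5, 3)
print(image2_shape)  # (28, 28, 6)
print(image3_shape)  # (14, 14, 6)
```

So f sets the height and width of each filter, the input depth sets the filter's third dimension, and the number of filters sets the output depth.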
Thanks for your reply, but I don’t understand. If “f = 5” means that the number of filters between image1 and image2 is five, then each filter outputs one page (one channel) of the image, so five filters output five pages in total, and the shape of image2 should be 28 * 28 * 5, which conflicts with 28 * 28 * 6. Why?
In the video, does f represent the number of filters or the filter dimensions? Can you share a link to the video?
The image is correct, f = 5 means a 5 by 5 filter and Prof. Andrew explicitly mentioned the number of filters is 6. Watch the video again.
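If it helps, the claim can be checked with a naive convolution in plain Python. This is only an illustrative sketch (random values, stride 1, no padding, hypothetical helper names), not the course's implementation. Each 5 by 5 by 3 filter produces one 28 by 28 page, so six filters give depth 6.

```python
import random


def conv_valid(image, filters):
    """Naive stride-1, no-padding convolution on nested lists.

    image:   H x W x C nested list
    filters: list of f x f x C nested lists
    returns: (H - f + 1) x (W - f + 1) x len(filters) nested list
    """
    H, W, C = len(image), len(image[0]), len(image[0][0])
    f = len(filters[0])
    out = []
    for i in range(H - f + 1):
        row = []
        for j in range(W - f + 1):
            pixel = []
            for filt in filters:  # one output value per filter
                total = 0.0
                for di in range(f):
                    for dj in range(f):
                        for c in range(C):  # sum over ALL input channels
                            total += image[i + di][j + dj][c] * filt[di][dj][c]
                pixel.append(total)
            row.append(pixel)
        out.append(row)
    return out


def random_volume(h, w, c):
    """Build an h x w x c nested list of random numbers."""
    return [[[random.random() for _ in range(c)] for _ in range(w)]
            for _ in range(h)]


random.seed(0)
image1 = random_volume(32, 32, 3)
filters = [random_volume(5, 5, 3) for _ in range(6)]  # six 5 x 5 x 3 filters

image2 = conv_valid(image1, filters)
shape = (len(image2), len(image2[0]), len(image2[0][0]))
print(shape)  # (28, 28, 6)
```

Passing only five of the filters, for example `conv_valid(image1, filters[:5])`, would give a 28 by 28 by 5 output instead, which is why the 28 * 28 * 6 in the picture implies six filters.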
OK, I got it, thanks for your reply.