Week 1, https://www.coursera.org/learn/convolutional-neural-networks/lecture/A9lXL/simple-convolutional-network-example (timestamp 3:12)
The first layer, which takes in the input image (39 * 39 * 3), has 10 filters of size 3. Does that mean 10 filters of 3 * 3 * 3? That would make sense, since the input has three channels.
But in the following layer, we have 20 filters of size 5.
Now, here is my question: for the second layer, do we have 20 filters of 5 * 5 * 10, or 20 filters of 5 * 5, each applied to every one of the 10 channels output by the previous layer?
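To make the two readings concrete, here is a minimal sketch that only counts weights (biases ignored). It assumes the usual convention for a standard convolution layer, where each filter spans all input channels; the function name `weight_count` and the variable names are my own, not from the lecture. The second reading is modelled as 20 two-dimensional 5 * 5 kernels with no channel dimension, which is just my interpretation of it.

```python
def weight_count(f, c_in, n_filters):
    """Weights in a conv layer whose filters each span all input channels.

    Each filter has shape (f, f, c_in); there are n_filters of them.
    Biases are left out to keep the comparison simple.
    """
    return f * f * c_in * n_filters


# Layer 1: 10 filters of size 3 on a 3-channel input -> filters of 3 * 3 * 3.
layer1_weights = weight_count(f=3, c_in=3, n_filters=10)

# Layer 2, reading A: 20 filters of 5 * 5 * 10 (each spans all 10 channels).
layer2_full_depth = weight_count(f=5, c_in=10, n_filters=20)

# Layer 2, reading B (assumed interpretation): 20 flat 5 * 5 kernels,
# reused on every channel, so the weights carry no channel dimension.
layer2_flat = 5 * 5 * 20

print("layer 1 weights:", layer1_weights)
print("layer 2, reading A:", layer2_full_depth)
print("layer 2, reading B:", layer2_flat)
```

Under the full-depth convention, the depth of each filter always equals the number of channels coming out of the previous layer, so the number of filters in one layer fixes the filter depth in the next.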
I hope my words can describe my confusion here. Any help is appreciated.