Please see picture below.
input: n_H^[L-1] * n_W^[L-1] * n_C^[L-1]
output: n_H^[L] * n_W^[L] * n_C^[L]
So the last dimension n_C^[L-1] of the input is the RGB channel count (3), but in the output n_C^[L] is the number of filters in that layer, even though in both cases they are called channels (or depths).
So the two “channels” have very different meanings. Is that right? Also, for the input channels, is RGB (3) the most common value? Are there any other possibilities for this value?
Yes: every internal layer of a ConvNet that is a “conv” layer has both input channels and output channels. You will soon learn that there are other types of internal layers in a ConvNet besides “conv” layers: there can be pooling layers and fully connected layers as well. But a convolution layer’s input always has a certain number of channels. If you’re dealing with the very first conv layer and the inputs are RGB images, there will typically be 3 input channels. If the inputs are greyscale images, there will be only 1 input channel. If they are CMYK images, or PNG images with an alpha channel, there may be 4 input channels.

The number of output channels is determined by the number of “filters” you define in that layer. Each filter matches the number of input channels, and each filter (when applied to its input) produces one output channel. Therefore the number of filters determines the number of output channels. The number of filters is a “hyperparameter”, meaning a design choice made by the system designer. If the first layer outputs 8 channels, then the second layer will have 8 input channels. And so forth …
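To make the shapes above concrete, here is a minimal NumPy sketch (not from the lectures; the 32x32 input size, 3x3 filter size, and 8 filters are assumed example values). It shows that each filter spans all input channels and produces exactly one output channel, so 8 filters give 8 output channels:

```python
import numpy as np

np.random.seed(0)

# Assumed example input: a 32x32 RGB image -> n_H_prev, n_W_prev, n_C_prev = 32, 32, 3
n_H_prev, n_W_prev, n_C_prev = 32, 32, 3
x = np.random.randn(n_H_prev, n_W_prev, n_C_prev)

# 8 filters, each 3x3, each matching the 3 input channels
n_filters, f = 8, 3
W = np.random.randn(f, f, n_C_prev, n_filters)

# "Valid" convolution, stride 1: output height/width shrink by f - 1
n_H = n_H_prev - f + 1   # 30
n_W = n_W_prev - f + 1   # 30
out = np.zeros((n_H, n_W, n_filters))
for k in range(n_filters):            # each filter -> one output channel
    for i in range(n_H):
        for j in range(n_W):
            # elementwise product over height, width, and ALL input channels
            out[i, j, k] = np.sum(x[i:i+f, j:j+f, :] * W[:, :, :, k])

print(out.shape)  # (30, 30, 8): 8 filters -> 8 output channels
```

Feeding this output into a second conv layer would mean that layer’s filters each have 8 input channels, matching the chaining described above.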
Prof Ng will discuss all this in more detail as we proceed through Week 1 of DLS Course 4.
“ Each filter matches the number of input channels and each filter (when applied to its input) produces one output channel”
@paulinpaloalto What do you mean by “match” here? Do you mean “element-wise multiplication”? I understand all the other parts of your answer, and thanks for that.
Prof Ng explains in the lectures what the atomic operation of convolution is. If you missed that, it would be a good idea to watch the lectures again from the beginning. It is elementwise multiplication between the filter and a particular position in the input (determined by the stride), spanning the height, width, and (input) channel dimensions, followed by summing those products and adding a bias term.
Yes, thanks for the explanation.