Lecture - One Layer of CNN - Notations: # of filters should not be = # of channels

It seems that the same notation, n_c^{[l]}, is used for both the # of channels and the # of filters.

As I recall, the # of filters depends on the various edges/patterns to be detected (vertical edge filter, horizontal edge filter, 70 degree edge filter …)

So it would seem that the Summary of Notations slide (around 13 min) may not be accurate.

It is correct that the number of filters is the number of output channels. That’s the point to keep clear: there are two sets of channels, input and output. The depth of each filter matches the number of input channels, and the total number of filters determines the number of output channels of a convolution layer.
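A quick way to see this is a naive convolution in numpy. The sizes below (a 6x6 input with 3 channels, two 3x3 filters) are made-up illustration values, not taken from the lecture:

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_in, f, n_c_in, n_c_out = 6, 3, 3, 2

x = np.random.rand(n_in, n_in, n_c_in)           # input: 6 x 6 x 3
filters = np.random.rand(n_c_out, f, f, n_c_in)  # each filter is 3 x 3 x 3

n_out = n_in - f + 1                             # valid convolution, stride 1
out = np.zeros((n_out, n_out, n_c_out))
for k in range(n_c_out):            # one output channel per filter
    for i in range(n_out):
        for j in range(n_out):
            # each filter spans ALL input channels at once,
            # so its depth must equal the number of input channels
            out[i, j, k] = np.sum(x[i:i+f, j:j+f, :] * filters[k])

print(out.shape)  # (4, 4, 2): two filters -> two output channels
```

Each filter collapses all 3 input channels into a single 4x4 map, and stacking the two maps gives the 4x4x2 output.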

I may be getting mixed up here. I thought:

  • for each filter, # of channels = # of colors (= 3) - oh - this would be just for Layer = 1

  • and # of filters = # of edge types or other shapes we want to detect

  • Ok so the output, in the example used, would have # of channels = # of filters (4 x 4 x 2 – where two 4x4x1 are stacked).

What about this?

  • a colour is a channel, but a channel does not have to be a color in general
  • a filter produces a channel, so # filters = # channels

Yes, that sounds right. The number of input channels is typically only 3 for the very first conv layer, when the input is an image with 3 color channels. Most layers after that will have more input channels.
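To make the chaining concrete, here is a sketch that traces shapes through two conv layers. The input size, filter sizes, strides, and filter counts are hypothetical values for illustration:

```python
# Hypothetical stack of conv layers: filter size f, stride s, padding p,
# and filter counts are made-up values, just to trace the channel counts.
layers = [
    dict(f=3, s=1, p=0, n_filters=10),
    dict(f=5, s=2, p=0, n_filters=20),
]

h = w = 64
n_c = 3  # only the first layer sees the 3 color channels
for l, cfg in enumerate(layers, start=1):
    # each filter in layer l has depth n_c^{[l-1]}
    filter_shape = (cfg["f"], cfg["f"], n_c)
    h = (h + 2 * cfg["p"] - cfg["f"]) // cfg["s"] + 1
    w = (w + 2 * cfg["p"] - cfg["f"]) // cfg["s"] + 1
    n_c = cfg["n_filters"]  # n_c^{[l]} = number of filters in layer l
    print(f"layer {l}: filters {filter_shape}, output {h} x {w} x {n_c}")
```

Note how layer 2’s filters are 5x5x10, because layer 1 produced 10 output channels: the output channels of one layer are the input channels of the next.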

Notice that the notation in that section of the lecture uses n_c^{[l-1]} as the number of input channels, that is to say the number of channels output by the previous layer l - 1. Then the number of output channels is n_c^{[l]} for layer l.

Here’s the relevant slide:

So the number of filters in layer l is n_c^{[l]}, which then becomes the number of output channels. Each filter has n_c^{[l-1]} as its third dimension, the number of input channels.
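One consequence of these shapes is the parameter count of a conv layer, which follows directly from the notation above: each of the n_c^{[l]} filters has f x f x n_c^{[l-1]} weights plus one bias. A minimal sketch (the example numbers are hypothetical):

```python
# Parameter count for one conv layer: each of the n_c filters has
# f * f * n_c_prev weights plus one bias term.
def conv_params(f, n_c_prev, n_c):
    return (f * f * n_c_prev + 1) * n_c

# e.g. 3x3 filters over a 3-channel input, with 10 filters:
print(conv_params(3, 3, 10))  # (27 + 1) * 10 = 280
```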