General Convolution Question about filters/kernels

So I was going through some of the assignments, and it seems like we use a 3x3 or 2x2 filter, with channel counts going upward of 8, 16, 32, etc…
But what exactly is inside the 3x3 filter or kernel? We learned that filters can identify vertical or horizontal features, but that's only 2 filters, so what about the remaining 6, 14, or 30? How are they defined? Are they just randomly generated numbers in the filter?

I also want to ask: how are all the kernels defined in TensorFlow?

This question just came up on another recent thread. Please have a look at this thread and it probably answers your question.

It is the same as when you write the algorithms directly in Python: the filters are randomly initialized and then learned through training.
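To make that concrete, here is a minimal NumPy sketch (not the course's actual code; the shapes and scale factor are illustrative assumptions) showing that a conv layer's filters start out as nothing but small random numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

# 8 filters of shape 3x3, operating on a 3-channel (RGB) input.
# Shape convention: (filter_h, filter_w, in_channels, out_channels).
filters = rng.standard_normal((3, 3, 3, 8)) * 0.01

# Before training, no filter is a "vertical edge detector" or anything
# else -- each 3x3 slice is just random noise.
print(filters[:, :, 0, 0])

# Training then updates every entry via backprop, conceptually:
#   filters -= learning_rate * dW
# and only through those updates do edge/texture detectors emerge.
```

Because each filter begins from a different random draw, they tend to converge to different feature detectors during training.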

@hien_quoc it is so interesting that we both came up with the same question almost at the same time! I guess “great minds think alike” ? :slight_smile:

Our neurons were working together from miles apart! I had some activations, and @paulinpaloalto helped with skip connections so I could train my neurons with transfer learning. :laughing:

Thanks for the help! So it seems like the more channels the better, as long as we have the computational resources to handle it.

Oh, sorry, I don’t think we addressed the question of the number of channels in the previous discussion. The number of filters you have in a given conv layer is a choice you make. As Prof Ng describes in the lectures, the typical pattern as you proceed forward through the layers is that the height and width dimensions reduce and the number of channels increases. The intuitive way to think of this is that the network is distilling feature information from the geometric information and that the complexity of the features increases through the layers.

Each filter is (we hope) learning to detect something different, because they start from a different set of random values. So in Prof Ng’s terminology, the numbers of output channels at each layer are “hyperparameters”, meaning values we need to choose, as opposed to “parameters”, e.g. the actual values of the individual weights and biases in the filters, which are automatically learned through back propagation.
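A small sketch of that shrinking-H/W, growing-channels pattern (the layer sizes 8/16/32 and the stride-2, no-padding convs are made-up choices, not a prescribed architecture):

```python
def conv_out(hw, f=3, stride=1, pad=0):
    # Standard conv output size: floor((n - f + 2p) / s) + 1
    return (hw - f + 2 * pad) // stride + 1

shape = (64, 64, 3)  # hypothetical input: a 64x64 RGB image
shapes = []
for n_filters in (8, 16, 32):  # channel counts are hyperparameters we pick
    h = conv_out(shape[0], f=3, stride=2)  # stride 2 roughly halves H and W
    shape = (h, h, n_filters)
    shapes.append(shape)

print(shapes)  # height/width shrink while channels grow layer by layer
```

Each step trades geometric resolution for more feature channels, which matches the "distilling features from geometry" intuition above.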