- image taken from MobileNet lecture video
The # of filters has me confused. When you convolve the 3x3x3 filter over the image, the resulting 4x4 grid has the values we want. If we make 5 of these filters, aren’t they just the same data repeated 5 times?
No, it’s not repeated data. At the start of training, the weights of each filter are randomly initialized, so each filter produces a different output. During backpropagation, each filter receives gradient updates only from its corresponding output channel. This repeats over the forward and backward passes of subsequent epochs, so the filters diverge from one another and the output channels stay distinct.
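A minimal NumPy sketch of the point above (the shapes, helper name, and seed are illustrative, not from the lecture): five independently initialized 3x3x3 filters applied to the same 6x6x3 input give five different 4x4 output channels, even before any training.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 6x6x3 input convolved "valid" with a 3x3x3 filter gives a 4x4 output.
image = rng.standard_normal((6, 6, 3))
filters = rng.standard_normal((5, 3, 3, 3))  # 5 randomly initialized filters

def conv2d_valid(img, f):
    """Valid cross-correlation of an HxWxC image with a kxkxC filter."""
    k = f.shape[0]
    out = np.empty((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+k, j:j+k, :] * f)
    return out

outputs = np.stack([conv2d_valid(image, f) for f in filters])
print(outputs.shape)  # (5, 4, 4): one 4x4 channel per filter

# Because each filter starts from different random weights,
# no two output channels are identical.
for a in range(5):
    for b in range(a + 1, 5):
        assert not np.allclose(outputs[a], outputs[b])
```

Training then pushes these already-different channels further apart, since each filter's gradients come only from its own output channel.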
Exactly! Because we “break symmetry” by initializing all the filters randomly, the filters end up learning different things during training. Prof Ng explains this in more detail as he goes through the lectures here in Week 1, and then in Week 4 he shows some very cool work that lets you see what is being learned in the hidden layers of the network, in the lecture titled “What are Deep ConvNets Learning?”. You can even preview that lecture after watching some of the Week 1 lectures and it will make at least some sense, but definitely look forward to it if you prefer to watch everything “in sequence”.