Why do we think adding more conv layers is better for the model?

This is just a general question.

Why is adding more conv layers better? When we look at the pictures, it seems really hard to tell what a picture contains (a dog or a cat). I know that adding layers reduces the image size, but I think we should try to have only a couple of layers, not more.

It’s a good question, but there isn’t an easy answer. There are a number of variables or “hyperparameter” choices one can make. For example, a “conv” layer preserves the image size if you use “same” padding and a “stride” of 1, so adding more layers does not necessarily reduce the height and width dimensions.
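To make the padding point concrete, here is a minimal sketch in plain Python that only does the size arithmetic. The helper name `conv_output_size` is made up for illustration, and the “same” rule follows the TensorFlow/Keras convention (output = ceil(n / stride)), which I am assuming is the framework you are using.

```python
import math


def conv_output_size(n, f, stride=1, padding="valid"):
    """Output height (or width) of a conv layer for an n x n input and an f x f filter.

    Hypothetical helper for illustration only; it does arithmetic, not convolution.
    """
    if padding == "same":
        # "same" padding (TensorFlow/Keras convention): output = ceil(n / stride)
        return math.ceil(n / stride)
    if padding == "valid":
        # no padding: output = floor((n - f) / stride) + 1
        return (n - f) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")


# Stack five 3x3 conv layers with stride 1 on a 64 x 64 input.
size = 64
sizes_same = []
for _ in range(5):
    size = conv_output_size(size, f=3, stride=1, padding="same")
    sizes_same.append(size)

size = 64
sizes_valid = []
for _ in range(5):
    size = conv_output_size(size, f=3, stride=1, padding="valid")
    sizes_valid.append(size)

print("same: ", sizes_same)   # height/width never changes
print("valid:", sizes_valid)  # shrinks by f - 1 = 2 per layer
```

With “same” padding and stride 1 the size stays at 64 no matter how many layers you stack, while “valid” padding loses 2 pixels per 3x3 layer.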

But generally speaking, the pattern is that the image size shrinks and the number of output “channels” grows as you go through the network. The idea is that each conv layer “distills” more information from the output of the previous layers. The way Prof Ng describes it is that you can think of the early layers as detecting low-level features like edges, while the later layers “integrate” that knowledge into the recognition of higher-level features, e.g. two edges which meet at a certain angle might be a cat’s ear. Then later, if you see two objects that look like cat ears positioned near each other, maybe you’ve found a cat’s head.
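Here is a rough sketch of that “smaller image, more channels” pattern, again as plain shape arithmetic. The layer stack and the `trace_shapes` helper are hypothetical, not taken from any course assignment: each conv is assumed to use “same” padding with stride 1, and each pooling layer is assumed to use a square window with stride equal to the window size and no padding.

```python
def trace_shapes(input_shape, layers):
    """Return the (height, width, channels) shape after each layer.

    Hypothetical helper for illustration. `layers` is a list of
    ("conv", n_filters) or ("pool", window) tuples.
    """
    h, w, c = input_shape
    shapes = [(h, w, c)]
    for kind, arg in layers:
        if kind == "conv":
            # "same" padding, stride 1: height/width unchanged,
            # output channels = number of filters
            c = arg
        elif kind == "pool":
            # pooling with window = stride = arg and no padding:
            # height and width are divided by arg (rounded down)
            h, w = h // arg, w // arg
        else:
            raise ValueError(f"unknown layer kind: {kind}")
        shapes.append((h, w, c))
    return shapes


# A made-up stack: conv -> pool, three times, doubling the filters each time.
layers = [
    ("conv", 16), ("pool", 2),
    ("conv", 32), ("pool", 2),
    ("conv", 64), ("pool", 2),
]

shapes = trace_shapes((64, 64, 3), layers)
for shape in shapes:
    print(shape)
```

Starting from a 64 x 64 x 3 image, this ends at 8 x 8 x 64: far fewer spatial positions, but each one carries many more feature channels.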

But then the question is “how many layers is enough”? There is no “one size fits all” answer to that question. It depends on the complexity of the task at hand, so you need to run experiments to figure out what will be sufficient for your particular problem. Of course, the first step is to look for a paper or some other example of someone who has solved a problem that is at least somewhat similar to yours. If you’re lucky enough to find one, then you start with a similar architecture and see what happens.

So as I said at the beginning, it’s a crucial question, but there is no simple answer. For now, the best idea is to “hold that thought” and listen to all that Prof Ng says and all that we learn in the assignments here in DLS Course 4. We’ll see quite a few examples of networks designed to solve different types of problems. What you will notice is that some of the solutions involve quite a few layers.