CNN Construction

How would you build a BAD CNN? What would make training results bad for a DL task?

Building a bad CNN is easy; the real question is how to build a good one, and on that point the specializations here can help you. Check them out, starting with the Deep Learning Specialization.


The answers could be many (a couple of these are sketched in code after this list):

  • Not having an appropriate loss function
  • Or activation function
  • Or image distribution (what I mean is, having, let's say, 64x64 images for training but test images of, let's say, 1000x1000)
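
To make the loss/activation bullets concrete, here is a minimal Keras sketch, assuming TensorFlow is installed; the tiny architecture and the 10-class setup are invented purely for illustration. Note that the input shape is fixed at 64x64, so test images of a different size (the third bullet) would have to be resized to match.

```python
import tensorflow as tf

def build_cnn(num_classes=10):
    # Tiny illustrative CNN; fixing the input at 64x64 means test images
    # of a different size (e.g. 1000x1000) must be resized to match.
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(64, 64, 3)),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        # "Bad": a linear head for classification, e.g. Dense(num_classes)
        # "Good": a softmax activation that outputs class probabilities
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

model = build_cnn()

# "Bad": a regression loss on a classification problem
# model.compile(optimizer="adam", loss="mse")

# "Good": a classification loss matched to the softmax output
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```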

I second what @gent.spah writes above. To make things even more complicated, an otherwise “good” CNN architecture can still do “bad” things if you let it iterate too many times during training, which can lead to overfitting: it learns so much about the training inputs that it can in effect reproduce them, but then cannot perform at the same level against new, previously unseen inputs, either in testing or in production. Learning too much and not learning enough are equally “bad” for a CNN.
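
For what it's worth, one common guard against letting a model iterate too long is early stopping on a validation metric. A minimal Keras sketch, assuming TensorFlow; `x_train` and `y_train` are placeholders for your own data and `model` for your own network:

```python
import tensorflow as tf

# Stop when the validation loss stops improving, and roll back to the
# best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,                  # tolerate 5 stagnant epochs before stopping
    restore_best_weights=True,
)

# x_train / y_train / model are placeholders for your own setup:
# model.fit(x_train, y_train,
#           validation_split=0.2,
#           epochs=200,          # an upper bound, not a target
#           callbacks=[early_stop])
```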


You make a great point here; there is no definitive, formulaic approach to this. When I first saw the question I was actually going to give some of my ideas about a “good” CNN architecture, but there are so many things affecting it. It's a science in itself, and I don't think I can be more elaborate or expansive than the specializations here.


A bad CNN is not the goal, but knowing the factors that can lead to poor training is essential. Avoiding the issues mentioned below might help (a short sketch of a few of them follows this list):

  • Insufficient data: too few examples per feature might lead to overfitting.
  • Degraded data quality: data with a lot of noise, incorrect labels, or missing values can lead to a poorly trained model.
  • An overly complex model might lead to overfitting.
  • Inappropriate loss functions, such as using a regression loss on a classification problem, may hinder your training.
  • Incorrect hyperparameters (such as a very poor learning rate or batch size), not using dropout or batch normalization, imbalanced features, etc.

By avoiding some of these common mistakes we can build a good CNN.
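
A minimal Keras sketch of where a few of these knobs live, assuming TensorFlow; the layer sizes, the 0.5 dropout rate, and the 1e-3 learning rate are illustrative placeholders, not a recipe:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 3)),
    tf.keras.layers.Conv2D(32, 3, padding="same"),
    tf.keras.layers.BatchNormalization(),   # stabilizes activations
    tf.keras.layers.ReLU(),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),           # regularization against overfitting
    tf.keras.layers.Dense(10, activation="softmax"),
])

# A sensible starting learning rate; a "very poor" value (far too large) can
# make training diverge, while one that is far too small can stall it.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```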


I think this is true only in a rather general sense. To the best of my knowledge, there is no exact recipe one can advise or follow in all use cases. A ‘too complex’ model for one data set might be perfectly fine for another; one problem’s ‘too complex’ is another problem’s ‘too simple’. The learning rate is sometimes varied across a single training run, so there isn’t a single ‘good’ value with all the rest being bad. Maybe a given CNN works beautifully in development on a powerful server but is too computationally expensive for deployment on an edge device.
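
As a small illustration of varying the learning rate across a single run, Keras exposes schedules such as exponential decay; the specific numbers here are arbitrary:

```python
import tensorflow as tf

# Decay the learning rate by 10% every 1000 optimizer steps.
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3,
    decay_steps=1000,
    decay_rate=0.9,
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
```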

It might seem an unhelpful answer to the original question, but I would say that a ‘good’ CNN is one that achieves its design objectives for a given operational situation, and you find that by experimentation and analysis. A ‘bad’ CNN doesn’t run quickly enough or produce sufficiently accurate results on the target deployment hardware.
