Overfitting in W4_A2_Ex2

Hi @Ben,

Although I passed the exercise (thanks for your help!), it surprised me that it gets an accuracy of 100% in training instead of 98.5%, with the logical penalty of 74% on the test set instead of the expected 80%.

What can be the reason?

Anyway, I tested the model with cat pictures and it works wonderfully…

Thanks

Hi Joan,

You answered the question yourself in the topic heading: overfitting. Well done! Neural networks are richly parameterized and therefore quite flexible. Even a single hidden layer network has the universal approximation property: it can approximate any nonlinear function to an arbitrary degree of accuracy given enough hidden units. So here we have a natural setup for overfitting: a very flexible model with a small number of training examples.
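To put a rough number on that flexibility, here is a quick sketch. The layer sizes are illustrative (chosen to resemble a small cat-classifier net), not necessarily your notebook's exact values:

```python
import numpy as np

def count_parameters(layer_dims):
    """Count all weights and biases in a fully connected network."""
    return sum(layer_dims[l] * layer_dims[l - 1] + layer_dims[l]
               for l in range(1, len(layer_dims)))

# Illustrative sizes: 64x64 RGB inputs flattened to 12288, small hidden layers.
layer_dims = [12288, 20, 7, 5, 1]
print(count_parameters(layer_dims))  # ~246,000 parameters

m_train = 209  # a training set in the low hundreds (illustrative)
# With vastly more parameters than training examples, the model can
# effectively memorize the training set: the overfitting setup above.
```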

You also know the “tell” for overfitting: the out-of-sample accuracy (test set predictions) is miserable compared to the training set accuracy. We can do a couple of things: first, make the training set bigger with more examples; second, apply a “regularization” technique. In the simplest case, regularization alters the cost function so that large weights are penalized in training. In other words, they are shrunk towards zero. This is a large focus of Course 2.
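As a small preview of Course 2, here is a minimal sketch of L2 regularization. The `parameters` dictionary layout and the `lambd` name follow the course's usual conventions, but treat the details as illustrative:

```python
import numpy as np

def compute_cost_with_l2(cross_entropy_cost, parameters, lambd, m):
    """Add the L2 penalty (lambd / 2m) * sum of squared weights to the cost."""
    L = len(parameters) // 2  # parameters holds W1, b1, ..., WL, bL
    l2_penalty = sum(np.sum(np.square(parameters["W" + str(l)]))
                     for l in range(1, L + 1))
    return cross_entropy_cost + (lambd / (2 * m)) * l2_penalty

# In backprop, the same penalty adds (lambd / m) * W[l] to each dW[l],
# which is what shrinks the weights towards zero at every update.
```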

So stay tuned, more fun to come. And keep up the great work!

Best,
Ken


The other thing worth saying here is that the reason your results differ from the ones shown in the notebook for the L-layer model is that you forgot to run the cell that defines layers_dims for the 4-layer net. So you ran the 4-layer training with the same 2-layer net definition used in the earlier part of the notebook. The results are slightly different from the 2-layer results because we use a different (more sophisticated) initialization method in the 4-layer case.
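For anyone who hits the same issue: the missing cell looks something like the sketch below (the exact sizes in your copy of the notebook may differ), and the initialization difference is roughly a fixed 0.01 scale versus a scale based on the previous layer's size:

```python
import numpy as np

# The cell that must be run before training the 4-layer model:
layers_dims = [12288, 20, 7, 5, 1]  # 4-layer net

# Sketch of the initialization difference for one weight matrix:
n_prev, n_curr = 12288, 20
W_2layer = np.random.randn(n_curr, n_prev) * 0.01             # fixed small scale
W_4layer = np.random.randn(n_curr, n_prev) / np.sqrt(n_prev)  # scaled by fan-in
```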

Hi Paul,

Exactly, this was the result of going through the notebook hastily without respecting the execution order of the steps. I repeated the operation carefully, giving the kernel time to execute each step, and the result was exactly as expected.

Many thanks for your explanation that clarified the reason behind the surprising result.


Hi, Joan.

That’s great news! Thank you for confirming the result.

Regards,
Paul