Overfitting in W4_A2_Ex2

Hi @Ben,

Although I passed the exercise (thanks for your help!), it surprised me that it gets an accuracy of 100% in training instead of 98.5%, with the logical penalty of 74% on the test set instead of the expected 80%.

What can be the reason?

Anyway, I tested the model with cat pictures and it works wonderfully…

Thanks

Hi Joan,

You answered the question yourself in the topic heading: overfitting. Well done! Neural networks are richly parameterized and therefore quite flexible. Even a single hidden layer network has the universal approximation property: it can approximate any nonlinear function to an arbitrary degree of accuracy given enough hidden units. So here we have a natural setup for overfitting: a very flexible model with a small number of training examples.
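To put a rough number on that flexibility, here is a quick sketch. The layer sizes are illustrative (chosen to resemble a small cat-classifier net), not necessarily your notebook's exact values:

```python
import numpy as np

def count_parameters(layer_dims):
    """Count all weights and biases in a fully connected network."""
    return sum(layer_dims[l] * layer_dims[l - 1] + layer_dims[l]
               for l in range(1, len(layer_dims)))

# Illustrative sizes: 64x64 RGB inputs flattened to 12288, small hidden layers.
layer_dims = [12288, 20, 7, 5, 1]
print(count_parameters(layer_dims))  # ~246,000 parameters

m_train = 209  # a training set in the low hundreds (illustrative)
# With vastly more parameters than training examples, the model can
# effectively memorize the training set: the overfitting setup above.
```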

You also know the “tell” for overfitting: the out-of-sample accuracy (test set predictions) is miserable compared to the training set accuracy. We can do a couple of things: first, make the training set bigger with more examples; second, apply a “regularization” technique. In the simplest case, regularization alters the cost function so that large weights are penalized in training. In other words, they are shrunk towards zero. This is a large focus of Course 2.
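As a small preview of Course 2, here is a minimal sketch of L2 regularization. The `parameters` dictionary layout and the `lambd` name follow the course's usual conventions, but treat the details as illustrative:

```python
import numpy as np

def compute_cost_with_l2(cross_entropy_cost, parameters, lambd, m):
    """Add the L2 penalty (lambd / 2m) * sum of squared weights to the cost."""
    L = len(parameters) // 2  # parameters holds W1, b1, ..., WL, bL
    l2_penalty = sum(np.sum(np.square(parameters["W" + str(l)]))
                     for l in range(1, L + 1))
    return cross_entropy_cost + (lambd / (2 * m)) * l2_penalty

# In backprop, the same penalty adds (lambd / m) * W[l] to each dW[l],
# which is what shrinks the weights towards zero at every update.
```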

So stay tuned, more fun to come. And keep up the great work!

Best,
Ken


The other thing worth saying here is that the reason your results differ from the ones shown in the notebook for the L-layer model is that you forgot to run the cell that defines layers_dims for the 4-layer net. So you ran the 4-layer training with the same 2-layer net definition used in the earlier part of the notebook. The results are slightly different from the 2-layer results because we use a different (more sophisticated) initialization method in the 4-layer case.
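For anyone who hits the same issue: the missing cell looks something like the sketch below (the exact sizes in your copy of the notebook may differ), and the initialization difference is roughly a fixed 0.01 scale versus a scale based on the previous layer's size:

```python
import numpy as np

# The cell that must be run before training the 4-layer model:
layers_dims = [12288, 20, 7, 5, 1]  # 4-layer net

# Sketch of the initialization difference for one weight matrix:
n_prev, n_curr = 12288, 20
W_2layer = np.random.randn(n_curr, n_prev) * 0.01             # fixed small scale
W_4layer = np.random.randn(n_curr, n_prev) / np.sqrt(n_prev)  # scaled by fan-in
```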

Hi Paul,

Exactly, this was the result of going through the notebook hastily without respecting the execution order of the steps. I repeated the operation carefully, giving the kernel time to execute each step, and the result was exactly as expected.

Many thanks for your explanation that clarified the reason behind the surprising result.


Hi, Joan.

That’s great news! Thank you for confirming the result.

Regards,
Paul