Relation between Accuracy and Cost in Week 4 Assignment

paulinpaloalto · August 29, 2022, 5:18pm

Very interesting! It’s great that you are doing this type of investigation. There’s always something interesting to learn. I agree that it doesn’t seem logical that the test cost would increase in the way that you show. Let’s dig in and see what more we can learn here!

First there are a couple of general things to say:

1. Yes, you’re right that all this is overfitting. And maybe the bigger problem is that this whole situation is pretty unrealistic, in that the dataset is way too small to give a generalizable solution to a problem this complex. Here’s a thread which discusses that point in a bit more detail and shows that the dataset is very carefully curated to give results as good as the ones we see here.
2. The relationship between cost and accuracy is not as straightforward as you might think at first glance. The high level point is that accuracy is quantized, but the cost isn’t. What I mean by that is illustrated by the example of a sample with a label of 1. If the \hat{y} value after 1000 iterations is 0.52, then the prediction is already correct. If the \hat{y} value after 2000 iterations is 0.75, then the cost will be lower, but the accuracy is still the same. Of course it could also go the other direction: going from 0.75 to 0.52 in a later iteration will give you a higher cost with the same accuracy, which is what seems to be happening with the test data in your case.
3. It’s really only accuracy that we actually care about. The actual J value doesn’t tell you that much, as we see from item 2, but there is still something puzzling in the behavior here that is worth investigating.
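To make the 0.52 versus 0.75 example concrete, here is a minimal sketch. It assumes the usual binary cross-entropy loss and a 0.5 prediction threshold; the helper names `cross_entropy` and `predict` are just made up for illustration and are not the functions from the assignment.

```python
import numpy as np

def cross_entropy(y, y_hat):
    """Binary cross-entropy loss for one sample (assumed loss function)."""
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def predict(y_hat, threshold=0.5):
    """Turn a sigmoid output into a 0/1 prediction (assumed threshold of 0.5)."""
    return int(y_hat > threshold)

y = 1                      # true label of the sample
early, late = 0.52, 0.75   # y_hat after 1000 and after 2000 iterations

for name, y_hat in [("1000 iterations", early), ("2000 iterations", late)]:
    print(f"{name}: y_hat={y_hat:.2f}  "
          f"prediction={predict(y_hat)}  "
          f"loss={cross_entropy(y, y_hat):.4f}")
```

Both values give a prediction of 1, so the sample counts as correct either way, but the loss drops from roughly 0.65 to roughly 0.29. Swap the order of the two values and you get the reverse: the cost goes up while the accuracy stays put.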

As far as I can see so far, your code looks completely correct. You could have simplified it a bit by using np.mean to compute the accuracy values. It would also be more efficient to rewrite the code to pass in the iteration numbers where you want the checkpoints and then you’d only have to run the training once, but I totally get why you did it the way you did: my way would be a big rewrite to the core functions, which just messes everything up and introduces more complexity.
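For the np.mean suggestion, something along these lines would work. This is only a sketch: the `accuracy` helper and the tiny arrays are invented for illustration, and I'm assuming the labels and predictions are 0/1 arrays of shape (1, m).

```python
import numpy as np

def accuracy(predictions, labels):
    """Fraction of samples where the 0/1 prediction matches the label."""
    return np.mean(predictions == labels)

# Made-up data: 4 samples, shape (1, m)
labels = np.array([[1, 0, 1, 1]])
y_hat = np.array([[0.52, 0.30, 0.75, 0.40]])

# Assumed threshold of 0.5 to convert sigmoid outputs to predictions
predictions = (y_hat > 0.5).astype(int)

print(accuracy(predictions, labels))
```

The comparison gives a boolean array, and np.mean treats True as 1 and False as 0, so the mean is just the fraction of matching entries.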

Ok, none of the above really answers anything yet, but this is just the next step after your interesting steps above. More investigation required. Next I want to dig in a bit and actually look in more detail at the test cost numbers.
