In the lecture, the cost function is the sum of the losses divided by the number of examples.
So I was thinking of dividing by two after tf.reduce_sum(…).
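For reference, the cost from the lecture that I mean is (if I understand it correctly):

$$J = \frac{1}{m}\sum_{i=1}^{m}\mathcal{L}\big(\hat{y}^{(i)}, y^{(i)}\big)$$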
It’s only graded as correct if you don’t divide.
Yes, it is true that the cost is the average of the loss over all the examples in the training set. The problem is that things get a bit more complicated once you switch to supporting Minibatch Gradient Descent. The way they handle that is to have the lower level cost function return the sum of the losses rather than the average. They then accumulate those sums over all the minibatches, and when they get to the end of the full epoch (all the minibatches), they divide the accumulated sum by m to get the overall average. The reason the average doesn’t work at the level of the minibatch is that the minibatches are not all the same size when the minibatch size does not evenly divide the total training set size. You can’t take the average of the averages in that case, right? The math doesn’t work …
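Here is a minimal sketch of that idea in TensorFlow. This is not the assignment’s actual code, and the function and variable names (sum_of_losses, epoch_total, etc.) are made up for illustration; the point is just that the per-minibatch helper returns a sum via tf.reduce_sum, and the epoch loop divides the accumulated total by m exactly once at the end:

```python
import tensorflow as tf

# Hypothetical per-minibatch helper (names made up): note tf.reduce_sum, not
# tf.reduce_mean, so it returns the SUM of the per-example losses.
def sum_of_losses(labels, logits):
    per_example = tf.keras.losses.categorical_crossentropy(
        labels, logits, from_logits=True)
    return tf.reduce_sum(per_example)

# Toy data: m = 5 examples split into minibatches of size 2, 2 and 1.
m = 5
labels = tf.one_hot([0, 1, 2, 1, 0], depth=3)
logits = tf.random.normal((m, 3))
minibatches = [(labels[i:i + 2], logits[i:i + 2]) for i in range(0, m, 2)]

# Accumulate the summed losses over the epoch, then divide once by m.
epoch_total = 0.0
for mb_labels, mb_logits in minibatches:
    epoch_total += sum_of_losses(mb_labels, mb_logits)
epoch_cost = epoch_total / m   # true average loss over all m examples

# For comparison: averaging the per-minibatch averages gives a different
# (wrong) answer here, because the last minibatch only has 1 example.
mean_of_means = sum(
    float(tf.reduce_mean(
        tf.keras.losses.categorical_crossentropy(l, x, from_logits=True)))
    for l, x in minibatches) / len(minibatches)

print(float(epoch_cost), mean_of_means)  # these generally differ
```

Notice that epoch_cost weights every example equally, while mean_of_means gives the single example in the last minibatch as much weight as a whole minibatch of two, which is why the two numbers disagree.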
Take a look at the details of how the cost is handled in the Optimization assignment in C2 W2 if you missed that level of detail the first time through.
The instructions specifically tell you to use reduce_sum, for the reason that I explained in my previous reply.