Questioning how epoch_cost is computed

In the Week 3 assignment, part 3.3 (Train the Model), the model function computes epoch_cost as:
epoch_cost += minibatch_cost / minibatch_size

And I think it should be computed as:
num_minibatch = np.ceil(num_examples / minibatch_size)
epoch_cost += minibatch_cost / num_minibatch

My reasoning: compute_cost already averages minibatch_cost over the examples in the minibatch. So when accumulating epoch_cost, each minibatch cost should be averaged over the number of minibatches, not over the minibatch size.
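To make the difference concrete, here is a minimal sketch of both accumulation schemes. The helper names and all numbers (1080 examples, minibatch size 64, a constant cost of 1.0 per minibatch) are hypothetical and not taken from the assignment, and math.ceil stands in for np.ceil so the snippet needs no third-party packages.

```python
import math


def epoch_cost_by_num_minibatches(minibatch_costs):
    """Accumulate each per-minibatch mean cost divided by the number of minibatches."""
    num_minibatches = len(minibatch_costs)
    epoch_cost = 0.0
    for minibatch_cost in minibatch_costs:
        epoch_cost += minibatch_cost / num_minibatches
    return epoch_cost


def epoch_cost_by_minibatch_size(minibatch_costs, minibatch_size):
    """Accumulate each per-minibatch mean cost divided by the minibatch size."""
    epoch_cost = 0.0
    for minibatch_cost in minibatch_costs:
        epoch_cost += minibatch_cost / minibatch_size
    return epoch_cost


# Hypothetical setup: every minibatch reports a mean cost of exactly 1.0.
num_examples = 1080
minibatch_size = 64
num_minibatches = math.ceil(num_examples / minibatch_size)
minibatch_costs = [1.0] * num_minibatches

cost_a = epoch_cost_by_num_minibatches(minibatch_costs)
cost_b = epoch_cost_by_minibatch_size(minibatch_costs, minibatch_size)
print(num_minibatches, cost_a, cost_b)
```

With these made-up numbers, the first scheme reports an epoch cost on the same scale as the per-minibatch cost, while the second reports it scaled by num_minibatches / minibatch_size. One caveat: if the last minibatch is smaller than the rest, the mean of minibatch means is only approximately the per-example mean.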

I'd welcome a discussion on this.


Hi @Damon, this is an interesting question.

I haven’t given much thought to the epoch-cost function when using mini-batches, since what matters for training is that the mini-batch cost function is correct (it’s the one used in back-prop).

I’m also interested in hearing other opinions about this.

I think @Damon is right, at least if you expect the epoch cost to stay within the same order of magnitude as the values returned by compute_cost (in the assignment, minibatch_size and num_minibatch happen to be very close).

Although the shape of the learning curve doesn’t change if you scale it by a constant factor, dividing by minibatch_size doesn’t seem to make sense :thinking:
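The point about the curve's shape can be checked with a small sketch. The per-epoch costs and the two sizes below are made up for illustration, and the helper names are hypothetical. Dividing by minibatch_size instead of the number of minibatches multiplies every epoch cost by the same factor, so the curves coincide once each is normalized by its first value.

```python
def scale_curve(costs, factor):
    """Multiply every point of a learning curve by a constant factor."""
    return [cost * factor for cost in costs]


def normalize(costs):
    """Express each point relative to the first point of the curve."""
    return [cost / costs[0] for cost in costs]


# Hypothetical epoch costs, as if averaged over the number of minibatches.
per_epoch_costs = [2.0, 1.5, 1.0, 0.5]

# Hypothetical sizes: dividing by minibatch_size instead rescales every
# epoch cost by num_minibatches / minibatch_size.
num_minibatches = 17
minibatch_size = 64
rescaled_costs = scale_curve(per_epoch_costs, num_minibatches / minibatch_size)

print(normalize(per_epoch_costs))
print(normalize(rescaled_costs))
```

Only the absolute values differ between the two curves, which is why the plot looks reasonable even though the reported numbers are not on the scale of compute_cost.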
