Week 02 - 6.1 Mini-Batch Gradient Descent → Why no zigzag cost

After running mini-batch gradient descent, the cost is supposed to zigzag as in the lecture. However, this is what I got:

Is it because the batch size is relatively small that the zigzag is not so big, making it look like a smooth line?

Thank you

There is no guarantee that the cost will oscillate if you use small mini-batches. It can happen, but it’s not guaranteed to happen. It all depends on the properties of your data and the model you have specified. Of course the values of other hyperparameters like the learning rate are influential here as well.
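To see this concretely, here is a minimal, hypothetical sketch of mini-batch gradient descent on a simple linear-regression problem (the data, learning rate, and batch size below are made-up illustration values, not from the course assignment). How much the per-batch cost zigzags depends on the noise in the data, the batch size, and the learning rate; with fairly clean data the curve can look almost smooth.

```python
import numpy as np

# Illustrative setup (all values are assumptions for this sketch).
rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=1024)  # mildly noisy targets

w = np.zeros(3)      # parameters to learn
lr = 0.05            # learning rate
batch_size = 64
costs = []           # cost recorded per mini-batch, not per epoch

for epoch in range(20):
    perm = rng.permutation(len(X))           # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        err = Xb @ w - yb
        costs.append(float(err @ err) / (2 * len(idx)))  # batch cost
        w -= lr * (Xb.T @ err) / len(idx)                # gradient step

print(f"first batch cost: {costs[0]:.4f}, last batch cost: {costs[-1]:.6f}")
```

Plotting `costs` for different batch sizes and noise levels (e.g. raising the `0.1` noise scale) shows the oscillation growing or shrinking accordingly, which matches the point above: the zigzag is a property of the data and hyperparameters, not a guaranteed feature of mini-batch training.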
