Noisy decrease in costs in L-layer network implementation

I have implemented an L-layer neural network after working through the exercises, but after training it on the same dataset as the exercises (DL Course 1, Week 4, "Deep Neural Network - Application" notebook) I am getting a lot of noise in the costs. I thought that as the parameters get closer to the minimum, the learning rate was causing it to overshoot in some cases, but after a few tweaks I saw that in some runs the noise appears in the middle of the graph (roughly iterations 700 to 2000 out of 2500), and outside that range the cost decreases almost smoothly.
Would appreciate any help regarding this.

Cost vs. iteration graph: [plot attached]

Jupyter notebook file: CatDog Neural Network.ipynb (230.6 KB)

It is possible that this is because of the learning rate, as you say. You could use TensorFlow's callbacks to decrease the learning rate when certain conditions are met; see the Callback documentation.
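
For example, here is a minimal sketch of ReduceLROnPlateau cutting the learning rate when the training loss stops improving. The model, layer sizes, and data below are just placeholders, not the ones from your notebook:

```python
import numpy as np
import tensorflow as tf

# Toy stand-ins for the flattened image data; shapes are illustrative only.
X = np.random.rand(200, 12288).astype("float32")
y = np.random.randint(0, 2, size=(200, 1)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(20, activation="relu"),
    tf.keras.layers.Dense(7, activation="relu"),
    tf.keras.layers.Dense(5, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.0075),
              loss="binary_crossentropy")

# Halve the learning rate if the training loss has not improved for 50 epochs.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="loss", factor=0.5, patience=50, min_lr=1e-5)

model.fit(X, y, epochs=500, verbose=0, callbacks=[reduce_lr])
```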

There are also more sophisticated algorithms that manage the learning rate dynamically for you. But in general, cost surfaces are incredibly complex: there is never any guarantee that you'll get nice, smooth, monotonic convergence from any particular random starting point with any particular algorithm.
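
Adam is one example: it adapts the effective step size of each parameter based on the history of its gradients. Here is a rough numpy sketch of a single Adam update, purely for illustration (the function and variable names are my own, not the course's):

```python
import numpy as np

def adam_update(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: the per-parameter step size adapts to the gradient history."""
    m = beta1 * m + (1 - beta1) * grad           # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2      # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                 # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # scaled, smoothed update
    return w, m, v
```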

Here’s a paper from Yann LeCun’s group about visualizing cost surfaces and dealing with those issues.


I don't know much about TensorFlow usage, so I'll try reducing the learning rate as a function of num_iterations instead.
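
Something like this, for example, called at the top of each training iteration in place of a fixed learning rate (the initial rate and decay constant here are just placeholder values):

```python
def decayed_learning_rate(iteration, learning_rate0=0.0075, decay_rate=0.001):
    """Inverse-time decay: the rate shrinks smoothly as the iteration count grows."""
    return learning_rate0 / (1 + decay_rate * iteration)

# Quick look at how the rate falls over a 2500-iteration run.
for i in (0, 500, 1000, 1500, 2000, 2500):
    print(i, decayed_learning_rate(i))
```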

Prof Ng will introduce a couple of algorithms for reducing the learning rate as the number of iterations grows in Course 2, Week 2, so please stay tuned for that. E.g., here's his lecture about Learning Rate Decay, and it is also covered in the C2 Week 2 programming assignment.
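
For reference, the schedules covered there look roughly like this (a sketch, not the assignment's exact code): exponential decay, and a "staircase" schedule that only drops the rate once per fixed interval of epochs.

```python
import numpy as np

def exponential_decay(learning_rate0, epoch_num, decay_base=0.95):
    # alpha = decay_base^epoch_num * alpha0
    return (decay_base ** epoch_num) * learning_rate0

def schedule_lr_decay(learning_rate0, epoch_num, decay_rate, time_interval=1000):
    # Inverse-time decay applied in steps: the rate only drops once per interval.
    return learning_rate0 / (1 + decay_rate * np.floor(epoch_num / time_interval))
```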

I’ll check that out! Thanks!