Vanishing / Exploding Gradients : week1

Hi Sir,

The professor explained that when the weights are very small, the gradients can become even smaller, so gradient descent takes lots of tiny steps to reach the minimum.

But the professor didn't cover the other case: what happens when the gradient explodes (when the weights are large)? How does that make training difficult?

Again, I'd advise you to experiment for yourself and see what happens with vanishing or exploding gradients :slight_smile:
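If you want a quick way to run that experiment, here is a minimal sketch (the function name and the scaled-identity weight setup are my own toy construction, not from the course) showing how the activation norm through a deep stack of linear layers shrinks toward zero when the per-layer weight scale is below 1 and blows up when it is above 1:

```python
import numpy as np

def deep_linear_activations(weight_scale, depth=50, width=4, seed=0):
    """Propagate a random input through `depth` linear layers whose
    weights are identity matrices scaled by `weight_scale`, and return
    the norm of the final activation. The backward pass multiplies by
    the same factors, so gradients behave the same way."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)
    W = weight_scale * np.eye(width)  # every layer shares this scaled identity
    for _ in range(depth):
        x = W @ x
    return np.linalg.norm(x)

# Scale < 1: norm shrinks by 0.5 per layer -> vanishes after 50 layers.
print(deep_linear_activations(0.5))
# Scale > 1: norm grows by 1.5 per layer -> explodes after 50 layers.
print(deep_linear_activations(1.5))
```

Each layer multiplies the signal's norm by `weight_scale`, so over 50 layers you get a factor of `weight_scale**50`, which is why even modest deviations from 1 compound into vanishing or exploding magnitudes.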


Hi Sir,

I can see that if the weights are too large, the cost-function curve does not come down; it oscillates up and down, but after some time it starts to converge.

So weights that are too large lead to divergence, right? And divergence means the curve does not come down, it just keeps going up and down?

In the case of exploding gradients, the accumulation of large derivatives makes the model very unstable and incapable of effective learning. The large changes in the model's weights create an unstable network, and at extreme values the weights become so large that they cause numerical overflow, resulting in NaN weight values that can no longer be updated.
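That overflow-to-NaN behaviour can be reproduced in a few lines. The sketch below (a toy setup of my own, not from the course) runs gradient descent on the loss L(w) = w², where the gradient is 2w. With a too-large learning rate the update w ← w·(1 − 2·lr) has |1 − 2·lr| > 1, so w oscillates with growing magnitude (the "up down up down" divergence above), overflows float32 to infinity, and the next update computes inf − inf, which is NaN:

```python
import numpy as np

def diverging_descent(lr, steps=300):
    """Gradient descent on L(w) = w**2 in float32.
    Returns the final weight; NaN signals that the run blew up."""
    w = np.float32(1.0)
    for _ in range(steps):
        grad = np.float32(2.0) * w          # dL/dw for L = w**2
        w = np.float32(w - lr * grad)       # update: w * (1 - 2*lr)
        if np.isnan(w):                     # inf - inf produced NaN
            return w
    return w

print(diverging_descent(np.float32(1.5)))   # oscillates, overflows -> nan
print(diverging_descent(np.float32(0.1)))   # |1 - 2*lr| < 1 -> converges to ~0
```

Once a weight is NaN, every subsequent gradient and update involving it is also NaN, which is why training cannot recover; in practice this is why techniques like gradient clipping and careful weight initialization are used.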