When does MSE become nan?

I have a doubt: I want to know when MSE becomes nan. What I know is that this can happen when the parameters are initialised to 0, but in my case the parameters were in the range 0 to 1.

Training ran normally until epoch 98, which showed an MSE of 487, and then at epoch 99 the MSE turned to nan.

I am sorry again if it is a silly question.

Thank you

I’m not familiar with that course, but in general:

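“nan” means “not a number”. This can happen if the training doesn’t converge and the cost explodes to infinity; once values overflow to infinity, operations such as infinity minus infinity give nan. Or it can happen if you get an invalid operation, like dividing zero by zero, or taking the log of zero and then multiplying the resulting -inf by zero.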
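
As a small illustration of how an overflow turns into nan, here is a sketch in plain Python floats (the values are made up for the demonstration, not taken from your run):

```python
import math

inf = float("inf")

# A product that exceeds the float64 range overflows to inf.
overflowed = 1e308 * 10.0
print(math.isinf(overflowed))            # True

# Arithmetic on inf that has no defined result gives nan.
print(math.isnan(inf - inf))             # True
print(math.isnan(inf * 0.0))             # True

# Once a nan appears, it propagates through later arithmetic.
print(math.isnan(float("nan") + 1.0))    # True
```

So a cost can show up as a large number in one epoch, as inf shortly after, and as nan after that.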

Often the cause would be too large a learning rate.
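
To make that concrete, here is a minimal sketch of gradient descent on a toy linear regression. The data, the `train` helper, and the two learning rates (0.1 and 0.01) are all assumptions chosen for illustration; the rate at which your own model diverges depends on your data and feature scaling.

```python
import math

def train(lr, steps=1000):
    """Fit y = w*x + b with gradient descent on MSE.

    Returns (step at which MSE first became nan, or None; last MSE seen).
    """
    # Toy data (hypothetical): 50 points on the line y = 3x + 2.
    xs = [10.0 * i / 49 for i in range(50)]
    ys = [3.0 * x + 2.0 for x in xs]
    n = len(xs)
    w, b = 0.0, 0.0
    mse = float("nan")
    for step in range(steps):
        errs = [(w * x + b) - y for x, y in zip(xs, ys)]
        mse = sum(e * e for e in errs) / n
        if math.isnan(mse):
            return step, mse
        # Gradients of the MSE with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errs, xs)) / n
        grad_b = 2.0 * sum(errs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return None, mse

print(train(0.1))   # too large for this data: the MSE ends up as nan
print(train(0.01))  # small enough for this data: the MSE stays finite
```

With the larger rate every update overshoots the minimum, so the error grows each step until it overflows to inf and then becomes nan. If you see the MSE climbing before it turns to nan, try a smaller learning rate or normalise your features.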

Isn’t a learning rate of 0.02 higher? I remember using a learning rate of 0.001.

Yes, 0.02 is greater than 0.001.