I have a question: when does the MSE become nan? As far as I know, this happens when the parameters are initialised to 0, but in my case the parameters were in the range 0 to 1.

Training ran normally until epoch 98, which showed an MSE of 487, and then at epoch 99 the MSE turned to nan.

I'm not familiar with that course, but in general:

"nan" means "not a number". This can happen if the training doesn't converge and the cost explodes to infinity. Or it can happen if you hit an invalid operation, like divide-by-zero or taking the log of zero.
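To make that concrete, here is a minimal sketch in plain Python of how an overflow turns into nan. The variable names are just for illustration and are not from any course code.

```python
import math

inf = float("inf")

# A product too large for a 64-bit float silently becomes inf.
overflow = 1e200 * 1e200

# Arithmetic with no defined result gives nan.
nan_from_sub = inf - inf
nan_from_mul = 0.0 * inf

# Once a value is nan, everything computed from it is nan too.
nan_spreads = nan_from_sub + 1.0

print(overflow, nan_from_sub, nan_from_mul, nan_spreads)
print(math.isnan(nan_spreads))
```

Note that plain Python raises an exception for `1.0 / 0.0` and `math.log(0)`, while NumPy and most deep learning frameworks return `inf` or `-inf` (usually with a warning) and keep going, which is how such values can end up in a loss.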

Often the cause would be too large a learning rate.
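Here is a rough sketch of the learning-rate effect on a toy problem, assuming plain gradient descent on a one-parameter linear model fitted with MSE. The data, the `train` function, and the learning rates are all made up for illustration and are not taken from your assignment.

```python
import math

def train(lr, epochs=500):
    """Fit y = w * x by gradient descent; return the MSE seen at each epoch."""
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0 * x for x in xs]   # the true weight is 2
    w = 0.5                      # initial weight in the range 0 to 1
    history = []
    for _ in range(epochs):
        errors = [w * x - y for x, y in zip(xs, ys)]
        mse = sum(e * e for e in errors) / len(xs)
        history.append(mse)
        # Gradient of the MSE with respect to w.
        grad = 2.0 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
        w -= lr * grad
    return history

stable = train(lr=0.01)
unstable = train(lr=0.5)

print("final MSE with lr=0.01:", stable[-1])
print("final MSE with lr=0.5: ", unstable[-1])

# Find the first epoch at which the MSE is nan.
first_nan = next(i for i, m in enumerate(unstable) if math.isnan(m))
print("first nan at epoch", first_nan)
```

With the small learning rate the MSE shrinks towards zero. With the large one each update overshoots the minimum, so the MSE grows every epoch, overflows to inf, and then becomes nan once the weight update computes inf minus inf. If your MSE was climbing in the epochs before 99, trying a smaller learning rate is a reasonable first step.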