Hi Fellow Learners / Mentors,

I am currently in course 3, working on my own complete implementation, based on what I have learned so far in the specialization and also trying to apply the lessons of course 3 wherever I can.

Very frequently I am encountering an issue in numpy while trying different values of hyperparameters like learning rate and regularization parameters. It says,

**FloatingPointError: invalid value encountered in long_scalars**

Based on what I have found so far on this topic, most likely a floating-point overflow is happening in some of the computations, producing `NaN` values. I was using the np.float64 dtype for my implementation. I tried np.float128 as well, but that did not help much.
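In case it helps anyone debugging the same symptom: NumPy can be told to raise an exception at the first invalid or overflowing operation instead of silently propagating `NaN`, which makes it much easier to find the exact line where things go wrong. A minimal sketch (the overflowing multiplication here is just a stand-in for whatever computation in your network misbehaves):

```python
import numpy as np

def detect_invalid():
    """Demonstrate np.errstate raising at the first bad float op."""
    a = np.array([1e308])  # near the float64 maximum
    try:
        # Raise FloatingPointError instead of silently producing inf/NaN.
        with np.errstate(over="raise", invalid="raise", divide="raise"):
            a * a  # overflows float64 -> exception raised right here
        return False
    except FloatingPointError:
        return True
```

Wrapping your forward/backward pass in such an `np.errstate` block during debugging pinpoints the operation; you can remove it once the root cause is fixed.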

I should mention that I am initializing my parameters exactly as we did in the assignments, with `random` and `he` initialization. This issue does not always happen. For example, in my mini-batch gradient descent implementation, it occurs when the number of epochs is around 2000 with a learning rate of 0.0007, and it goes away when the number of epochs is increased to around 5000; I even get a decent outcome from the training.
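For reference, since overly large initial weights are another common cause of overflow, this is roughly how He initialization from the assignments looks; the layer sizes in the usage below are placeholders, and the function name is my own:

```python
import numpy as np

def initialize_parameters_he(layer_dims, seed=3):
    """He initialization: scale Gaussian weights by sqrt(2 / n_prev)
    so activations neither explode nor vanish with ReLU layers."""
    rng = np.random.default_rng(seed)
    parameters = {}
    for l in range(1, len(layer_dims)):
        parameters["W" + str(l)] = (
            rng.standard_normal((layer_dims[l], layer_dims[l - 1]))
            * np.sqrt(2.0 / layer_dims[l - 1])
        )
        parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))
    return parameters

# e.g. a 2-input network with one hidden layer of 4 units:
# params = initialize_parameters_he([2, 4, 1])
```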

One takeaway is that we have to try multiple values for the number of epochs and all of our other hyperparameters. That said, I want to understand whether others have also encountered this issue while training a model.

I have never seen this problem while working on any of the assignments. Is this a common observation in practice, and what can I do to prevent it?

Thanks & Regards,

Chandan.

Hello @chandan1986.sarkar ,

While I don't know exactly what you are trying to do, I made a quick Google search and found several references to dividing by zero. Could this be the problem?

Hi @carloshvp, thanks a lot for the response and for taking the time to look into this. Yes, I have also searched a bit on Stack Overflow and found several ZeroDivisionError references. I am building a binary classification project from scratch, essentially taking motivation from what we learned in Courses 1 and 2. So far I have made some improvements:
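A frequent source of those divide-by-zero / invalid-value reports in this kind of project is the cross-entropy cost: when a sigmoid output saturates to exactly 0 or 1, `np.log` produces `-inf`, and the subsequent arithmetic turns that into `NaN`. A small sketch of the usual guard, clipping activations away from the endpoints (the epsilon value here is my own choice, not from the course):

```python
import numpy as np

def binary_cross_entropy(A, Y, eps=1e-12):
    """Cross-entropy cost with activations clipped away from 0 and 1,
    so np.log never sees an exact 0 (which would yield -inf/NaN)."""
    A = np.clip(A, eps, 1 - eps)
    m = Y.shape[1]
    return -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
```

With the clip in place, even a fully saturated prediction produces a large-but-finite cost instead of `NaN`, so training diagnostics stay readable.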

- I identified a mistake in my cost function and in how I average the costs for mini-batch gradient descent. Earlier I was dividing each mini-batch's cost by the total number of examples m, which is most likely wrong.
- I corrected a few other smaller mistakes as well and am in a better situation now, but the issue has not gone away completely. By better, I mean that there are fewer instances of this error and my cost plots now look identical to those from the assignments. But I still get this issue from time to time in some cases.
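On the first point, the convention I believe the course follows is to divide each mini-batch's summed loss by that batch's own size, and then, if you want one number per epoch, weight each batch by its size (the last batch is often smaller than the rest). A sketch under that assumption; the function name is mine:

```python
def epoch_average_cost(batch_cost_sums, batch_sizes):
    """Combine per-batch *summed* losses into one per-example epoch average.

    Dividing the grand total of losses by the total number of examples
    is equivalent to weighting each batch's average cost by its size,
    which handles a smaller final batch correctly. Dividing every batch
    by the full training-set size m would systematically underestimate
    the cost."""
    return sum(batch_cost_sums) / sum(batch_sizes)
```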

I initially thought that, because I am iterating the gradient descent process many times (e.g., 5000 - 10000), some of my weights were becoming almost 0. But then I realized we also do that in the assignments in quite a few cases and never face this issue. So it is very much possible that there are still some mistakes in my implementation that I overlooked.

So I was wondering whether anyone has faced similar problems before, and if so, what the outcome of the troubleshooting was.

Thanks & Regards,

Chandan.