Hello, in [C3_W1_LabPart2_Exercise2], where you compute the best F1 score, there is something that still confuses me after spending nearly an hour trying to figure it out.
The issue is this: in the very first iteration over the values of epsilon, that value of epsilon gives predictions of zero for all of p_val, and that's fine. But it corresponds to tp = 0 and fp = 0, so the precision becomes 0 divided by 0. When I implemented this with my own code, using if conditions and such, it always raised a division-by-zero error. When I used the code given in the exercise hints, it worked and gave a nan value for the precision in that iteration, and since recall is also zero (tp = 0), the F1 score came out as nan too. Can anyone please explain why it worked, and what nan is? Sorry for the lengthy post.
I have also attached the code and the results for clarification.
Hey @Abbas_Mohammed,
An amazing catch indeed. I tried all kinds of experiments dividing 0 by 0, and Python always gave me an error. However, this code seems to be magical: it doesn't give an error, just nan. As to what nan is, you can check out this post. As to why this code works, let me tag some other mentors, and they will help us understand it.
Hey @rmwkwok and @SamReiswig, can you please help us understand why this code works? Thanks in advance.
Cheers,
Elemento
Hi!
The difference is caused by how Python and NumPy handle division.
In the backend, NumPy uses a C/C++ library for its calculations.
This causes differences as shown below.
The code from the hint in the lab uses np.sum() to create tp, fp, fn.
This will convert them into the numpy.int64 type.
If we don’t use a NumPy type in these calculations, we get the following ZeroDivisionError:
Hope this helps clear things up!
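For anyone reading without the screenshots, here is a minimal sketch of the difference described above. The arrays are made up for illustration; the only thing taken from the lab hint is that tp and fp come from np.sum(). The exact integer type that np.sum() returns can depend on the platform and NumPy version, so the sketch prints it rather than assuming it.

```python
import numpy as np

# NumPy integer scalars, as np.sum() would return for tp and fp
tp = np.sum(np.array([0, 0, 0]))
fp = np.sum(np.array([0, 0, 0]))

# NumPy true division follows floating-point rules: 0/0 gives nan
# (normally with a RuntimeWarning, silenced here for the demo)
with np.errstate(invalid="ignore"):
    prec_numpy = tp / (tp + fp)

# Plain Python ints raise an exception instead
try:
    prec_python = 0 / (0 + 0)
except ZeroDivisionError:
    prec_python = None  # marker showing the exception was raised

print(type(tp).__name__, prec_numpy)
print(prec_python)
```

So the same arithmetic either continues with nan or stops with an exception, depending only on whether the operands are NumPy scalars or built-in Python numbers.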
Hey @SamReiswig,
That’s intriguing. Thanks a lot for sharing this, it really helped clear things up.
Cheers,
Elemento
Thank you @Elemento and @SamReiswig
I have the same error, but I don’t see a solution for getting rid of the 0s. For me, predictions is always 0, so the best values cannot be determined.
Hey @Gabor_Farkas,
Getting predictions of all 0s in the first iteration, when epsilon = min(p_val), is not an error; this is how it is supposed to work. From the next iteration onwards, you will get non-zero values.
Please read the thread that Abbas created more carefully. He is pointing out that, despite the predictions being all 0s, calculating prec doesn’t raise an error and instead gives nan, for which Sam has posted the explanation. I hope this helps.
Cheers,
Elemento
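To make the first-iteration behaviour concrete, here is a rough sketch of a threshold search. It is not the lab's code (that is not shown in this thread): y_val, step_size, the data values, and starting best_f1 at 0 are all assumptions made for illustration. Only the idea of building a predictions array and using np.sum() for tp, fp and fn comes from the hints mentioned above.

```python
import numpy as np

# Made-up validation data, for illustration only
p_val = np.array([0.05, 0.10, 0.40, 0.60, 0.90])
y_val = np.array([1, 1, 0, 0, 0])  # 1 = anomaly (low probability)

best_epsilon, best_f1 = 0.0, 0.0
step_size = (p_val.max() - p_val.min()) / 1000  # assumed step

with np.errstate(invalid="ignore"):  # hide the 0/0 RuntimeWarning
    for epsilon in np.arange(p_val.min(), p_val.max(), step_size):
        predictions = p_val < epsilon  # all False when epsilon == min(p_val)

        tp = np.sum((predictions == 1) & (y_val == 1))
        fp = np.sum((predictions == 1) & (y_val == 0))
        fn = np.sum((predictions == 0) & (y_val == 1))

        prec = tp / (tp + fp)  # nan on the first iteration (0/0)
        rec = tp / (tp + fn)
        f1 = 2 * prec * rec / (prec + rec)

        # Any comparison with nan is False, so a nan F1 never replaces best_f1
        if f1 > best_f1:
            best_f1, best_epsilon = f1, epsilon

print(best_epsilon, best_f1)
```

The nan from the first iteration is simply skipped by the `f1 > best_f1` check, which is why the search still finds a sensible threshold in later iterations.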
Dear @Elemento ,
This might indeed be a different error, as I always end up with 0s and therefore always fail the validation test. I’ll send you my code for validation.
Best,
Gábor
In case anyone is here in 2024: I was working on this assignment and came across an issue similar to the one mentioned here. I kept getting errors because of dividing by zero. It only worked after I changed my approach to match what the Hints said (shown above, using np.sum), but I’m not sure why.
Edit: I forgot to add that in my original implementation I was doing tp += 1, and so on for those 3 variables, in their own loop. I changed things so that I build a predictions array like in the hints and use np.sum.
I wonder if my old method was using regular Python (non-NumPy) variables when dividing, and that somehow caused a divide-by-zero error instead of just giving nan.
Edit 2: I found the numpy.nansum page in the NumPy v2.1 Manual, which made me wonder whether the regular numpy.sum uses nan for 0 values. I didn’t see anything saying so in the documentation. However, testing this in the terminal gives some interesting results:
>>> b = np.int32(0) / np.int32(0)
>>> np.sum([b, 5, 6])
np.float64(nan)
...
>>> b = np.int32(5) / np.int32(0)
>>> np.sum([b, 5, 6])
np.float64(inf)
...
>>> b = 0 / 5
>>> np.sum([b, 5, 6])
np.float64(11.0)
...
>>> b = 0 / 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
My guess is that in some iterations division by zero happens, and the difference is that with Python numbers an error is thrown, whereas with NumPy numbers the result is just returned as nan.
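If you want to keep the tp += 1 style of counting, one option is to guard the division yourself. Here is a small sketch under that assumption; f1_from_counts is a hypothetical helper (not from the lab), and returning 0.0 when there are no true positives is a convention chosen here, not something the lab prescribes.

```python
def f1_from_counts(tp, fp, fn):
    """Hypothetical helper: F1 from counts, 0.0 when there are no true positives."""
    if tp == 0:
        # Precision and recall are 0 or 0/0 here; treat F1 as 0 by convention
        return 0.0
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

# Counters built with += 1 are plain Python ints, so an unguarded 0 / 0
# would raise ZeroDivisionError rather than produce nan
tp = fp = fn = 0
for pred, truth in zip([0, 0, 1], [0, 1, 1]):  # made-up predictions and labels
    if pred == 1 and truth == 1:
        tp += 1
    elif pred == 1 and truth == 0:
        fp += 1
    elif pred == 0 and truth == 1:
        fn += 1

f1 = f1_from_counts(tp, fp, fn)
print(type(tp).__name__, f1)
```

With the guard in place, an iteration with no true positives yields 0.0 instead of an exception, so a "keep the best F1 so far" comparison behaves much as it does with nan.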




