Programming Assignment: Anomaly Detection inquiry

Hello, in [C3_W1_LabPart2_Exercise2], where you compute the best F1 score, there is something that confuses me despite spending nearly an hour trying to figure it out.
The issue is this: for the very first epsilon value in the loop, the predictions are zero for every p_val, and that’s fine. But that corresponds to tp = 0 and fp = 0, so the precision becomes 0 divided by 0. When I implemented this with my own code, using if conditions and such, it always raised a division-by-zero error. When I used the code given in the exercise hints, it worked and gave a nan value for the precision in that iteration, and since recall is also zero (tp = 0), the F1 score came out as nan too. Can anyone please explain why it worked, and what nan is? Sorry for the lengthy post.
I also attached the code and the results for better clarification.


Hey @Abbas_Mohammed,
An amazing catch indeed. I tried all kinds of experiments with dividing 0 by 0, and Python always gave me an error. However, this code seems to be magical: it doesn’t raise an error, it just gives nan. As to what nan is, you can check out this post. As to why this code works, let me tag some other mentors, and they will help us understand it.

Hey @rmwkwok and @SamReiswig, can you please help us understand why this code works? Thanks in advance.

Cheers,
Elemento


Hi!

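The difference is caused by how Python and NumPy handle division.
Under the hood, NumPy does its calculations in compiled C/C++ code, which follows floating-point rules: 0 divided by 0 produces nan (“not a number”) with a warning rather than raising an exception.
Plain Python raises an exception for the same division, which is the difference described below.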

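The code from the hint in the lab uses np.sum() to create tp, fp, and fn.
This makes them NumPy integers (numpy.int64) rather than plain Python ints, so dividing 0 by 0 gives nan with a RuntimeWarning instead of an error.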
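As a minimal sketch of the NumPy side (the arrays here are made-up stand-ins, not the lab’s data), counting with np.sum() yields NumPy integers, and dividing them 0 by 0 produces nan rather than an exception:

```python
import numpy as np

# Hypothetical stand-ins for the lab's arrays: nothing is flagged as an anomaly.
predictions = np.array([False, False, False, False])
y_val = np.array([0, 1, 0, 0])

# np.sum() returns NumPy integer scalars, not plain Python ints.
tp = np.sum(predictions & (y_val == 1))  # value 0
fp = np.sum(predictions & (y_val == 0))  # value 0

# Silence the RuntimeWarning NumPy would otherwise print for 0 / 0.
with np.errstate(divide="ignore", invalid="ignore"):
    prec = tp / (tp + fp)

print(type(tp).__name__, prec)  # prec is nan
```

Without the np.errstate block, the division still returns nan under NumPy’s default settings; it just prints a RuntimeWarning as well.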

If we don’t use a NumPy type in these calculations, for example when tp and fp are plain Python ints, Python raises a ZeroDivisionError instead.
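For comparison, here is a small sketch of the plain-Python case. The variable names mirror the lab’s, but the values are simply set by hand as an assumption of what a hand-written counting loop would produce:

```python
# Plain Python ints, as you would get from counting in a for loop with if conditions.
tp, fp = 0, 0

try:
    prec = tp / (tp + fp)
except ZeroDivisionError as err:
    # Python refuses to divide by zero and raises instead of returning nan.
    prec = None
    message = str(err)
    print("ZeroDivisionError:", message)
```

If you want to keep plain Python ints, one option is to guard the division yourself, for example by skipping the iteration when tp + fp is 0.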

Hope this helps clear things up!


Hey @SamReiswig,
That’s intriguing. Thanks a lot for sharing this, it really helped clear things up.

Cheers,
Elemento


Thank you @Elemento and @SamReiswig

I have the same error, but I don’t see how to get rid of the 0s. For me, predictions is always 0, so the best values cannot be determined.

Hey @Gabor_Farkas,
Getting all 0s as predictions in the first iteration, when epsilon = min(p_val), is not an error. This is how it is supposed to work. From the next iteration onwards, you will get non-zero values.

Please read the thread that Abbas created more carefully. He is pointing out that despite predictions being all 0s, calculating prec doesn’t raise an error and instead gives nan, for which Sam has posted the explanation. I hope this helps.
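To illustrate why the nan from the first iteration does no harm, here is a rough sketch of a threshold search. The function name select_threshold_sketch, the step size, and the sample data are my assumptions and may not match the lab’s code exactly. The key point is that any comparison with nan evaluates to False, so a nan F1 never replaces the best score found so far.

```python
import numpy as np

def select_threshold_sketch(y_val, p_val):
    """Hypothetical threshold search; not the lab's reference solution."""
    best_epsilon, best_f1 = 0.0, 0.0
    step = (p_val.max() - p_val.min()) / 1000  # assumed step size

    for epsilon in np.arange(p_val.min(), p_val.max(), step):
        predictions = p_val < epsilon  # all False when epsilon == min(p_val)

        tp = np.sum(predictions & (y_val == 1))
        fp = np.sum(predictions & (y_val == 0))
        fn = np.sum(~predictions & (y_val == 1))

        # 0 / 0 yields nan here; errstate only hides the warnings.
        with np.errstate(divide="ignore", invalid="ignore"):
            prec = tp / (tp + fp)
            rec = tp / (tp + fn)
            f1 = 2 * prec * rec / (prec + rec)

        if f1 > best_f1:  # False whenever f1 is nan, so nan is skipped
            best_f1, best_epsilon = f1, epsilon

    return best_epsilon, best_f1

# Made-up validation data: the two low-probability points are the anomalies.
y_val = np.array([0, 0, 0, 1, 1, 0])
p_val = np.array([0.9, 0.8, 0.85, 0.01, 0.02, 0.7])
best_epsilon, best_f1 = select_threshold_sketch(y_val, p_val)
print(best_epsilon, best_f1)
```

If predictions stay all 0s on every iteration rather than only the first, the comparison between p_val and epsilon is the first thing worth checking.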

Cheers,
Elemento

Dear @Elemento ,
This might indeed be a different error, as I always end up with 0s and therefore always fail the validation test. I’ll send you my code for validation.

Best,
Gábor