Programming Assignment: Anomaly Detection inquiry

Abbas_Mohammed · August 21, 2022, 1:02pm

Hello. In [C3_W1_LabPart2_Exercise2], where we compute the best F1 score, there is something that confuses me despite spending nearly an hour trying to figure it out.
The issue is this: for the very first value of epsilon, the predictions are all zeros for p_val, and that’s fine. But that corresponds to tp = 0 and fp = 0, so the precision is 0 divided by 0. When I implemented this with my own code using if conditions and such, it always raised a division-by-zero error. When I used the code given in the exercise hints, it worked and gave a nan value for the precision in that iteration, and the F1 score also came out as nan (recall is zero too, since tp = 0). Can anyone please explain why it worked and what nan is? Sorry for the lengthy post.
I also attached the code and the results for better clarification.

Elemento · August 22, 2022, 4:56am

Hey @Abbas_Mohammed,
An amazing catch indeed. I tried all kinds of experiments with dividing 0 by 0, and Python always gave me an error. However, this code seems to be magical: it doesn’t give an error, just nan. As to what nan is, you can check out this post. As to why this code works, let me tag some other mentors, and they will help us understand it.

Hey @rmwkwok and @SamReiswig, can you please help us understand why this code works? Thanks in advance.

Cheers,
Elemento

SamReiswig · August 22, 2022, 12:42pm

Hi!

The difference is caused by how Python and NumPy handle division.
In the backend, NumPy uses C/C++ code for its calculations.
This leads to the difference in behaviour described below.

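The code from the hint in the lab uses np.sum() to create tp, fp, fn.
This will convert them into the numpy.int64 type. Dividing NumPy scalars 0 by 0 follows floating-point rules: the result is nan with a RuntimeWarning, not an exception.

If we don’t use a NumPy type in these calculations, plain Python integers are divided and Python raises a ZeroDivisionError.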
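A minimal sketch of that contrast. The arrays below are made up for illustration and are not the lab's data, and the exact integer type returned by np.sum can vary by platform, so the code only checks that it is a NumPy integer.

```python
import numpy as np

# Hypothetical toy data: no point is flagged as an anomaly.
y_val = np.array([0, 0, 1, 0])
predictions = np.array([0, 0, 0, 0])

# np.sum returns NumPy integer scalars, not plain Python ints.
tp = np.sum((predictions == 1) & (y_val == 1))
fp = np.sum((predictions == 1) & (y_val == 0))
print(type(tp), isinstance(tp, np.integer))

# NumPy scalars: 0 / 0 evaluates to nan. By default NumPy also emits a
# RuntimeWarning; np.errstate silences it here to keep the output clean.
with np.errstate(invalid="ignore"):
    prec_numpy = tp / (tp + fp)
print(prec_numpy)  # nan

# Plain Python ints: the same division raises an exception instead.
python_error = None
try:
    prec_python = int(tp) / (int(tp) + int(fp))
except ZeroDivisionError as err:
    python_error = str(err)
print(python_error)  # division by zero
```

So counters built with a Python loop (tp = 0, then tp += 1) stay plain ints and hit the exception, while counts produced by np.sum do not.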

Hope this helps clear things up!

Elemento · August 23, 2022, 5:37pm

Hey @SamReiswig,
That’s intriguing. Thanks a lot for sharing this, it really helped clear things up.

Cheers,
Elemento

Abbas_Mohammed · September 8, 2022, 11:23pm

Thank you @Elemento and @SamReiswig

Gabor_Farkas · September 27, 2022, 9:05am

I have the same error, but I don’t see the solution for getting rid of the 0s. For me, predictions is always 0, so the best values cannot be determined.

Elemento · September 27, 2022, 9:35am

Hey @Gabor_Farkas,
Getting all 0s as predictions in the first iteration, when epsilon = min(p_val), is not an error; this is how it is supposed to work. From the next iteration onwards, you will get non-zero values.
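A small sketch of why the first iteration flags nothing, assuming the lab compares with a strict `p_val < epsilon` and steps epsilon in increments of (max - min) / 1000. Both are assumptions here, and the p_val values are hypothetical.

```python
import numpy as np

# Hypothetical validation-set probabilities.
p_val = np.array([0.02, 0.30, 0.45, 0.001, 0.50])

# Assumed step size between candidate thresholds.
step_size = (p_val.max() - p_val.min()) / 1000

# First candidate: epsilon equals the smallest probability, so nothing
# is strictly below it and every prediction is 0.
epsilon = p_val.min()
first_count = int(np.sum(p_val < epsilon))
print(first_count)  # 0

# Next candidate: the smallest probability now falls below the threshold.
second_count = int(np.sum(p_val < epsilon + step_size))
print(second_count)  # 1
```

If predictions stay all 0 for every epsilon, the cause lies elsewhere, for example in how the comparison or the predictions array is built.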

Please read the thread that Abbas created more carefully. He is pointing out that despite the predictions being all 0s, calculating prec doesn’t give an error and instead gives nan, for which Sam has posted the explanation. I hope this helps.

Cheers,
Elemento

Gabor_Farkas · September 27, 2022, 10:26am

Dear @Elemento ,
This might indeed be a different error, as I always end up with 0s and therefore always fail the validation test. I’ll send you my code for validation.

Best,
Gábor

YodaKenobi · September 30, 2024, 7:07pm

In case anyone is here in 2024: I was working on this assignment and came across an issue similar to the one mentioned here. I kept getting errors because of dividing by zero. Only after I changed my approach to match what the Hints said (shown above, using np.sum) did it work, but I’m not sure why.

Edit: I forgot to add that in my original implementation I was doing tp += 1, and so on for those 3 variables, in their own loop. I changed things so that I made a predictions array like in the hints and used np.sum.

I wonder if my old method was using regular Python (non-NumPy) variables when dividing, and that somehow caused a divide-by-zero error instead of just giving nan.

Edit 2: I found numpy.nansum in the NumPy v2.1 Manual, which led me to wonder whether the regular numpy.sum uses nan for 0 values. I didn’t see anything saying so in the documentation. However, testing this in the terminal gives some interesting results:

>>> b = np.int32(0) / np.int32(0)
>>> np.sum([b, 5, 6])
np.float64(nan)
...
>>> b = np.int32(5) / np.int32(0)
>>> np.sum([b, 5, 6])
np.float64(inf)
..
>>> b = 0 / 5
>>> np.sum([b, 5, 6])
np.float64(11.0)
...
>>> b = 0 / 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero

My guess is that in some iterations division by zero happens, and the difference is that with Python numbers an error is thrown, whereas with NumPy numbers the result is just encapsulated in a nan.
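One more piece that may explain why the nan is harmless when picking the best F1: any comparison with nan is False, so a check like `f1 > best_f1` simply skips that iteration. A rough sketch, where f1_from_counts is a hypothetical helper and the counts are made up, not taken from the lab:

```python
import numpy as np

def f1_from_counts(tp, fp, fn):
    """F1 from counts; nan when precision or recall is undefined."""
    tp, fp, fn = np.float64(tp), np.float64(fp), np.float64(fn)
    # Silence the RuntimeWarning NumPy would emit for 0 / 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        prec = tp / (tp + fp)
        rec = tp / (tp + fn)
        return 2 * prec * rec / (prec + rec)

best_f1 = 0.0
# (tp, fp, fn) for two hypothetical epsilon values.
for counts in [(0, 0, 3), (2, 1, 1)]:
    f1 = f1_from_counts(*counts)
    # nan > best_f1 is False, so the nan iteration never becomes the best.
    if f1 > best_f1:
        best_f1 = f1

print(np.isnan(f1_from_counts(0, 0, 3)))  # True
print(best_f1)
```

With plain Python ints the loop would stop at the first 0 / 0 instead of carrying a nan forward.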
