Computation of t_p, f_p, t_n (threshold detection in week 1, lab 2, exercise 2)

Dear colleagues,

I am not sure how to compute the t_p, f_p and t_n values.
Do I have to sum up the detected and non-detected values?

I am stuck. Please give me a hint on
how to compute the values t_p, f_p and t_n.

Thanks,

Juergen

Hi, @Juergen_Geiser!

Some comments about how to think through computing t_p, f_p and t_n:

The instructions for Exercise 2 explain what tp, fp, and fn are:

  • 𝑡𝑝 is the number of true positives: the ground truth label says it’s an anomaly and our algorithm correctly classified it as an anomaly.
  • 𝑓𝑝 is the number of false positives: the ground truth label says it’s not an anomaly, but our algorithm incorrectly classified it as an anomaly.
  • 𝑓𝑛 is the number of false negatives: the ground truth label says it’s an anomaly, but our algorithm incorrectly classified it as not being anomalous.

So, taking tp as an example: tp is the number of true positives, and a value is a true positive if the ground truth label says it’s an anomaly, and our prediction correctly classified it as an anomaly.

The select_threshold() function is given these parameters:

  • y_val - an ndarray of the ground truth labels
  • p_val - an ndarray of our predictions

The instructions for Exercise 2 also explain:

  • Recall that if an example 𝑥 has a low probability 𝑝(𝑥)<𝜀, then it is classified as an anomaly.

How can you use this info to find the number of true positives: the number of values where the ground truth says it is an anomaly and the probability also says it is an anomaly?

Once you have your idea of what you need to calculate, think about how you can implement it in code. If you get stuck on ideas for the coding, try clicking on the green “Click for hints” below the exercise. If that is still not enough, there will be some further hints you can click on for more detailed help on specific parts of the coding you might be stuck on.

Dear Wendy,

Many thanks, it works perfectly, also with the fast coding style in Python.

By the way, I learned Python with a more, I would say, “extended” coding style (for-loops etc.).
So I looked at the short, compressed commands and worked out what they do,
e.g.:

Fast coding with:

predictions = (p_val < epsilon)
(it creates a Boolean array with the same shape as p_val: True, which counts as 1, where p_val < epsilon, and False, which counts as 0, otherwise)

or

fp = sum((predictions == 1) & (y_val == 0))
(it counts all entries of the array, which has the same dimension as predictions, where predictions == 1 and y_val == 0)
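
To check my understanding, here is a minimal sketch with made-up toy values (not the lab data). The array contents and the epsilon value are assumptions chosen only for illustration, and I use np.sum instead of the built-in sum, which gives the same counts here:

```python
import numpy as np

# Toy values, made up for illustration (not the lab data).
p_val = np.array([0.9, 0.05, 0.4, 0.01, 0.7])  # probabilities p(x)
y_val = np.array([0, 1, 0, 0, 1])              # ground truth: 1 = anomaly
epsilon = 0.1                                  # assumed threshold

# Element-wise comparison: True where p(x) < epsilon (flagged as anomaly).
predictions = p_val < epsilon

# "&" combines two Boolean arrays element-wise; np.sum counts the True entries.
tp = np.sum((predictions == 1) & (y_val == 1))  # flagged, and really an anomaly
fp = np.sum((predictions == 1) & (y_val == 0))  # flagged, but not an anomaly
fn = np.sum((predictions == 0) & (y_val == 1))  # not flagged, but an anomaly

print(tp, fp, fn)
```

With these toy values, the second example is a true positive, the fourth is a false positive, and the fifth is a false negative.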

At least I found out how they work.

But could you recommend a good book, PDF file or online sheet where I could find such very
short Python commands? That would also improve my Python (I think I have to learn
these very short, compressed commands; they are really very helpful).

Many thanks again. I have now also found the tips related to the exercises; they are very helpful.

Best wishes

Juergen


That’s great news, @Juergen_Geiser!

Good observations, too. The “fast coding” techniques you point out are ways to operate on a whole array or matrix, rather than looping through and operating on each value individually. Vectorized approaches like this typically run faster, in addition to being more concise. These come in very handy for machine learning where we use lots of matrices of data (tensors, arrays, …).
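
As a rough sketch of that equivalence, here is the same false-positive count written both ways. The function names, the random toy data, and the threshold of 0.1 are all made up for illustration and are not from the lab:

```python
import numpy as np

def count_fp_loop(p_val, y_val, epsilon):
    """Count false positives by visiting each example one at a time."""
    count = 0
    for p, y in zip(p_val, y_val):
        if p < epsilon and y == 0:
            count += 1
    return count

def count_fp_vectorized(p_val, y_val, epsilon):
    """Count false positives with whole-array operations, no explicit loop."""
    return int(np.sum((p_val < epsilon) & (y_val == 0)))

# Random toy data (hypothetical, only for comparing the two versions).
rng = np.random.default_rng(0)
p_val = rng.random(1000)               # values in [0, 1)
y_val = rng.integers(0, 2, size=1000)  # labels 0 or 1

loop_result = count_fp_loop(p_val, y_val, 0.1)
vec_result = count_fp_vectorized(p_val, y_val, 0.1)
print(loop_result == vec_result)
```

Both functions return the same count; the vectorized one simply lets NumPy do the element-wise comparison and the summation internally.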

I don’t have any favorite book or online references for summarizing these types of techniques, but try googling something like “array operations python tips and tricks” or “python vectorization” to see what you find. Here’s one Medium article I noticed:

Also, check out the MLS Resources category, which has a whole range of helpful info for learners in this course. You may find some useful Python recommendations there, and it’s generally a good resource to know about. You can either search for keywords at the main MLS Resources level, or go to the FAQ page and drill into whatever looks interesting or helpful from there.