Hi,
I was trying to solve the select_threshold lab question with the following code:
def select_threshold(y_val, p_val):
    """
    Finds the best threshold to use for selecting outliers
    based on the results from a validation set (p_val)
    and the ground truth (y_val)

    Args:
        y_val (ndarray): Ground truth on validation set
        p_val (ndarray): Results on validation set

    Returns:
        epsilon (float): Threshold chosen
        F1 (float): F1 score by choosing epsilon as threshold
    """
    best_epsilon = 0
    best_F1 = 0
    F1 = 0
    step_size = (max(p_val) - min(p_val)) / 1000
    for epsilon in np.arange(min(p_val), max(p_val), step_size):
        ### START CODE HERE ###
        tp = 0
        fp = 0
        fn = 0
        rec = 0
        prec = 0
        for i in range(len(p_val)):
            if(p_val[i]< epsilon):
                p_val[i] = 1
            else:
                p_val[i] = 0
            if p_val[i]== 1 & y_val[i] == 1:
                tp+=1
            elif p_val[i] == 1 & y_val[i] == 0:
                fp+=1
            elif p_val[i] == 0 & y_val[i] == 1:
                fn+= 1

        rec = tp/(tp+fn)
        prec = tp/(tp+fp)

        F1= 2*prec*rec/(prec+rec)
        ### END CODE HERE ###
        if F1 > best_F1:
            best_F1 = F1
            best_epsilon = epsilon
    return best_epsilon, best_F1
However, I am getting the following error:
ZeroDivisionError Traceback (most recent call last)
in
1 p_val = multivariate_gaussian(X_val, mu, var)
----> 2 epsilon, F1 = select_threshold(y_val, p_val)
3
4 print('Best epsilon found using cross-validation: %e' % epsilon)
5 print('Best F1 on Cross Validation Set: %f' % F1)
in select_threshold(y_val, p_val)
42 fn+= 1
43
---> 44 rec = tp/(tp+fn)
45 prec = tp/(tp+fp)
46
ZeroDivisionError: division by zero
What am I doing wrong? (I know it is meant to be solved with vectorisation; I was just trying to solve it with a for loop.)
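
For anyone hitting the same traceback, here is one possible reading of it, offered as a rough sketch and not as the lab's reference solution. In Python, `&` binds tighter than `==`, so `p_val[i] == 0 & y_val[i] == 1` is parsed as the chained comparison `p_val[i] == (0 & y_val[i]) == 1`, which can never be true, and `fn` stays at 0. On the first pass epsilon equals min(p_val), so nothing is flagged and `tp` is 0 as well, which makes `tp + fn` zero. Separately, writing 0/1 back into `p_val` destroys the probabilities after the first epsilon. The version below uses `and`, keeps the prediction in its own variable, and treats F1 as 0 when there are no true positives; the function name `select_threshold_loop` and the F1 = 0 convention are my own assumptions.

```python
import numpy as np


def select_threshold_loop(y_val, p_val):
    """Loop-based sketch: pick the epsilon with the best F1 on the validation set."""
    best_epsilon = 0
    best_F1 = 0
    step_size = (max(p_val) - min(p_val)) / 1000
    for epsilon in np.arange(min(p_val), max(p_val), step_size):
        tp = fp = fn = 0
        for i in range(len(p_val)):
            # Keep the prediction separate so p_val is never overwritten.
            pred = 1 if p_val[i] < epsilon else 0
            # `and` instead of `&`: `a == 1 & b == 1` would be parsed
            # as the chained comparison `a == (1 & b) == 1`.
            if pred == 1 and y_val[i] == 1:
                tp += 1
            elif pred == 1 and y_val[i] == 0:
                fp += 1
            elif pred == 0 and y_val[i] == 1:
                fn += 1
        if tp == 0:
            # Assumption: with no true positives, count F1 as 0
            # instead of dividing by zero.
            F1 = 0
        else:
            prec = tp / (tp + fp)
            rec = tp / (tp + fn)
            F1 = 2 * prec * rec / (prec + rec)
        if F1 > best_F1:
            best_F1 = F1
            best_epsilon = epsilon
    return best_epsilon, best_F1
```

Called as `select_threshold_loop(y_val, p_val)`, it takes the same arguments as the lab function and leaves `p_val` untouched, so the same array can be reused afterwards.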