Exercise 3 - Weighted Loss

I’ve been stuck on exercise 3 for a while. How do I calculate the loss from y_true (tensor) and y_pred (tensor)? I think I’m close, but I would appreciate an explanation.

Also, for loss +=, the average weighted losses for both classes need to be added together, right?

Hello @niheon :grinning:!
Here there is a function inside a function. Your job is only to fill in the inner function, where both y_true and y_pred are given as arguments, so use them to calculate the weighted loss. While iterating, use pos_weights, y_true, log(y_pred), and epsilon inside the mean function. It will work.

And to calculate the loss for each class, use the loop index i instead of a hard-coded 0 when indexing all the arguments inside the mean function.
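To make the per-class indexing concrete, here is a small NumPy illustration (the array values are made up for the example):

```python
import numpy as np

# Hypothetical 2-D labels: rows are examples, columns are classes.
y_true = np.array([[1, 0],
                   [1, 1],
                   [0, 1]])

# Column i holds the labels for class i, so inside the loop you
# index with the loop variable i rather than a fixed 0:
for i in range(y_true.shape[1]):
    print(y_true[:, i].mean())  # per-class positive rate (2/3 for each class here)
```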

Here is a more detailed explanation of the nested-function concept.
If you look at the next cell, you can see that we first call the outer function (get_weighted_loss) and store the returned inner function in a variable. We then call that inner function with y_true and y_pred. This concept is called a closure in Python.
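The same closure pattern in miniature, with hypothetical names standing in for get_weighted_loss and weighted_loss:

```python
def make_scaler(factor):
    """Outer function: `factor` is captured by the inner function."""
    def scale(x):
        # Inner function: still sees `factor` after make_scaler returns.
        return factor * x
    return scale

double = make_scaler(2)  # like calling get_weighted_loss(pos_weights, neg_weights)
print(double(5))         # like calling weighted_loss(y_true, y_pred); prints 10
```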


Hi @bharathikannan, I tried several implementations (see below) and finally got equal losses whose difference was zero. However, I still didn’t get the expected output of L(y_pred_1) = -0.4956, L(y_pred_2) = -0.4956. What about the neg_weights: are they not included in the loss calculation? And do I need to convert y_true and y_pred into tensors? See the implementations below.

First Implementation
loss += -1 * K.mean(pos_weights[i] * y_true * K.log(y_pred + epsilon))
L(y_pred_1)= -0.3538, L(y_pred_2)= -0.1749
Difference is L1 - L2 = -0.1788

Second Implementation
loss += -1 * K.mean(pos_weights[i] * y_true * K.log(y_pred + epsilon) + (neg_weights[i] * (1-y_true) * K.log((1-y_pred) + epsilon)))
L(y_pred_1)= -0.5287, L(y_pred_2)= -0.5287
Difference is L1 - L2 = 0.0000

Third Implementation
y_true = labels_matrix
y_pred = np.ones(y_true.shape)
loss += -1 * K.mean(pos_weights[i] * y_true * K.log(y_pred + epsilon) + (neg_weights[i] * (1-y_true) * K.log((1-y_pred) + epsilon)))
L(y_pred_1)= -0.4621, L(y_pred_2)= -0.4621
Difference is L1 - L2 = 0.0000

Additional check (Expected Output)
If you implemented the function correctly, then if the epsilon for the get_weighted_loss is set to 1, the weighted losses will be as follows:

L(y_pred_1)= -0.4956, L(y_pred_2)= -0.4956
If you are missing something in your implementation, you will see a different set of losses for L1 and L2 (even though L1 and L2 will be the same).

Hi @niheon :grinning:!

Your first implementation is wrong because it omits the neg_weights term entirely.

Your third implementation is also wrong, because you should not reassign y_true to labels_matrix.

However, your second implementation can be slightly modified to get the correct answer.

Here are the steps, broken down so you can follow them.

First, y_true and y_pred are 2-D arrays, so on each iteration of the for loop you should take one column of each array. Put the loop index i inside the slice [ : ] to select it.

Second, you are taking the mean along the column, so pass the corresponding axis argument (0 or 1) to the mean function. Check the documentation to see which one you need.

Then check the brackets carefully: separate the pos_weights term and the neg_weights term with their own brackets, and add them together.

Altogether, the structure should be

  • -K.mean((p__ ) + (n__ ),axis = )

I have given you all the hints. Hope this clears up all your doubts :grinning:!
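Putting those hints together, here is a NumPy sketch of the overall structure (an illustration with made-up data and unit weights, not the graded Keras code, which would use K.mean instead of np.mean):

```python
import numpy as np

def get_weighted_loss(pos_weights, neg_weights, epsilon=1e-7):
    def weighted_loss(y_true, y_pred):
        loss = 0.0
        for i in range(len(pos_weights)):
            # Take column i of the 2-D arrays, bracket the positive and
            # negative terms separately, and average along the example axis.
            loss += -np.mean(
                (pos_weights[i] * y_true[:, i] * np.log(y_pred[:, i] + epsilon))
                + (neg_weights[i] * (1 - y_true[:, i]) * np.log(1 - y_pred[:, i] + epsilon)),
                axis=0,
            )
        return loss
    return weighted_loss

# Tiny sanity check: with unit weights and y_pred = 0.5 everywhere,
# each class contributes roughly ln(2) to the loss.
y_true = np.array([[1.0, 0.0], [0.0, 1.0]])
y_pred = np.full_like(y_true, 0.5)
loss_fn = get_weighted_loss(np.ones(2), np.ones(2))
print(loss_fn(y_true, y_pred))  # ~ 2 * ln(2) = 1.386...
```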


Can you provide the equation of the loss function so I can better understand your query? The loss function requires both y_true and y_pred. Generally you do not alter the shape of y_true or y_pred.

Hi @sbansal793 I figured it out. Thanks.
