# Weighted Loss in Multi-class Classification

Hello everyone,

I have been watching the “Multi-class loss” video in the first week. In that video, a loss function for multi-class classification is demonstrated; it also addresses the class imbalance problem.

But I didn’t really understand what the `Wp,mass` and `Wn,mass` terms in this equation are.
The lecturer mentions: “Positive labels associated with that particular class” and “The negative labels associated with that particular task”.

How are these two terms calculated?

Thank you so much for taking the time to read this question,

W_{p(n), mass} is the coefficient that weights the loss terms depending on the class and on the label (positive or negative).

The model is trained in the direction of smaller loss.
However, when the number of examples differs between classes, we run into the problem this thread is about: each class contributes a different amount to the loss, so the majority class dominates, and training on the minority-class data often fails.

As one solution, we apply a weight coefficient (W_{p(n), mass}) for each class in the loss function, so that the contribution to the loss from each label becomes equal.

In the lecture, the weighting factor values are set based on the proportion of data for each class and label (positive or negative).

Best regards,
Nakamura

Hi @nakamura,

Thank you for your clear explanation.

Just one more thing: does the W_{neg, mass} term mean that we should count the number of mass examples that the model detects as positive? In other words:

W_{neg, mass} = (Number of mass examples that are detected as positive) / (Total number of mass examples)

Is this equation right?

Regards,

My explanation above may have been a bit confusing.
The weight coefficients are calculated from the actual labels in the supervised (training) data, not from the model’s predictions.
In the lecture, for each class, they are computed from the proportions of positive and negative labels.

For the mass class, the weight coefficients are as follows:
W_{pos, mass} = (number of examples with a negative mass label) / (total number of examples labeled for mass)
W_{neg, mass} = (number of examples with a positive mass label) / (total number of examples labeled for mass)
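As a quick illustration of these two formulas, here is a small NumPy sketch. The label counts below are made up for illustration; they are not from the lecture.

```python
import numpy as np

# Hypothetical labels for the "mass" class: 1 = positive, 0 = negative.
y_mass = np.array([1, 0, 0, 0, 1, 0, 0, 0, 0, 0])

n_total = len(y_mass)            # 10 examples
n_pos = int(y_mass.sum())        # 2 positive
n_neg = n_total - n_pos          # 8 negative

# Weights per the formulas above: each label's weight is the
# *opposite* label's frequency.
w_pos_mass = n_neg / n_total     # 8/10 = 0.8
w_neg_mass = n_pos / n_total     # 2/10 = 0.2

# With these weights, the total weighted contribution of positive and
# negative examples is equal, which is the point of the scheme:
assert np.isclose(w_pos_mass * n_pos, w_neg_mass * n_neg)
```

Note how the rarer positive label receives the larger weight, so its few examples contribute as much to the loss as the many negative examples.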

Best regards,
Nakamura


The Wp,mass and Wn,mass terms in the loss function are used to address the class imbalance problem in multi-class classification.

Wp,mass is the weight assigned to positive labels (y = 1) associated with a particular class. It represents the importance of correctly identifying positive examples of that class.

Wn,mass is the weight assigned to negative labels (y = 0) associated with a particular class. It represents the importance of correctly identifying negative examples of that class.

These weights can be calculated in different ways, depending on the specific application and the desired behavior of the model. Here are a few examples:

• One common method is to use the inverse of the class frequencies as the weights. This means that classes with fewer examples will have higher weights, and classes with more examples will have lower weights.
• Another method is to use a fixed weight for all classes, or to use a weight that is proportional to a measure of the class imbalance.
• Another way is cost-sensitive learning, used in real-world scenarios where misclassifying a sample of one class is more costly than misclassifying a sample of another. The weight is assigned based on the cost of misclassifying the sample.

In all of the above examples, the weights are hyperparameters that can be tuned based on the model’s performance on the validation set.
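To make the first option concrete, here is a minimal sketch of inverse-frequency weighting. The class counts and the sum-to-number-of-classes normalization are assumptions for illustration, not part of the lecture.

```python
import numpy as np

# Hypothetical per-class example counts; numbers are made up.
class_counts = np.array([700, 250, 50], dtype=np.float64)

# Inverse-frequency weights: rarer classes get larger weights.
freq = class_counts / class_counts.sum()
weights = 1.0 / freq

# Normalize so the weights sum to the number of classes
# (one common convention; the scale is a tunable choice).
weights = weights * len(class_counts) / weights.sum()

print(weights)  # the rarest class (50 examples) gets the largest weight
```

The absolute scale of the weights is arbitrary; only their ratios matter to the gradient direction, which is why the normalization is a free design choice.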

Regards


```python
for i in range(len(pos_weights)):
    # for each class, add average weighted loss for that class
    y_true = np.array(y_true).astype(np.float32)
    pos_loss = -1 * np.sum(pos_weights[i] * y_true[:, i] * np.log(y_pred[:, i]))
    neg_loss = -1 * np.sum(neg_weights[i] * (1 - y_true[:, i]) * np.log(1 - y_pred[:, i]))
    tot_loss = pos_loss + neg_loss
    loss += tot_loss
return np.float32(loss)
```

Dear @anisha_balani,

Thanks for sharing your problem with us. Since you have given context on which function you are having a problem with, I assume it is the weighted_loss(y_true, y_pred) function.

As I see it, you did not add the epsilon scalar inside your logarithm. You have to add a small epsilon to the input of the logarithm so that np.log(0) does not produce -inf. Also, check whether you should use np.sum or np.mean.
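For reference, here is a hedged sketch of where the epsilon typically goes. The function and variable names mirror the snippet above but are assumptions, not the assignment's required code, and whether to use np.mean or np.sum should be checked against the assignment's specification.

```python
import numpy as np

def weighted_loss_example(y_true, y_pred, pos_weights, neg_weights, epsilon=1e-7):
    """Sketch of weighted binary cross-entropy summed over classes.

    epsilon keeps np.log away from log(0). This is an illustration,
    not the official solution; np.mean vs np.sum is a choice to verify.
    """
    y_true = np.array(y_true).astype(np.float32)
    y_pred = np.array(y_pred).astype(np.float32)
    loss = 0.0
    for i in range(len(pos_weights)):
        pos_loss = -np.mean(pos_weights[i] * y_true[:, i]
                            * np.log(y_pred[:, i] + epsilon))
        neg_loss = -np.mean(neg_weights[i] * (1 - y_true[:, i])
                            * np.log(1 - y_pred[:, i] + epsilon))
        loss += pos_loss + neg_loss
    return np.float32(loss)
```

With the epsilon in place, a predicted probability of exactly 0 or 1 no longer yields -inf, so the returned loss stays finite.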

Let me know if your problem has been resolved.

Sincerely,

I changed my code to

I am getting the expected output but am still not able to pass the tests.

Earlier, when I tried to run the tests, I was getting an error that np.float32 does not have an eval attribute, so I removed the eval call from public_tests.py.

Dear @anisha_balani,

It would be helpful if you could share your code by messaging me privately.

Sincerely,