Update Weights for Multiclass Logistic Regression

I understand how to update the weights for a binary logistic regression problem, but I'm wondering how this can be implemented, or what the correct way of updating the weights is, when the number of classes K > 2.

Hi @Jose_Delgado ,

The weight updates for the hidden layers are the same for binary classification and multi-class classification: they happen during backpropagation. The math is also the same.

Maybe the part that you want to understand is how to convert a 2-class classification into a multi-class classification?

If this is the question, then the answer is: it happens in the last layer. In a binary classification you end up with a layer of 1 neuron whose output is a number between 0 and 1, and, depending on a threshold that you define, this becomes 0 or 1. For instance, if you define the threshold as 0.5, then:

1: if yhat > 0.5
0: if yhat <= 0.5

So this is for a binary classification.
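To make the thresholding concrete, here is a minimal NumPy sketch. The raw score `z` and the 0.5 threshold are illustrative; the sigmoid and comparison are the same as described above.

```python
import numpy as np

def sigmoid(z):
    """Squash a raw score into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical raw output of the single-unit last layer
z = 0.8
yhat = sigmoid(z)  # roughly 0.69

# Apply the 0.5 threshold described above
predicted_class = 1 if yhat > 0.5 else 0
print(predicted_class)  # prints 1
```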

For a multi-class classification, your last layer will have k (or n) units. If your number of classes is, say, k = 10 (this can also be notated n = 10), then you would create a last layer with 10 units and set an activation called "softmax" on that layer. When you do this, your model will solve a classification task with 10 classes.
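As a sketch of what softmax does at that last layer (the 10 raw scores below are made up for illustration):

```python
import numpy as np

def softmax(z):
    """Convert k raw scores into k probabilities that sum to 1."""
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

# Hypothetical raw outputs of a 10-unit last layer (k = 10)
z = np.array([1.2, 0.3, -0.5, 2.1, 0.0, -1.3, 0.7, 0.4, -0.2, 1.0])
probs = softmax(z)

# The predicted class is the unit with the highest probability
predicted_class = int(np.argmax(probs))  # here, class 3 (score 2.1)
```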

But regardless of whether you finish your network with a layer of 1 unit or 10 units, the backpropagation is basically the same, and it is in this process that the weights of every hidden layer are updated.
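One reason backpropagation looks the same in both cases: with sigmoid plus binary cross-entropy, and with softmax plus categorical cross-entropy, the gradient at the output layer takes the same simple form, dZ = yhat - y. This is a well-known result, stated here as a reminder rather than derived; the numbers below are illustrative.

```python
import numpy as np

# One-hot label for a 3-class example, and a made-up softmax output
y_true = np.array([0.0, 0.0, 1.0])
y_hat = np.array([0.2, 0.3, 0.5])

# Gradient of the loss w.r.t. the output layer's pre-activation:
# same "prediction minus label" form as in the binary sigmoid case
dZ = y_hat - y_true
```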

Please let me know if I interpreted your question properly. If not, please feel free to share your reaction.




Hey Juan! Thank you for your time in helping me out.

So is a multi-class classification a neural net in nature? A binary classification can be a standalone process without having to build layers and such. But the same does not apply to multi-class, since it's several binary classifications combined, right?

Actually, you can also do binary classification with a deep neural network if you need it. In fact, you'll soon learn about a special type of neural network that can be quite deep sometimes: the CNN, which classifies images. And again, it can be binary or multi-class; what determines that is the final layer.

And yes, you can also have a logistic regression for binary classification. Very simple yet very powerful for some cases.


I want to add an additional angle on top of Juan’s answer.

If we zoom out to the whole family of modeling techniques, we can use a decision tree for multi-class classification, which is proof that it is not a neural net in nature.

If we stick with the usual logistic regression formalism, we can also add one additional concept called "One-vs-Rest" (a keyword you can google) to build 5 logistic regression models for a 5-class classification task. The idea is simply that the first model predicts the probability that a sample belongs to class 1, the second model does the same for class 2, and so on. Every time we want to predict for a sample, it goes through all 5 models, and we pick the class with the highest probability.
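The prediction step of One-vs-Rest can be sketched like this. The weights and biases below stand in for 5 already-fitted binary logistic models (in practice each would be trained on "class k vs. everything else"); the random values are just placeholders.

```python
import numpy as np

# Placeholder parameters for 5 fitted binary logistic models,
# one per class, each over 3 input features
rng = np.random.default_rng(0)
weights = rng.normal(size=(5, 3))  # one weight row per model
biases = rng.normal(size=5)        # one bias per model

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_ovr(x):
    """Run a sample through all 5 models, pick the most confident class."""
    probs = sigmoid(weights @ x + biases)  # one probability per model
    return int(np.argmax(probs))

sample = np.array([0.5, -1.0, 2.0])
predicted = predict_ovr(sample)  # an integer from 0 to 4
```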

However, if I just focus on what you have said:

In a binary classification task, I think you have agreed that there is no need for a hidden layer; just an input layer and an output layer, right?

For a multi-class task, we can actually also have only an input layer and an output layer. One architectural difference is that the binary task needs a sigmoid activation at the output layer, whereas the multi-class task needs a softmax. And if we think carefully about this one-layer multi-class neural network, it is actually quite similar to the One-vs-Rest approach (except for the softmax part).

Another difference is that the binary task uses one neuron in the output layer, whereas the N-class task uses N neurons in the output layer.
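Putting those two points together, a "no hidden layer" N-class network is just one weight matrix feeding a softmax. A minimal sketch, with illustrative shapes (4 features, 3 classes) and placeholder weights:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Input layer feeds straight into an N-unit softmax output layer
n_features, n_classes = 4, 3
rng = np.random.default_rng(1)
W = rng.normal(size=(n_classes, n_features))  # one weight row per class
b = np.zeros(n_classes)

x = np.array([1.0, 0.5, -0.5, 2.0])
probs = softmax(W @ x + b)        # N probabilities, one per class
predicted = int(np.argmax(probs))  # the N-neuron analogue of thresholding
```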