Why "logit" stands for the output of linear activation function

Yulin_Li · September 5, 2022, 3:20am

When exploring the optional lab “Softmax function”, I found this statement: “In the preferred organization the final layer has a linear activation, and for historical reasons, the outputs in this form are referred to as logits.”

Then I googled the word “logit” to understand it further, and the explanation I found is that the logit model is equivalent to the logistic model.

So I’m a little confused. Why does “logit” stand for the output of a linear activation function rather than a sigmoid activation function?

TMosh · September 5, 2022, 4:18am

The “from_logits” option takes a linear output and applies the logistic (sigmoid) or softmax function automatically inside the loss.

rmwkwok · September 5, 2022, 5:37am

If you would like to read some maths, you might check out this discussion.

Yulin_Li · September 5, 2022, 12:02pm

Thanks, @TMosh and @rmwkwok.

The point about the logistic/softmax being applied automatically, and the math about how to deal with excessively large exponentials, helped me a lot.
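As a minimal sketch of that overflow handling (plain Python, not TensorFlow's actual code; the function name `softmax_cross_entropy_from_logits` is made up for illustration), the loss can be computed directly from the linear outputs by subtracting the largest one before exponentiating:

```python
import math

def softmax_cross_entropy_from_logits(logits, target_index):
    """Cross-entropy loss computed directly from the linear outputs z.

    Uses log(sum(exp(z))) = m + log(sum(exp(z - m))) with m = max(z),
    so no exponent is ever larger than 0 and exp() cannot overflow.
    """
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log(softmax(z)[target]) = log_sum_exp - z[target]
    return log_sum_exp - logits[target_index]

# A naive exp(1000.0) would raise OverflowError; the shifted form is fine.
print(softmax_cross_entropy_from_logits([1000.0, 0.0, -1000.0], 0))
print(softmax_cross_entropy_from_logits([1.0, 2.0, 3.0], 2))
```

The softmax probabilities are never formed explicitly, which is why the loss needs the raw linear outputs as its input.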

But I’m still a little confused about the origin of the name “logit”. I just don’t understand the naming logic. Why don’t we use “from_linear” rather than “from_logits” to indicate that the input to the loss function is just a linear activation? Or is there something I haven’t understood clearly?

rmwkwok · September 5, 2022, 12:27pm

Hello @Yulin_Li,

I think the logit can be defined as the inverse of the sigmoid:

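$$
\begin{aligned}
\operatorname{logit}(p) &= \log\left(\frac{p}{1-p}\right) \\
\implies \exp(\operatorname{logit}(p)) &= \frac{p}{1-p} \\
\implies p &= \frac{1}{1+\exp(-\operatorname{logit}(p))} \\
\implies p &= \frac{1}{1+\exp(-z)} \\
\implies p &= \operatorname{sigmoid}(z)
\end{aligned}
$$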

where z = logit(p) = linear(a) and linear refers to the linear activation. Therefore, z is called the logit.
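A quick numerical check of this inverse relationship, as a minimal sketch in plain Python (the helper names `sigmoid` and `logit` are just for illustration):

```python
import math

def sigmoid(z: float) -> float:
    """Map a real-valued z to a probability p in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p: float) -> float:
    """Log-odds of p; defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Applying logit to sigmoid(z) should recover z, up to floating-point error.
for z in (-3.0, -0.5, 0.0, 2.0):
    p = sigmoid(z)
    print(f"z = {z:+.1f}  sigmoid(z) = {p:.4f}  logit(sigmoid(z)) = {logit(p):+.4f}")
```

So the value that goes into the sigmoid is the logit of the probability that comes out of it.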

Raymond

Yulin_Li · September 5, 2022, 3:11pm

Hello @rmwkwok ,

Many thanks to you!

These formulas resolve my doubts very well, and I’ll mark this as the solution.

rmwkwok · September 5, 2022, 4:31pm

You are welcome Yulin!

farhana_hossain · December 17, 2023, 3:14pm

So, from_logits=True will extract the value of p, which is sigmoid(z), right?

rmwkwok · December 18, 2023, 12:42am

Setting from_logits=True for a TensorFlow loss function means that the function expects z as its input. It does not extract anything. Please google how people use it, and practice it yourself.

p is sigmoid(z). @farhana_hossain, please also be explicit about the subject of your statement next time.
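To illustrate the difference, here is a minimal sketch in plain Python rather than TensorFlow itself. The function names are made up; the stable formula is the one given in the documentation of `tf.nn.sigmoid_cross_entropy_with_logits`, and whether a particular Keras loss takes exactly this path internally is an assumption.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_probability(y: float, p: float) -> float:
    """Binary cross-entropy when the input is already p = sigmoid(z)."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def bce_from_logit(y: float, z: float) -> float:
    """Binary cross-entropy when the input is the linear output z.

    Stable form: max(z, 0) - z * y + log(1 + exp(-|z|)).
    """
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

y, z = 1.0, 2.0
print(bce_from_probability(y, sigmoid(z)))  # loss fed with p
print(bce_from_logit(y, z))                 # loss fed with z, same value

# With an extreme z the logit form still returns a finite loss.
print(bce_from_logit(1.0, -800.0))
```

Both calls give the same loss for moderate z; the difference is only in which quantity the function expects, and in how well it behaves when z is very large or very small.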

Cheers,
Raymond

farhana_hossain · December 18, 2023, 11:26am

Yes, I understand it now.

I will make sure to do that.
