I was watching the lecture on activation functions. In the introductory lecture, an example was given of a hidden unit labeled “awareness”, and it appeared that this unit is non-binary. In the first course (Supervised Machine Learning), we used logistic regression for classification problems, and binary output is one of the traits of classification. So if the awareness unit is non-binary, can it be treated as a case of linear regression? If yes, then why can’t we use linear regression for this unit instead of logistic regression?
In other words, instead of a “linear activation function”, why can’t we use linear regression?
That’s just a label that the instructor put on that unit to indicate what it might be learning.
You can ignore that - it’s just a bit of intuition that the instructor likes to include in the lectures. Mathematically every unit uses the same process - it’s just learning some weights that help minimize the cost.
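Hidden-layer units in a neural network must have a non-linear activation function. With a linear activation, any stack of layers collapses into a single linear model, so the extra layers add nothing. Typical choices are sigmoid, tanh, or ReLU.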
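To make the point about non-linearity concrete, here is a minimal sketch in plain Python. The weights, biases, and input are made-up illustration values, not anything from the course, and the helper names (`matvec`, `matmul`, `vadd`) are hypothetical. It shows that a hidden layer with a linear activation followed by a linear output gives exactly the same result as one linear layer, while swapping in ReLU does not.

```python
# Made-up weights for a tiny network: 2 inputs -> 2 hidden units -> 1 output.

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def vadd(u, v):
    """Add two vectors element-wise."""
    return [a + b for a, b in zip(u, v)]

def relu(v):
    """Apply ReLU element-wise."""
    return [max(0.0, a) for a in v]

W1, b1 = [[1.0, -2.0], [0.5, 1.0]], [0.0, 1.0]   # hidden layer (illustrative)
W2, b2 = [[2.0, 1.0]], [-1.0]                    # output layer (illustrative)

def two_layer_linear(x):
    h = vadd(matvec(W1, x), b1)          # linear activation: g(z) = z
    return vadd(matvec(W2, h), b2)

# The same function written as ONE linear layer:
# W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
W_eq = matmul(W2, W1)
b_eq = vadd(matvec(W2, b1), b2)

def one_layer(x):
    return vadd(matvec(W_eq, x), b_eq)

def two_layer_relu(x):
    h = relu(vadd(matvec(W1, x), b1))    # non-linear activation
    return vadd(matvec(W2, h), b2)

x = [3.0, 2.0]
print(two_layer_linear(x), one_layer(x))  # identical outputs
print(two_layer_relu(x))                  # differs from the linear version
```

With the linear activation the hidden layer is redundant. With ReLU the network computes something a single linear layer cannot reproduce in general.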
At the output layer, sigmoid is used for classification because of how the cost function is defined. We’re not trying to fit the data values themselves (as is done in linear regression). Instead, the classification cost function is trying to create a boundary that separates the output classes. Mathematically, the two approaches are fundamentally different.
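As a rough illustration of that difference, the sketch below compares the logistic (cross-entropy) loss on a sigmoid output with the squared-error loss on a plain linear output. The function names and the example value z = 5 are assumptions chosen for illustration, not something taken from the lectures.

```python
import math

def sigmoid(z):
    """Squash a raw score z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_loss(z, y):
    """Cross-entropy loss for a sigmoid output; y is the label, 0 or 1."""
    a = sigmoid(z)
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))

def squared_error(z, y):
    """Squared-error loss for a linear output, as in linear regression."""
    return 0.5 * (z - y) ** 2

# A point with label 1 whose raw score is far on the correct side
# of the decision boundary (z = 0 corresponds to sigmoid(z) = 0.5).
z, y = 5.0, 1

print(logistic_loss(z, y))   # small: the point is correctly classified
print(squared_error(z, y))   # large: the score is far from the value 1
```

The squared-error loss penalizes the point for not matching the target value, even though it is classified correctly. The logistic loss only cares which side of the boundary the point is on and by how much.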