Hello, I don't understand the topic here.
When I use a linear activation function in the hidden layers and the sigmoid function in the output layer, the lecture says that the model cannot do logistic regression. Why?
Hi!
Are you referring to the lecture video "Why do we need activation functions?"
Yes! I am referring to this video. Thanks!
Hello @ugqzg, welcome to our community!
I think Andrew didn't say the model cannot do logistic regression. Instead, I believe the idea was that a big neural network with a linear activation in all hidden layers and a sigmoid activation in the output layer is no different from just a logistic regression model, i.e. a neural network with only one sigmoid output layer and no hidden layers.
We need a non-linear activation between two hidden layers to let them be meaningfully separated as two layers; two hidden layers with a linear activation in between are effectively just one hidden layer. The maths behind this is illustrated in the video at roughly the 3:00 timestamp, and in the sketch below.
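Here is a minimal NumPy sketch of that collapse (the layer shapes are arbitrary, chosen just for illustration): two linear layers applied back to back compute exactly the same function as a single linear layer with W = W2 W1 and b = W2 b1 + b2.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)                                # an input vector
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=3)  # first linear "hidden layer"
W2, b2 = rng.normal(size=(2, 3)), rng.normal(size=2)  # second linear "hidden layer"

# Two linear layers applied in sequence...
two_layers = W2 @ (W1 @ x + b1) + b2

# ...compute the same thing as one linear layer with W = W2 W1, b = W2 b1 + b2:
one_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)

print(np.allclose(two_layers, one_layer))  # True
```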
Raymond
Hello, thanks for your answer!
Maybe the subtitles are wrong? In this video, at roughly the 4:20-4:30 timestamp, it is said: "Or alternatively, if we were to still use a linear activation function for all the hidden layers, for these three hidden layers here, but we were to use a logistic activation function for the output layer, then it turns out you can show that this model becomes equivalent to logistic regression, and a4, in this case, can be expressed as 1 over 1 plus e to the negative wx plus b for some values of w and b. So this big neural network doesn't do anything that you can't also do with logistic regression."
That is what I don't understand. Thanks!
Hello @ugqzg,
The subtitle is correct. I am quoting from it:
it turns out you can show that this model becomes equivalent to logistic regression
this big neural network doesn't do anything that you can't also do with logistic regression.
These two lines say the same thing: without non-linear activations in the hidden layers, a big neural network is nothing more than just a simple logistic regression. Note the double negative in the second quote: "… doesn't do anything that you can't also do …"
Raymond
Does that mean I can also do that? Thanks!
It means a neural network with a linear activation in all hidden layers and a sigmoid activation in the output layer is no different from just a logistic regression model.
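To make that concrete, here is a minimal Keras sketch (assuming the TensorFlow setup used in the course; the hidden-layer sizes 25 and 15 are arbitrary). Both models below can represent exactly the same set of functions, namely sigmoid(wx + b):

```python
import tensorflow as tf

# A "big" network whose hidden layers are all linear...
deep_linear = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation="linear"),
    tf.keras.layers.Dense(15, activation="linear"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# ...can represent no more than a plain logistic regression model:
logistic = tf.keras.Sequential([
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
```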
OK, Thank you very much!
Hi Raymond,
What you were saying here is that it defeats the purpose of using a neural network: rather than using a neural network with multiple hidden layers (using linear activations) and one output layer, we might as well have just used a logistic regression model, correct?
Thank you
Christina
Exactly!
Raymond
Hi Raymond,
Further to my previous question: in the following video, "Improved implementation of softmax", shown in the screenshot below, even though the last layer now uses a linear activation (rather than softmax), the model is still a softmax model (a multi-class classification model) rather than becoming a linear regression model, because the hidden layers use ReLU. Am I understanding this correctly?
Or is the reason it is not linear regression that the prediction (the last line of the code) still uses softmax(logits)?
Thank you
Christina
Hello @Christina_Fan
The difference that ReLU makes is turning the model into a non-linear one. It has no bearing on whether the model is a multi-class classification model or not.
It is a multi-class classification model when softmax is used: either by specifying 'softmax' as the activation of the output layer, or by not specifying it in the output layer but enabling it in the loss function (from_logits=True).
So, some modifications are needed to your statement above: it is the ReLU in the hidden layers that makes the model non-linear, and it is the softmax, enabled in the loss function and applied again at prediction time, that makes it a multi-class classification model. A sketch of that pattern follows:
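This is a minimal sketch of the pattern from that video, assuming the TensorFlow/Keras setup the course uses (the layer sizes are illustrative): the output layer is linear, softmax is folded into the loss via from_logits=True, and probabilities are recovered explicitly at prediction time.

```python
import tensorflow as tf

# Output layer is linear: the model produces raw logits, not probabilities.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation="relu"),
    tf.keras.layers.Dense(15, activation="relu"),
    tf.keras.layers.Dense(10, activation="linear"),  # logits
])

# Softmax is handled inside the loss for better numerical stability.
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# After model.fit(X, y, ...), apply softmax explicitly to get probabilities:
# logits = model(X_new)
# probs = tf.nn.softmax(logits)
```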
Cheers,
Raymond
Brilliant, thank you for the clarification Raymond, it makes much more sense now.
You are welcome
Cheers.