Would it be correct that the following implementation of logistic regression is better than just specifying “sigmoid” as the activation function for the output layer as Dr. Ng said?

If so, are there any situations where using the Sigmoid activation function would be advantageous?

It depends on what you mean by “better”.

Andrew recommends using a linear output with `from_logits = True` when you have multiple labels, because it does a couple of things (a minimal sketch follows this list):

- It automatically applies softmax inside the loss computation.
- It gains some numerical stability, because the softmax and the cross entropy are computed together directly from the logits, rather than from intermediate probability values that have already lost precision.
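Here is what that looks like in TensorFlow. The layer sizes and the `adam` optimizer here are just illustrative assumptions, not taken from the original post:

```python
import tensorflow as tf

# Multiclass model with a *linear* output layer: it emits raw logits
model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation='relu'),
    tf.keras.layers.Dense(15, activation='relu'),
    tf.keras.layers.Dense(10),  # no activation: linear output
])

# from_logits=True tells the loss to apply softmax internally,
# computing the cross entropy directly from the logits
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# At prediction time the outputs are logits, so apply softmax explicitly:
# probs = tf.nn.softmax(model(X_batch))
```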

Yes. For one, when you have only two labels, i.e. a true/false result.

One other point to make here: just to be accurate, the network you have implemented is *not* Logistic Regression. It is a Fully Connected network with 3 layers which does binary classification. Logistic Regression is essentially a trivial Neural Network with only the “output” layer and does binary classification.
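To make that distinction concrete, here is a sketch (the input dimension and hidden layer sizes are made up for illustration). Logistic Regression is a single sigmoid unit, while a Fully Connected network puts hidden layers in front of it:

```python
import tensorflow as tf

n_features = 4  # hypothetical input dimension, just for illustration

# Logistic Regression: only the "output" layer, a single sigmoid unit
logreg = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# A 3-layer Fully Connected binary classifier adds hidden layers in front
fc_net = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
```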

I would also state the case differently: you’re using “sigmoid” at the output layer either way. It’s just a question of whether you explicitly include the “sigmoid” activation or whether you let it be handled internally within the cross entropy loss function (the `from_logits = True` mode). Of course if you don’t explicitly add the “sigmoid” in the output layer, then you also have to add it explicitly in your “predict” logic.
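For the binary case, a minimal sketch of that second option (the layer sizes, optimizer, and the `predict` helper are my own illustrative assumptions):

```python
import tensorflow as tf

# Binary classifier with a linear output layer: it emits one raw logit
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1),  # no activation: sigmoid is handled in the loss
])
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
)

# Because the model outputs logits, the sigmoid has to appear in the
# prediction logic instead (hypothetical helper):
def predict(model, X, threshold=0.5):
    probs = tf.math.sigmoid(model(X))
    return tf.cast(probs >= threshold, tf.int32)
```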

So maybe you could argue that the one case in which explicitly adding “sigmoid” in the output layer is better is that it makes your predict logic simpler, if that goal is more important to you than the improved numerical accuracy gained by the other method. You could also try it both ways in a given case to see if the predictions are actually affected by any accuracy differences. It’s possible that in any given case it ends up not mattering that much to the results of training.

Here’s a thread which discusses why the `from_logits = True` method is preferred. Here’s a thread from Raymond that goes into some depth in showing why the latter method is more accurate.
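As a rough illustration of the accuracy point (the specific logit value is a made-up example): with an extreme logit, the explicit sigmoid saturates to exactly 1.0 in float32, so the loss computed from that probability is degraded, while the `from_logits = True` path recovers the correct value.

```python
import tensorflow as tf

z = tf.constant([[20.0]])  # made-up extreme logit
y = tf.constant([[0.0]])   # true label 0: correct loss is -log(1 - sigmoid(20)), about 20

# Path 1: explicit sigmoid, then cross entropy on the probability.
# sigmoid(20) rounds to exactly 1.0 in float32, so information is lost.
p = tf.math.sigmoid(z)
loss_from_probs = tf.keras.losses.BinaryCrossentropy(from_logits=False)
print(float(loss_from_probs(y, p)))  # degraded value: the log(0) is clipped internally

# Path 2: pass the raw logit; the loss uses a stable formulation internally.
loss_from_logits = tf.keras.losses.BinaryCrossentropy(from_logits=True)
print(float(loss_from_logits(y, z)))  # ~20.0, the correct value
```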

Thanks Paul. Is it correct to say that if performing binary classification, *sigmoid* needs to be specified either in the output layer or in the predict line of code? And that the latter, with a linear output layer, is more numerically stable?

In the general simple case, yes to both.

More stable than what?

The stability argument only applies when you’re using TensorFlow with an NN, in the situation where you might otherwise think about using softmax() in the output layer.