Hi, Natalie.
That’s great news that you found the solution. Note that the way you wrote that prediction code would actually be correct if you wrote the network the way Prof Ng usually does, which is to omit the sigmoid (or softmax, as appropriate) at the output layer and then use the from_logits = True mode of the cross entropy loss function (all versions of cross entropy support that). That’s because in that case the prediction outputs actually are “logits”, meaning the pre-sigmoid values in the range (-\infty, \infty), where > 0 means “true”.
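Just to make that equivalence concrete, here is a minimal numpy sketch (not from the course code) showing why thresholding the logits at 0 gives the same predictions as applying sigmoid and thresholding at 0.5:

```python
import numpy as np

def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical raw outputs of a final layer with no activation ("logits")
logits = np.array([-2.0, -0.1, 0.0, 0.3, 4.2])

# Path 1: apply sigmoid to get probabilities, then threshold at 0.5
preds_from_probs = sigmoid(logits) > 0.5

# Path 2: threshold the logits directly at 0.
# Sigmoid is monotonic and sigmoid(0) == 0.5, so the two agree.
preds_from_logits = logits > 0

print(np.array_equal(preds_from_probs, preds_from_logits))  # True
```

So for a yes/no prediction you never strictly need the sigmoid; you only need it when you want the actual probability values.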
The reason it is common to do it that way is explained on this thread. But it does then make it a bit more of a hassle to use the prediction values, because you need to manually apply sigmoid (or softmax). Your “lambda” function implementation would be a nice way to solve that problem, if you were in “logits” mode.
Best regards,
Paul