Real world scenario using sigmoid as an activation function

Hello Kosmetsas,

If you want your output to be between 0 and 1, that alone is a sufficient reason to use sigmoid at the last layer (this echoes the statement "The sigmoid is best for on/off or binary situation"). If you want to transform your input to be between 0 and 1, it is enough to add a sigmoid right after your input layer; however, it's better to do it as a feature-engineering step, because then you only need to apply the sigmoid transformation once.
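To make the two options concrete, here is a minimal sketch in Keras. The data, layer sizes, and training settings are all made up for illustration; the point is only where the sigmoid sits: on the last layer (to squash the output) versus applied once to the inputs as preprocessing.

```python
import numpy as np
import tensorflow as tf

# Assumption: a toy setup with 8 input features and targets already in [0, 1].
X = np.random.rand(100, 8).astype("float32")
y = np.random.rand(100, 1).astype("float32")

# Option 1: sigmoid only at the last layer, so the *output* lands in (0, 1).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # squashes the output to (0, 1)
])

# Option 2: squash the *inputs* to (0, 1) as a feature-engineering step,
# done once before training instead of inside the network.
X_squashed = 1.0 / (1.0 + np.exp(-X))  # element-wise sigmoid

model.compile(optimizer="adam", loss="mse")
model.fit(X_squashed, y, epochs=2, verbose=0)
```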

If your features are already binary, that by itself, in my opinion, isn't a sufficient reason to use sigmoid in any of the layers. Otherwise, we could just min-max normalize every continuous feature into the range 0 to 1, stick with sigmoid forever, and perhaps never have needed to invent ReLU or other activations.
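For reference, min-max normalization is just a per-feature rescaling into [0, 1]; a quick NumPy sketch with made-up numbers:

```python
import numpy as np

# Assumption: X is a toy matrix of continuous features, one column per feature.
X = np.array([[10.0, 200.0],
              [20.0, 400.0],
              [30.0, 800.0]])

# Min-max normalization: rescale each feature (column) into [0, 1].
X_min = X.min(axis=0)
X_max = X.max(axis=0)
X_scaled = (X - X_min) / (X_max - X_min)

print(X_scaled)
# [[0.         0.        ]
#  [0.5        0.33333333]
#  [1.         1.        ]]
```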

Bringing in non-linearity is an important reason to use ReLU, sigmoid, or any activation function other than the linear one.
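A small NumPy sketch of why this matters: without a non-linear activation, stacking layers buys you nothing, because two linear layers compose into a single linear layer. The weight shapes here are arbitrary, just for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "linear" layers with no activation in between: W2 @ (W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=(4, 1))
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=(2, 1))

x = rng.normal(size=(3, 1))
two_layer_out = W2 @ (W1 @ x + b1) + b2

# ...is exactly one linear layer with W = W2 @ W1 and b = W2 @ b1 + b2,
# so depth alone adds no expressive power until a non-linearity is inserted.
W, b = W2 @ W1, W2 @ b1 + b2
one_layer_out = W @ x + b

print(np.allclose(two_layer_out, one_layer_out))  # True
```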

Bringing in ReLU has its own significance, and I compared ReLU and sigmoid in this thread, including a reference to Professor Ng's DLS video on activation functions. Let me know if any of the points there need more clarification.

Raymond