Based on the line that states that the sigmoid is best for on/off binary situations, I want to ask the following:
Let’s consider a scenario where all of our input features are binary (e.g. they are all 0 -> negative, 1 -> positive signals from different sampling techniques).
Should we then use the sigmoid for all layers (or alternatively just for the first and last layer)?
The output of the network should also be binary in my example (e.g. purchase or not purchase).
I can elaborate more if this is not clear.
Hello Kosmetsas,
If you want your output to be between 0 and 1, that alone is a sufficient reason to use sigmoid at the last layer (this echoes the statement “The sigmoid is best for on/off or binary situations”). If you want to transform your inputs to be between 0 and 1, it is sufficient to add a sigmoid right after your input layer; however, it’s better to do this as a feature engineering step, because then you only need to apply the sigmoid transformation once.
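For concreteness, here is a minimal sketch of that setup (assuming tf.keras and made-up layer sizes, since your actual feature count and architecture aren’t specified): the binary inputs go in as-is, the hidden layers use ReLU, and only the last layer uses sigmoid so the output lands in (0, 1) for the purchase / not-purchase decision.

```python
import tensorflow as tf

# Hypothetical: 10 binary input features, sizes chosen only for illustration.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),                      # binary features fed in directly
    tf.keras.layers.Dense(16, activation="relu"),     # hidden layers keep ReLU
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # sigmoid only at the output
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",             # matches the (0, 1) output
              metrics=["accuracy"])
```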
If your features are already binary, that by itself, in my opinion, isn’t a sufficient reason to use sigmoid in any of the layers. Otherwise, we would just min-max normalize any continuous features to the range between 0 and 1, stick with sigmoid forever, and perhaps never need to invent ReLU or other activations.
Bringing in non-linearity is an important reason to use ReLU, sigmoid, or other activation functions instead of a linear activation.
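To illustrate why the non-linearity matters, here is a small NumPy check (illustrative only, with random weights of made-up shapes): two stacked layers with no activation collapse into a single linear layer, so the extra layer adds no modelling power until something like ReLU is inserted between them.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))                    # 4 samples, 3 features
W1, b1 = rng.standard_normal((3, 5)), rng.standard_normal(5)
W2, b2 = rng.standard_normal((5, 2)), rng.standard_normal(2)

# Two layers with linear (identity) activation...
two_linear_layers = (x @ W1 + b1) @ W2 + b2
# ...are exactly one linear layer with combined weights.
single_linear_layer = x @ (W1 @ W2) + (b1 @ W2 + b2)
print(np.allclose(two_linear_layers, single_linear_layer))   # True

# Inserting a ReLU between the layers breaks the equivalence.
with_relu = np.maximum(x @ W1 + b1, 0) @ W2 + b2
print(np.allclose(with_relu, single_linear_layer))            # expected: False
```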
ReLU has its own significance, and I compared ReLU and sigmoid in this thread, including a reference to Professor Ng’s DLS video on activation functions. Let me know if any of the points there need more clarification.
Raymond