Why is the sigmoid activation function better for binary classification than the tanh activation function?

Sigmoid gives a value between 0 and 1, and we can use a threshold of 0.5 to round the output to 0 or 1, which makes it a good binary classifier.

But isn't it the same with tanh, which gives a value between -1 and 1? We could use 0 as the threshold and round the output to -1 or 1.

One of the justifications for the answer in the Week 3 quiz says: "Tanh is less convenient as the output is between -1 and 1." I don't understand why.

What cost function would you use if tanh were your output activation? The cross-entropy log loss that we use is only defined for outputs in the range 0 to 1, because it takes the logarithm of both the prediction and one minus the prediction.
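For reference, the binary cross-entropy loss for a single prediction $\hat{y}$ and label $y \in \{0, 1\}$ is

$$\mathcal{L}(\hat{y}, y) = -\big[\, y \log \hat{y} + (1 - y) \log(1 - \hat{y}) \,\big]$$

so a negative $\hat{y}$ from tanh would put a negative number inside $\log \hat{y}$, which is undefined.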

If your response is "well, we could shift and scale tanh to have the range (0, 1)", then guess what? It turns out tanh and sigmoid are very closely related mathematically, so you don't really gain any advantage from that strategy; see the check below.
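Concretely, $\tanh(z) = 2\sigma(2z) - 1$, so rescaling tanh into (0, 1) via $(\tanh(z) + 1)/2$ just gives back a sigmoid evaluated at $2z$. Here is a minimal numerical sketch of that identity using NumPy:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5, 5, 101)

# Shift and scale tanh from (-1, 1) into (0, 1) ...
shifted_tanh = (np.tanh(z) + 1) / 2

# ... and it matches a sigmoid with a doubled input, sigma(2z).
print(np.allclose(shifted_tanh, sigmoid(2 * z)))  # prints: True
```

In other words, the "rescaled tanh" classifier is just a sigmoid classifier with its input scaled by 2, so nothing is gained.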


Thanks for the detailed answer!