Why do we not use hyperbolic tan function?

Tevin_Abeysekera · September 4, 2023, 6:53pm

When doing the lab for week 3, I realized that the sigmoid function, 1 / (1 + e^(-x)), is nearly the same as adding one to the hyperbolic tangent function and dividing the result by 2. I plotted both functions in Desmos and they look nearly identical.

TMosh · September 4, 2023, 7:08pm

tanh() is used in some situations, such as when the output is a real number (instead of a classification).

The gradients of sigmoid() are slightly easier to compute, mathematically.

Tevin_Abeysekera · September 4, 2023, 7:37pm

Oh, okay!

Is that also why the sigmoid function has an e in the denominator rather than some other number, because it makes the derivative easier to get?

TMosh · September 4, 2023, 7:46pm

One characteristic of the sigmoid() function is that its partial derivative is very easy to compute.
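To make the "easy derivative" point concrete, here is a minimal sketch in plain Python. It relies on the standard identity sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)), so the gradient can be computed from the activation value alone. The helper names (`sigmoid_grad`, `numeric_grad`) are made up for illustration and are not from the course lab.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of sigmoid, written in terms of sigmoid itself."""
    s = sigmoid(x)
    return s * (1.0 - s)

def numeric_grad(f, x, h=1e-6):
    """Central-difference estimate of f'(x), used only as a sanity check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

if __name__ == "__main__":
    for x in (-2.0, 0.0, 1.3):
        # The analytic and numeric values should agree closely.
        print(x, sigmoid_grad(x), numeric_grad(sigmoid, x))
```

Because the derivative reuses the value already computed in the forward pass, backpropagation through a sigmoid unit needs no extra call to exp().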

paulinpaloalto · September 4, 2023, 7:51pm

Also note that tanh and sigmoid are very closely related mathematically. The primary reason to choose one over the other is what you need the range of the function to be: for the output of a binary classifier, you need (0,1), but for a hidden layer in a network, you may find the range (-1,1) gives better convergence. Or not. There is no “one size fits all” solution for hidden layer activations.
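As a rough illustration of how closely the two are related, the sketch below checks the identities tanh(x) = 2 * sigmoid(2x) - 1 and sigmoid(x) = (1 + tanh(x/2)) / 2 numerically. It also follows that (tanh(x) + 1) / 2 equals sigmoid(2x), which is why the two curves in the original question look alike without matching exactly. The function names are hypothetical helpers, not part of any course code.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_via_sigmoid(x):
    """tanh expressed as a rescaled, shifted sigmoid."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

def sigmoid_via_tanh(x):
    """sigmoid expressed through tanh with a halved input."""
    return 0.5 * (1.0 + math.tanh(0.5 * x))

if __name__ == "__main__":
    for x in (-3.0, -0.5, 0.0, 0.7, 2.5):
        # Each pair of values should agree up to floating-point rounding.
        print(x, math.tanh(x), tanh_via_sigmoid(x))
        print(x, sigmoid(x), sigmoid_via_tanh(x))
```

So the two functions have the same S-shape; they differ only in input scaling and in output range, (0,1) versus (-1,1).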

Tevin_Abeysekera · September 4, 2023, 8:10pm

Thanks for the link
