Purpose of using numerically accurate implementation of softmax

rmwkwok · March 4, 2024, 5:02am

Yup, @Bio_J: it addresses some of the very small round-off error and, when -z becomes large, the overflow problem in e^{-z}.

I said "some of" the very small round-off error because, as I explained in this post, the “numerically accurate implementation” gives us a mathematical simplification from

to

And we still have one exponential term left.
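To make this concrete, here is a minimal NumPy sketch. The two expressions in the simplification are not reproduced here, so the pair used below is an assumption on my part: -log(1 / (1 + e^{-z})) as the "before" form and log(1 + e^{-z}) as the "after" form. The function names are hypothetical, and using `np.log1p` for the second form is my own choice for the illustration, not something the course or TensorFlow is confirmed to do.

```python
import numpy as np

def loss_two_step(z):
    """Assumed 'before' form: compute the sigmoid first, then take -log of it."""
    a = 1.0 / (1.0 + np.exp(-z))   # rounding happens here ...
    return -np.log(a)              # ... and again here

def loss_simplified(z):
    """Assumed 'after' form: log(1 + e^{-z}), with a single exponential left."""
    return np.log1p(np.exp(-z))    # log1p(x) evaluates log(1 + x)

# For a moderate z, the two forms agree to floating-point tolerance.
print(loss_two_step(2.0), loss_simplified(2.0))

# For a larger z, e^{-z} is below float64 resolution next to 1, so the
# two-step form rounds the sigmoid to exactly 1 and returns zero loss.
print(loss_two_step(40.0), loss_simplified(40.0))

# The remaining exponential can still overflow when -z is large.
with np.errstate(over="ignore"):
    exp_overflow = np.exp(np.float64(800.0))  # this is e^{-z} for z = -800
print(exp_overflow)
```

In this sketch the simplified form keeps a tiny positive loss at z = 40 where the two-step form collapses to zero, which is the kind of small round-off I mean. The last line shows that the one exponential that remains still overflows to inf in float64 when -z is large, so the simplification alone does not cover the overflow case.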

Cheers,
Raymond
