Decision trees entropy and logistic regression loss

newtonian · July 13, 2023, 4:31am

Prof. Andrew Ng mentions that there is a good mathematical reason why the logistic regression loss formula has a form similar to the entropy formula for Decision Trees. Can someone please explain that reasoning or point me to good resources for understanding the mathematical derivation?

Thanks.

TMosh · July 13, 2023, 9:37pm

This presentation might be helpful.

newtonian · July 14, 2023, 12:15am

Thank you @TMosh for the link. It makes clear, with a few examples, how information gain and entropy are used in Decision Trees. What is not clear to me yet is why the loss function for logistic regression takes a similar form. For one, the entropy curve has a single maximum rather than a single minimum. How would gradient descent find a minimum on such a curve? (Unless the logistic loss curve is inverted compared to the entropy curve.)

TMosh · July 14, 2023, 12:35am

Maximum and minimum are only algebraic differences. You can convert between convex and concave curves easily, by subtracting from 1, or multiplying by -1, depending on the situation.
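
One way to see the connection numerically is the rough sketch below. It assumes a node (or dataset) where a fraction p of the labels is 1 and a model that predicts the same probability q for every example. The function names, the value p = 0.3, and the grid of candidate q values are illustrative choices, not anything from the course. Log base 2 is used for both quantities; a different base only rescales the curves.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a set where a fraction p of the labels is 1."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def expected_log_loss(p, q):
    """Average logistic loss in bits when a fraction p of the labels is 1
    and the model predicts probability q for every example."""
    return -p * math.log2(q) - (1 - p) * math.log2(1 - q)

p = 0.3  # hypothetical fraction of positive labels

# Scan predictions q = 0.01 ... 0.99 and keep the one with the lowest loss.
candidates = [i / 100 for i in range(1, 100)]
best_q = min(candidates, key=lambda q: expected_log_loss(p, q))

print("best prediction q:", best_q)
print("loss at q = p    :", expected_log_loss(p, p))
print("entropy H(p)     :", binary_entropy(p))
```

Under these assumptions the loss, viewed as a function of the prediction q, has a single minimum at q = p, and its value there is the entropy of the labels. The entropy curve with its single maximum is a function of p, the label mix, so the two curves are plotted against different variables. Gradient descent works on the loss as a function of the model's prediction, not on the entropy curve.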

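The purpose of the log() function is to make the penalty grow steeply as the errors increase: the loss is -log(p), where p is the probability the model assigns to the true label, and it grows without bound as p approaches 0.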
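
To make the penalty growth concrete, here is a minimal sketch of the per-example logistic loss using the natural log. The function name and the sample predictions are only illustrative.

```python
import math

def logistic_loss(y, f):
    """Per-example logistic loss for label y (0 or 1) and predicted
    probability f, with 0 < f < 1."""
    return -y * math.log(f) - (1 - y) * math.log(1 - f)

# True label is 1; the prediction gets progressively worse.
for f in (0.9, 0.5, 0.1, 0.01, 0.001):
    print(f"prediction {f}: loss {logistic_loss(1, f):.3f}")
```

A confident correct prediction costs almost nothing, while a confident wrong one is penalised heavily, and the penalty has no upper bound as the predicted probability of the true label approaches 0.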
