I’m not sure how we could have gotten to the partial derivative shown at 3:56 in the lecture “Classification with Perceptron - Calculating the derivatives”.
The problem is the signs. I don’t see how we can get to the plus sign between the two terms in the \partial L / \partial \hat{y} partial derivative shown in the video, starting from L(y, \hat{y}).
Could someone help?
It’s the Chain Rule, right? If we have:
f(1- \hat{y}) = ln(1 - \hat{y})
Then:
f'(1 - \hat{y}) = \displaystyle \frac {1}{1 - \hat{y}} \frac {d}{d \hat{y}} (1 - \hat{y}) = \frac {1}{1 - \hat{y}} (-1) = - \frac {1}{1 - \hat{y}}
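Putting the pieces together: assuming the lecture uses the standard log-loss L(y, \hat{y}) = -\left[ y \ln(\hat{y}) + (1 - y) \ln(1 - \hat{y}) \right] (stated here as an assumption about the video, though it is the usual form), the (-1) from the Chain Rule cancels the outer minus on the second term, which is exactly where the plus sign comes from:

\displaystyle \frac {\partial L}{\partial \hat{y}} = -\left[ \frac {y}{\hat{y}} + (1 - y) \cdot \frac {-1}{1 - \hat{y}} \right] = -\frac {y}{\hat{y}} + \frac {1 - y}{1 - \hat{y}}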
Thanks @paulinpaloalto. Could you possibly help me to understand, given that the derivative of \ln(x) is:

\displaystyle \frac {d}{dx} \ln(x) = \frac {1}{x}
I don’t see the chain rule in there anywhere.
I learned that the derivative of ln(x) is just 1/x, and so I don’t understand why here we should also apply the chain rule.
The point is that the argument to ln is not x, right? It’s (1 - \hat{y}). So you could write this as:
f(x) = ln(x)
g(x) = (1 - x)
h(x) = f(g(x)) = ln(1 - x)
That’s the point. It’s a composite function, which is why the Chain Rule applies. That gives us this:
h'(x) = f'(g(x)) g'(x) = \displaystyle \frac {1}{1 - x} g'(x) = \displaystyle \frac {1}{1 - x} (-1)
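If it helps to see it numerically, here is a quick sanity check (my own sketch, not from the lecture): a central finite difference of h(x) = \ln(1 - x) should agree with the Chain Rule result h'(x) = -\frac{1}{1 - x}.

```python
import math

def h(x):
    # The composite function h(x) = ln(1 - x)
    return math.log(1 - x)

def h_prime_analytic(x):
    # Chain Rule result: h'(x) = -1 / (1 - x)
    return -1.0 / (1 - x)

x = 0.3       # any test point with x < 1
eps = 1e-6    # small step for the finite difference

# Central difference approximation of h'(x)
numeric = (h(x + eps) - h(x - eps)) / (2 * eps)

print(numeric)                # approximately -1.42857
print(h_prime_analytic(x))    # -1/(1 - 0.3) = -1.42857...
```

If you drop the Chain Rule and use the bare 1/(1 - x), the sign of the numerical estimate won’t match, which is a nice way to convince yourself the extra (-1) really belongs there.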
OK, thank you, that makes sense.