Why are there two different loss functions, (a - y) vs. (y log(a) + (1 - y) log(1 - a)), and when do we use each? See the image clipping below.
Thanks!
Hi, @Billu, and welcome to the Specialization.
Looking at your intro paragraph first, please note that a - y is not a loss function. You may have confused it with the derivative of the cross-entropy loss function (the expression after the "vs.") with respect to z.
The two loss functions in the clipping are the mean squared error (MSE) loss and the cross-entropy loss, respectively. The first is fine for linear regression, but not for logistic regression. The MSE loss for linear regression is nicely convex, i.e. it has a unique minimum.
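For completeness, here is the one-line chain-rule step that produces that derivative (assuming a = \sigma(z) is the sigmoid output and L is the per-example cross-entropy loss):

L = -\big(y\log a + (1-y)\log(1-a)\big), \qquad \frac{\partial L}{\partial z} = \frac{\partial L}{\partial a}\cdot\frac{\partial a}{\partial z} = \frac{a-y}{a(1-a)}\cdot a(1-a) = a - y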
But for binary classification using logistic regression, MSE loss is not convex and can have a number of local minima. By contrast, the cross-entropy loss function has a unique global minimum.
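To make the comparison concrete, here is a small numpy sketch computing both losses on the same toy labels (the prediction values in a are made up, just for illustration):

```python
import numpy as np

# Toy binary labels and sigmoid outputs (hypothetical values)
y = np.array([1, 0, 1, 1])
a = np.array([0.9, 0.2, 0.7, 0.4])

# Mean squared error: fine for linear regression,
# but non-convex when a comes from a sigmoid in logistic regression
mse = np.mean((a - y) ** 2)

# Cross-entropy loss: convex in the logistic-regression setting
cross_entropy = -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))

print("MSE:", mse, "Cross-entropy:", cross_entropy)
```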
Thanks for your response, kenb! I guess what I really meant to say is that I was confused about why we practiced L1 loss in the Python Basics practice exercise in Week 2 if we had determined that it is not convex.
But you’ve resolved my confusion.
In the Neural Networks and Deep Learning Week 2 assignment, where we implement the loss using the log function, I need some help. The labels Y are in an array, with each of the m elements containing 1 or 0. Are we expected to compute the loss for each of the m cases individually, given that the expressions are different:
-log(1 - yhat_i) when y = 0, and -log(yhat_i) when y = 1?
No. There is a combined formula, and that is what we implement:
J = -\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\log\big(\hat{y}^{(i)}\big)+\big(1-y^{(i)}\big)\log\big(1-\hat{y}^{(i)}\big)\right)
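A single vectorized numpy expression covers both cases, because for each example only one of the two terms is nonzero. Here is a minimal sketch (the function name and the eps safeguard are my own, not the assignment's notation):

```python
import numpy as np

def cross_entropy_cost(Y, Y_hat, eps=1e-12):
    """Cross-entropy cost averaged over all m examples.

    Y     -- true labels (0 or 1), shape (1, m) or (m,)
    Y_hat -- predicted probabilities, same shape as Y
    eps   -- small constant to avoid log(0); an extra safeguard, not part of the formula
    """
    m = Y.size
    cost = -(1 / m) * np.sum(
        Y * np.log(Y_hat + eps) + (1 - Y) * np.log(1 - Y_hat + eps)
    )
    return cost

# Example with hypothetical values
Y = np.array([[1, 0, 1]])
Y_hat = np.array([[0.8, 0.1, 0.6]])
print(cross_entropy_cost(Y, Y_hat))
```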