Hi,
Why didn’t we use the sigmoid function while training the model in the assignment?
We just calculated Z3 and passed it to the function that calculates the cost.
Please help.
We did, but it happens inside the loss function: passing from_logits=True tells the loss function to do both the activation and the loss computation together. That is why the forward pass only needs to produce Z3 (the logits). Doing it this way is both more efficient and more numerically stable. For example, it’s easier to handle things like saturated sigmoid values, which would normally cause NaN results.
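To make the stability point concrete, here is a minimal pure-Python sketch for the binary (sigmoid) case. The function names `naive_bce` and `bce_from_logits` are made up for illustration, and this is not TensorFlow's actual code; it only shows the kind of algebraic rearrangement that becomes possible when the loss is handed the logit directly.

```python
import math

def naive_bce(z, y):
    # Hypothetical two-step version: sigmoid first, then cross entropy.
    a = 1.0 / (1.0 + math.exp(-z))
    # If the sigmoid saturates to exactly 0.0 or 1.0, one of these logs is log(0).
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))

def bce_from_logits(z, y):
    # Same loss, rewritten so it is computed from the logit z directly:
    #   max(z, 0) - z*y + log(1 + exp(-|z|))
    # The exponent is never positive, so nothing overflows or hits log(0).
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

# For moderate logits the two versions agree.
print(naive_bce(2.0, 1), bce_from_logits(2.0, 1))

# For a saturated logit only the combined version still works.
print(bce_from_logits(40.0, 0))
```

With `z = 40.0` and `y = 0`, the sigmoid rounds to exactly 1.0 in floating point, so `naive_bce` ends up taking `log(0)`. Plain Python raises a `ValueError` there, while array libraries would typically give inf or NaN instead.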
Well, note that this is a multiclass case, so it’s not sigmoid, but softmax as the activation. That’s why the loss function is “categorical” cross entropy instead of “binary” cross entropy. But the same mechanism applies in both cases.
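For the multiclass case, the same idea can be sketched with softmax and categorical cross entropy. Again this is only an illustration in plain Python under my own assumptions (the name `categorical_ce_from_logits` and the integer `true_index` label are hypothetical), not the library's implementation.

```python
import math

def categorical_ce_from_logits(logits, true_index):
    # Log-sum-exp trick: subtract the largest logit before exponentiating,
    # so every exponent is <= 0 and exp() cannot overflow.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log(softmax(logits)[true_index]) simplifies to this difference,
    # so the softmax probabilities are never formed explicitly.
    return log_sum_exp - logits[true_index]

# Extreme logits that would overflow a direct exp(1000.0) are handled fine.
print(categorical_ce_from_logits([1000.0, 0.0, -1000.0], 1))

# Uniform logits over 3 classes give a loss of log(3).
print(categorical_ce_from_logits([0.0, 0.0, 0.0], 0))
```

Because the softmax and the log cancel algebraically, the loss only needs the raw logits (Z3 in the assignment), which is the reason the activation is not applied separately in the forward pass.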
They give you a link to the documentation for the cost function in the instructions. It might be worth actually reading it with the above description in mind. What you will find from this point forward is that Prof Ng always uses this method.
well, note that this is a multiclass case, so it’s not sigmoid, but softmax as the activation. That’s why the loss function is “categorical” cross entropy instead of “binary” cross entropy. But the same mechanism applies in both cases.
Yes, that is a quote from my previous reply. Do you have an additional point or question?