Hi everybody, I am doing the TensorFlow programming exercise (Tensorflow_introduction), but I am stuck on this function:
def compute_total_loss(logits, labels):
I am computing the cost in the following manner, but there seems to be an error when I run the test compute_total_loss_test(target, Y):
the test calculates 0.81028707
my function returns 0.8071431
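The code was along these lines (a hypothetical reconstruction for illustration; the exact snippet is not shown here):

import tensorflow as tf

def compute_total_loss(logits, labels):
    # hypothetical reconstruction: averages the cross-entropy over the
    # examples, and leaves from_logits at its default of False
    per_example_loss = tf.keras.losses.categorical_crossentropy(
        tf.transpose(labels), tf.transpose(logits))
    return tf.reduce_sum(per_example_loss) / logits.shape[1]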
Note that the cost here is defined as the total sum of the loss values, not the mean, as it is in some other places. In the first code you showed, you are manually dividing by the number of samples. If you wanted the mean, there is also a reduce_mean function in TF which would make that easier, but the real point is that the mean is not what they are asking us to do here.
The other thing that was missing from your original code shown above is that you need to use the from_logits argument to tell the loss function to apply the softmax activation internally.
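Putting those two fixes together, here is a sketch of what the test appears to expect (the transposes assume the course convention that examples are columns, i.e. both tensors have shape (num_classes, num_examples)):

import tensorflow as tf

def compute_total_loss(logits, labels):
    # categorical_crossentropy expects one example per row, hence the
    # transposes; from_logits=True makes it apply the softmax internally
    per_example_loss = tf.keras.losses.categorical_crossentropy(
        tf.transpose(labels), tf.transpose(logits), from_logits=True)
    # total SUM of the losses, not the mean: no division here
    return tf.reduce_sum(per_example_loss)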
Well, maybe they don’t explicitly say that, but the lectures have been talking about it, right? The test case is doing multiclass classification with 6 classes, right? So sigmoid is not going to work, and softmax is your only hope. Notice that the forward propagation function you defined earlier does not apply an activation function at the output layer, but just gives the linear outputs (“logits”).
That is the way Prof Ng always does it once we switch to using TensorFlow: it is better to let the loss function include the activation calculation internally as part of the loss calculation.
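As a tiny illustration with made-up numbers (not from the assignment), the two routes agree numerically, but letting the loss apply the softmax via from_logits=True is the more numerically stable one:

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])  # raw linear outputs
labels = tf.constant([[1.0, 0.0, 0.0]])  # one-hot target

# let the loss apply the softmax internally:
loss_internal = tf.keras.losses.categorical_crossentropy(
    labels, logits, from_logits=True)

# apply the softmax yourself, then the loss (equivalent, less stable):
loss_manual = tf.keras.losses.categorical_crossentropy(
    labels, tf.nn.softmax(logits))

print(loss_internal.numpy(), loss_manual.numpy())  # both ~0.417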
OK, if I use from_logits=True and drop the division by the number of examples (/logits.shape[1]), it works. It was this sentence:
“you can sum the losses across many batches, and divide the sum by the total number of samples to get the cost value.” that led me to make the mistake. Anyway, thank you for your precious help!
Yes, but what they are explaining there is that this cost function is intended only for an individual minibatch; you compute the mean once you have completed the epoch and have the total sum.
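As a minimal sketch of how that plays out in the training loop (minibatches and parameters are illustrative stand-ins for the names defined elsewhere in the notebook, not the notebook's exact code):

epoch_total_loss = 0.0
num_samples = 0
for (minibatch_X, minibatch_Y) in minibatches:
    # compute_total_loss returns the SUM of the losses for this minibatch
    logits = forward_propagation(minibatch_X, parameters)
    epoch_total_loss += compute_total_loss(logits, minibatch_Y)
    num_samples += minibatch_X.shape[1]  # examples are columns
# the mean is taken only once, at the end of the epoch:
epoch_cost = epoch_total_loss / num_samples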