Hi everybody, I am doing the TensorFlow programming exercise (Tensorflow_introduction), but I am stuck on this function:
def compute_total_loss(logits, labels):
I am computing the cost in the following manner, but there seems to be an error when I run the test compute_total_loss_test(target, Y):
the test calculates 0.81028707
my function returns 0.8071431
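The code was along these lines (a hypothetical reconstruction for illustration; the exact snippet is not shown here):

import tensorflow as tf

def compute_total_loss(logits, labels):
    # hypothetical reconstruction: averages the cross-entropy over the
    # examples, and leaves from_logits at its default of False
    per_example_loss = tf.keras.losses.categorical_crossentropy(
        tf.transpose(labels), tf.transpose(logits))
    return tf.reduce_sum(per_example_loss) / logits.shape[1]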
Note that the cost here is defined as the total sum of the loss values, not the mean, as it is in some other places. In the first code you showed, you are manually dividing by the number of samples. If you wanted the mean, there is also a reduce_mean function in TF which would make that easier, but the real point is that the mean is not what they are asking us to do here.
The other thing that was missing from your original code shown above is that you need to use the from_logits argument to tell the loss function to apply the softmax activation internally.
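Putting those two fixes together, here is a sketch of what the test appears to expect (the transposes assume the course convention that examples are columns, i.e. both tensors have shape (num_classes, num_examples)):

import tensorflow as tf

def compute_total_loss(logits, labels):
    # categorical_crossentropy expects one example per row, hence the
    # transposes; from_logits=True makes it apply the softmax internally
    per_example_loss = tf.keras.losses.categorical_crossentropy(
        tf.transpose(labels), tf.transpose(logits), from_logits=True)
    # total SUM of the losses, not the mean: no division here
    return tf.reduce_sum(per_example_loss)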
Well, maybe they don’t explicitly say that, but the lectures have been talking about it, right? The test case is doing multiclass classification with 6 classes, right? So sigmoid is not going to work, and softmax is your only hope. Notice that the forward propagation function you defined earlier does not apply an activation function at the output layer, but just gives the linear outputs (“logits”).
That is the way Prof Ng always does it once we switch to using TensorFlow: it is better to let the loss function include the activation calculation internally as part of the loss calculation.
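As a tiny illustration with made-up numbers (not from the assignment), the two routes agree numerically, but letting the loss apply the softmax via from_logits=True is the more numerically stable one:

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])  # raw linear outputs
labels = tf.constant([[1.0, 0.0, 0.0]])  # one-hot target

# let the loss apply the softmax internally:
loss_internal = tf.keras.losses.categorical_crossentropy(
    labels, logits, from_logits=True)

# apply the softmax yourself, then the loss (equivalent, less stable):
loss_manual = tf.keras.losses.categorical_crossentropy(
    labels, tf.nn.softmax(logits))

print(loss_internal.numpy(), loss_manual.numpy())  # both ~0.417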
OK, if I use from_logits=True and drop the division by the number of examples (/logits.shape[1]), it works. It was this sentence:
“you can sum the losses across many batches, and divide the sum by the total number of samples to get the cost value.” that led me to make the mistake. Anyway, thank you for your precious help!
Yes, but what they are explaining there is that this cost function is intended only for an individual minibatch; you compute the mean once you have completed the epoch and have the total sum.
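As a minimal sketch of how that plays out in the training loop (minibatches and parameters are illustrative stand-ins for the names defined elsewhere in the notebook, not the notebook's exact code):

epoch_total_loss = 0.0
num_samples = 0
for (minibatch_X, minibatch_Y) in minibatches:
    # compute_total_loss returns the SUM of the losses for this minibatch
    logits = forward_propagation(minibatch_X, parameters)
    epoch_total_loss += compute_total_loss(logits, minibatch_Y)
    num_samples += minibatch_X.shape[1]  # examples are columns
# the mean is taken only once, at the end of the epoch:
epoch_cost = epoch_total_loss / num_samples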