DLS 2 week 3 exercise 6 compute_cost

In the compute_cost function I used
cost = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(labels, logits))

but I am getting an error. Can someone help me find where I made a mistake?


Hello @sandra_jayan,

Please note the shape of the arguments logits and labels as received by the compute_cost function and the shape expected by tf.keras.losses.categorical_crossentropy; you may need to adjust them. Also be aware of the from_logits parameter in the call to categorical_crossentropy.


I have tried to reshape it but it didn’t work at all.
I have even tried setting from_logits=True.

logits = tf.cast(tf.reshape(logits,(logits.shape[1],logits.shape[0])),dtype=tf.float64)
labels = tf.cast(tf.reshape(labels,(labels.shape[1],labels.shape[0])),dtype=tf.float64)
cost = tf.reduce_mean(tf.keras.losses.categorical_crossentropy( y_true=labels, y_pred=logits))

Hi @muhammadahmad,

According to the documentation of compute_cost, the arguments you receive have shape (num_classes, num_examples), but you need to pass tensors of shape (num_examples, num_classes) to tf.keras.losses.categorical_crossentropy.

Notice that reshape gives you the correct dimensions but does not transpose rows and columns, which is what this case needs. Try the TensorFlow function designed for that operation instead.
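To illustrate the difference, here is a minimal sketch on a toy tensor. The values and variable names are made up for illustration and are not the assignment's data; it only assumes the (num_classes, num_examples) layout described above.

```python
import tensorflow as tf

# Toy tensor laid out as (num_classes, num_examples) = (3, 2).
# Column 0 holds the scores of example A, column 1 those of example B.
x = tf.constant([[1, 2],
                 [3, 4],
                 [5, 6]])

# reshape keeps the flat element order and only regroups it,
# so values from different examples end up mixed in one row.
reshaped = tf.reshape(x, (2, 3))

# transpose swaps the axes, so each row is exactly one example.
transposed = tf.transpose(x)

print(reshaped.numpy().tolist())    # [[1, 2, 3], [4, 5, 6]]
print(transposed.numpy().tolist())  # [[1, 3, 5], [2, 4, 6]]
```

Both results have shape (2, 3), which is why the reshape version runs without a shape error yet produces the wrong cost.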

Hope that helps.



Thanks, it worked after taking the transpose.

But the expected output mentioned in the assignment is not correct.


Thanks a lot, it worked after adding from_logits. I had also forgotten to take the transpose in the code above.

All the exercises passed but I only got 80. I couldn’t find what I did wrong.


Also, be careful about the order of the categorical_crossentropy parameters. The first should be the true labels, and the predictions come second. I spent 2 desperate hours before I noticed this :slight_smile:
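Why the order matters can be sketched in plain Python: cross-entropy is not symmetric in its two arguments. The helper below is a hypothetical stand-in written for this illustration, not the Keras implementation, and the clipping constant EPS is an assumption chosen only to avoid log(0).

```python
import math

EPS = 1e-7  # illustrative clipping constant (assumption), avoids log(0)

def cross_entropy(y_true, y_pred):
    """Cross-entropy -sum(t * log(p)) for one example; y_pred holds probabilities."""
    return -sum(t * math.log(min(max(p, EPS), 1 - EPS))
                for t, p in zip(y_true, y_pred))

label = [0.0, 1.0, 0.0]   # one-hot true label
probs = [0.1, 0.7, 0.2]   # predicted probabilities

right = cross_entropy(label, probs)    # labels first: equals -log(0.7)
swapped = cross_entropy(probs, label)  # arguments swapped: a much larger value

print(right, swapped)
```

With the arguments swapped, the zeros of the one-hot label land inside the logarithm, so the result is inflated and no longer measures how good the prediction is.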


What does adding “from_logits = True” mean here? I know that it works, but I'm not sure why. Thank you for your help!

Hi @Nanyin,

from_logits=True indicates that y_pred is not normalized (i.e. does not come from a softmax).
If you keep the default option from_logits=False, the function assumes that y_pred is already a probability distribution.
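A minimal sketch of the two options, using made-up toy values already in the (num_examples, num_classes) layout. It is only meant to show that the two routes agree, not to reproduce the assignment.

```python
import tensorflow as tf

# Toy data: 2 examples, 3 classes (values are illustrative only).
labels = tf.constant([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 0.0]])
logits = tf.constant([[1.0, 3.0, 0.5],
                      [2.0, -1.0, 0.0]])

# Option A: pass the raw scores and let the loss apply softmax internally.
loss_a = tf.keras.losses.categorical_crossentropy(
    labels, logits, from_logits=True)

# Option B: apply softmax yourself, then keep the default from_logits=False.
probs = tf.nn.softmax(logits, axis=-1)
loss_b = tf.keras.losses.categorical_crossentropy(labels, probs)

# Both routes give the same per-example losses up to floating-point error.
print(loss_a.numpy(), loss_b.numpy())
```

Passing raw logits while leaving from_logits=False matches neither route, because the scores are then treated as if they were probabilities.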

Check the link in this post for the documentation of the categorical_crossentropy function.


My god, thank you. I’ve been going crazy for about half an hour because my output is ridiculously large, more than 100, and I was just wondering why… Really helpful :dizzy_face:


Thanks very much.
I added from_logits=True and first transposed the two input parameters, labels and logits, and it passed.


I have used all of the recommendations (transposing, order of arguments, from_logits=True, taking the cross-entropy and then reduce_mean), but I am getting the strange error shown below. Any suggestions?

FIXED: I had run all of the cells above one at a time while working on the assignment, but then used “Run All Above” in the Cell menu from the ‘grading cell’ to run them all again, and that fixed the problem.

NameError Traceback (most recent call last)
17 print("\033[92mAll test passed")
---> 19 compute_cost_test(compute_cost, new_y_train)

NameError: name 'new_y_train' is not defined

Hi Tony,

Welcome to the community!

You have probably missed running the cell that defines new_y_train. I suggest the following: save your work -> go to Kernel -> restart and clear all outputs -> save your work again -> run all the cells from the beginning with Shift+Enter.

When you skip a cell, the variables it defines do not exist for the later cells, which then throw an error.

Happy Learning!

It makes no sense that the importance of “from_logits” is not even mentioned; in fact, categorical cross-entropy itself is not mentioned in the lectures.


It is covered in ML Specialization - C2-W2

new_y_test = y_test.map(one_hot_matrix)
new_y_train = y_train.map(one_hot_matrix)

This cell doesn’t run successfully; it fails with errors that then cause the error in compute_cost. Any help with this, please?

After running the cell
new_y_test = y_test.map(one_hot_matrix)
new_y_train = y_train.map(one_hot_matrix)

I get an error. Please help.

This worked, thank you! I was racking my brain for 30 mins!


But that is not in this course.

