Problem with Course 2 Week 3 Assignment

Hello! I have a question about Exercise 6 of the Course 2, Week 3 assignment. I would really appreciate it if you could share some knowledge with me:

I used this line of code, but the result was only close to the expected value:
total_loss = tf.reduce_sum(tf.keras.losses.categorical_crossentropy(labels,logits))

It resulted in the following:
AssertionError: Test does not match. Did you get the reduce sum of your loss functions?

I guess I’m unfamiliar with how to use tf.keras.losses.categorical_crossentropy(). Please give me a hint!

Thank you.

Here are a few things to look for when it comes to categorical_crossentropy:

  1. from_logits flag.
  2. Shape of y_pred and y_true. Both of them should be of shape (num examples, num classes)
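
To make the two points above concrete, here is a minimal sketch. The toy values are made up, and it assumes your tensors are stored as (num classes, num examples), which may or may not match your notebook. Whether from_logits should be True depends on whether your output layer already applied an activation, so treat that argument as something to check rather than as the answer.

```python
import tensorflow as tf

# Hypothetical toy data: 6 classes, 2 examples, stored as (num_classes, num_examples).
logits = tf.constant([[2.0, 0.1],
                      [0.5, 0.3],
                      [0.1, 1.5],
                      [0.2, 0.2],
                      [0.3, 0.4],
                      [0.1, 0.6]])
labels = tf.constant([[1.0, 0.0],
                      [0.0, 0.0],
                      [0.0, 1.0],
                      [0.0, 0.0],
                      [0.0, 0.0],
                      [0.0, 0.0]])

# categorical_crossentropy expects (num_examples, num_classes), so transpose both.
y_true = tf.transpose(labels)   # shape (2, 6)
y_pred = tf.transpose(logits)   # shape (2, 6)

# One loss value per example; from_logits=True assumes y_pred holds raw linear outputs.
per_example = tf.keras.losses.categorical_crossentropy(y_true, y_pred, from_logits=True)
total_loss = tf.reduce_sum(per_example)

print(y_true.shape, y_pred.shape, per_example.shape)
```

If the printed shapes are not (2, 6), (2, 6) and (2,), the loss is being computed along the wrong axis.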

Here’s a thread that goes into more detail about the from_logits parameter and what that means.


Thank you for taking the time to reply, much appreciated! But I still haven’t solved the problem:

  1. Should I set it to True? The documentation shows that the default is False (post-activation), so I assumed I should leave it as False.
  2. I have checked the shapes. I have 6 classes and 2 examples, so the shape should be (2, 6), yet it still gave the wrong result (with from_logits=False).

Thank you again!

Thank you for reading my message! It is highly appreciated!

Update: I have set the parameter to True and it passed, yet I still don’t get the idea. Can you help me understand it more intuitively?

Notice that there is no activation function on the output layer for our forward propagation logic here. So the output is “logits”, meaning linear outputs, not activation outputs. That is why you need “True” as the value of from_logits to tell the cost function it is getting logits and needs to internally apply the activation function.
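
If it helps the intuition, here is a rough pure-Python sketch of the math for a single example. It is hand-rolled for illustration and is not how Keras implements the loss internally; the function names and toy values are made up. The point is that logits are arbitrary real numbers, while cross entropy needs probabilities, so softmax has to be applied somewhere. With from_logits=True you are asking the loss function to do that step for you.

```python
import math

def softmax(logits):
    """Turn raw linear outputs into probabilities that sum to 1."""
    biggest = max(logits)
    exps = [math.exp(z - biggest) for z in logits]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y_true, probs):
    """Cross entropy between a one-hot label and a probability vector."""
    return -sum(t * math.log(p) for t, p in zip(y_true, probs) if t > 0)

# Hypothetical logits for one example: they can be negative and do not sum to 1.
logits = [2.0, -1.0, 0.5]
y_true = [1.0, 0.0, 0.0]

probs = softmax(logits)              # the step the loss performs when told it gets logits
loss = cross_entropy(y_true, probs)

print("sum of logits:", sum(logits))
print("sum of probs: ", sum(probs))
print("loss:         ", loss)
```

Feeding the raw logits straight into cross_entropy would not make sense here: math.log fails on the negative entry, and the values are not probabilities in the first place.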

The other thread that I linked earlier also talks about this in some detail and explains the mathematical reasons why it is better to do it this way.

Just chiming in here, but I think it’s worth noting that just because the default value for an argument is False does not mean that False is the correct answer. Reading the documentation to learn what it means for that argument to be either True or False is crucial.