Week 3 - Exercise 6 - Compute Total Loss

Hi, I have tried different versions of my implementation using binary_crossentropy, and this is the closest answer I've got, but I still can't get the correct result. I've even tried transposing as suggested on the forum, but that didn't work either. Please help. Thank you!

Hi @Syed_Hamza_Tehseen ,

Please attach the function in a direct message to me, I will have a look for you.

Hi @Syed_Hamza_Tehseen ,

The problem is that the wrong TF function is used, and also that the labels and logits have not been transposed.

Please see the implementation instructions for Exercise 6.

tf.reduce_sum() should be used to sum the loss over all the examples, but your code used tf.reduce_mean().
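To see the difference, here is a minimal sketch with made-up per-example losses (not the graded code):

```python
import tensorflow as tf

# Hypothetical per-example losses for four examples.
per_example_loss = tf.constant([0.5, 1.0, 1.5, 2.0])

# reduce_sum adds the losses over all examples, as the exercise asks...
total = tf.reduce_sum(per_example_loss)   # 5.0

# ...whereas reduce_mean would average them instead.
mean = tf.reduce_mean(per_example_loss)   # 1.25
```

Summing (rather than averaging) matters here because the training loop in the assignment divides the accumulated total by the number of examples itself.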

Yes, it worked now; thank you. By the way, I couldn’t find any instruction regarding transposing labels and logits in the instruction section of Exercise 6. Can you help me find it? Thank You

Hi @Syed_Hamza_Tehseen

Here is one of the points listed in the implementation instructions:

If you read the comments on the input arguments:

Arguments:
logits -- output of forward propagation (output of the last LINEAR unit), of shape (6, num_examples)
labels -- "true" labels vector, same shape as Z3

It is because the logits and labels are not in the shape expected by categorical_crossentropy, so in order for it to work correctly, we need to transpose both of them.
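A small sketch of that fix; the tensors here are invented for illustration, following the (6, num_examples) convention from the docstring above:

```python
import tensorflow as tf

num_examples = 4
# Hypothetical logits of shape (6, num_examples), as output by the last LINEAR unit.
logits = tf.random.normal((6, num_examples))
# One-hot "true" labels with the same shape, classes along axis 0.
labels = tf.one_hot([0, 2, 5, 1], depth=6, axis=0)   # shape (6, 4)

# categorical_crossentropy expects (batch_size, num_classes),
# so transpose both tensors before calling it.
per_example = tf.keras.losses.categorical_crossentropy(
    tf.transpose(labels), tf.transpose(logits), from_logits=True)

print(per_example.shape)   # (4,) -- one loss value per example
```

Note the from_logits=True argument: the inputs here are raw LINEAR outputs, so the softmax is applied inside the loss function.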


Oh right. Thanks for the assistance

Hello Kic,

As there is no indication of the required data shape in the documentation for tf.keras.losses.categorical_crossentropy, would you please advise how to check the required data shape? Thanks.

Hello @daniel159357,

This one mentioned the shape requirement.

Cheers,
Raymond


Hi @daniel159357 ,

This is an extract from the reference menu that @rmwkwok is referring to:

> Use this crossentropy loss function when there are two or more label classes. We expect labels to be provided in a one_hot representation. There should be num_classes floating point values per feature, i.e., the shape of both y_pred and y_true are [batch_size, num_classes].
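When the docs feel ambiguous, one practical check is to call the loss on a tiny tensor and inspect the output shape: it returns one loss value per row of the batch axis. A quick sketch with invented values:

```python
import tensorflow as tf

# Two examples, three classes: shape [batch_size, num_classes] = [2, 3].
y_true = tf.constant([[0., 1., 0.],
                      [0., 0., 1.]])
y_pred = tf.constant([[0.1, 0.8, 0.1],
                      [0.2, 0.2, 0.6]])

loss = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
print(loss.shape)   # (2,) -- one loss per example, confirming batch is axis 0
```

If you passed the tensors the other way round, shape (3, 2), the function would silently treat 2 as the number of classes, which is exactly the bug the transpose fixes in this exercise.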