# Week 3 - Exercise 6 - Compute Total Loss

Hi, I have tried different versions of the implementation of binary_crossentropy, and this is the closest answer I’ve got. I can’t get the correct answer. I’ve even tried transposing as suggested on the forum, but I still couldn’t get the right answer. Please help. Thank You!

Please attach the function in a direct message to me, I will have a look for you.

The problem is that the wrong TF function is used, and also that both the labels and logits have not been transposed.

Please see the implementation instructions for ex6.

tf.reduce_sum() is to be used to sum over all the examples, but your code used tf.reduce_mean().

Yes, it worked now; thank you. By the way, I couldn’t find any instruction regarding transposing labels and logits in the instruction section of Exercise 6. Can you help me find it? Thank You

Here is one of the points listed in the implementation instruction:

• It’s important to note that the “`y_pred`” and “`y_true`” inputs of tf.keras.losses.categorical_crossentropy are expected to be of shape (number of examples, num_classes).

```
Arguments:
logits -- output of forward propagation (output of the last LINEAR unit), of shape (6, num_examples)
labels -- "true" labels vector, same shape as Z3
```

It is because the logits and labels are not in the shape expected by categorical_crossentropy, so in order for it to work correctly, we need to transpose both of them.
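Putting the two points together (transpose both inputs, then sum rather than average), a sketch of the loss computation could look like the following. Note that `from_logits=True` is assumed here because the logits are the raw output of the last LINEAR unit, not softmax probabilities:

```python
import tensorflow as tf

def compute_total_loss(logits, labels):
    """Sum of the categorical cross-entropy losses over all examples.

    logits -- output of the last LINEAR unit, shape (num_classes, num_examples)
    labels -- one-hot "true" labels, same shape as logits
    """
    # Transpose so both tensors are (num_examples, num_classes),
    # the shape that categorical_crossentropy expects.
    per_example_loss = tf.keras.losses.categorical_crossentropy(
        tf.transpose(labels), tf.transpose(logits), from_logits=True
    )
    # Sum over all examples (not the mean), as the exercise instructions require.
    return tf.reduce_sum(per_example_loss)
```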


Oh right. Thanks for the assistance

Hello Kic,

As there is no indication of the required data shape in the documentation for tf.keras.losses.categorical_crossentropy, could you please advise how to check the required data shape? Thanks.

Hello @daniel159357,

This one mentioned the shape requirement.

Cheers,
Raymond


This is an extract from the reference documentation that @rmwkwok is referring to:

> Use this crossentropy loss function when there are two or more label classes. We expect labels to be provided in a `one_hot` representation.
> There should be `num_classes` floating point values per feature, i.e., the shape of both `y_pred` and `y_true` are `[batch_size, num_classes]`.
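You can also verify the shape contract directly in a throwaway snippet (the tensor values below are made up just to illustrate it): with inputs of shape `[batch_size, num_classes]`, the function returns one loss value per example.

```python
import tensorflow as tf

# 4 examples, 3 classes -> both inputs are (4, 3).
y_true = tf.one_hot([0, 2, 1, 2], depth=3)        # (4, 3) one-hot labels
y_pred = tf.nn.softmax(tf.random.normal((4, 3)))  # (4, 3) predicted probabilities

per_example = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
print(per_example.shape)  # one loss per example: (4,)
```

If you pass the tensors in `(num_classes, num_examples)` orientation instead, the last axis is treated as the class axis, which is why the assignment's logits and labels must be transposed first.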