Hi, I have tried different versions of my implementation using binary_crossentropy, and this is the closest I've got, but I still can't get the correct answer. I've even tried transposing as suggested on the forum, but it didn't help. Please help. Thank you!
Hi @Syed_Hamza_Tehseen ,
Please attach the function in a direct message to me, I will have a look for you.
Hi @Syed_Hamza_Tehseen ,
The problem is that the wrong TF function is used, and also both the labels and logits have not been transposed.
Please see the implementation instructions for ex6.
tf.reduce_sum() is to be used to sum the loss over all the examples, but your code used tf.reduce_mean().
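To illustrate the two fixes together, here is a minimal sketch (the tensor values below are made up for demonstration; the course's layout of logits and labels as (num_classes, num_examples) is taken from the docstring quoted later in this thread):

```python
import tensorflow as tf

# Hypothetical logits/labels in the course's (num_classes, num_examples) layout:
# 3 classes, 2 examples.
logits = tf.constant([[2.0, 1.0], [0.5, 3.0], [1.5, 0.2]])
labels = tf.constant([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])

# Transpose so each row is one example, as categorical_crossentropy expects,
# then SUM (not average) the per-example losses with tf.reduce_sum.
total_loss = tf.reduce_sum(
    tf.keras.losses.categorical_crossentropy(
        tf.transpose(labels), tf.transpose(logits), from_logits=True
    )
)
print(float(total_loss))
```

Note that `from_logits=True` is needed here because the inputs are raw linear outputs, not softmax probabilities.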
Yes, it works now; thank you. By the way, I couldn't find any instruction about transposing the labels and logits in the instruction section of Exercise 6. Can you help me find it? Thank you.
Here is one of the points listed in the implementation instruction:
- It's important to note that the "y_pred" and "y_true" inputs of tf.keras.losses.categorical_crossentropy are expected to be of shape (number of examples, num_classes).
If you read the comments on the input arguments:
Arguments:
logits -- output of forward propagation (output of the last LINEAR unit), of shape (6, num_examples)
labels -- "true" labels vector, same shape as Z3
The logits and labels are not in the shape expected by categorical_crossentropy, so in order for it to work correctly, we need to transpose both of them.
Oh right. Thanks for the assistance
Hello Kic,
Since I couldn't find the required data shape stated in the documentation of tf.keras.losses.categorical_crossentropy, could you please advise how to check what shape is required? Thanks.
Hi @daniel159357 ,
This is an extract from the reference menu that @rmwkwok is referring to:
Use this crossentropy loss function when there are two or more label classes. We expect labels to be provided in a one_hot representation. There should be num_classes floating point values per feature, i.e., the shape of both y_pred and y_true are [batch_size, num_classes].
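A quick way to sanity-check the expected layout is to call the function on small tensors and inspect the result (the values below are made up for illustration):

```python
import tensorflow as tf

# Two examples, three classes: rows are examples, columns are classes,
# i.e. shape [batch_size, num_classes] as the documentation states.
y_true = tf.constant([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
y_pred = tf.constant([[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]])

per_example = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
print(per_example.shape)  # one loss value per example: (2,)
```

The loss is reduced over the last axis (the classes), so getting one value per example back confirms the rows were treated as examples.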