I think I have spotted an error; if it is not an error, then I must be missing something.
In the programming assignment of Course 2, Week 3, where we implement the cost function, the text preceding the assignment notes that the inputs of tf.keras.losses.categorical_crossentropy are expected to be of shape (number of examples, number of classes). But the output of Z3 is of shape (number of classes, number of examples). I think the appropriate shape for both labels and logits should be (number of classes, number of examples).
Also, although I have successfully applied tf.reduce_sum and tf.keras.losses.categorical_crossentropy, my result is around 0.88, as against 0.81 (the correct answer).
I would really appreciate comments. Thank you in advance.
But that is not what the cost function expects, which is the point of those comments in the assignment. So how would one deal with that? Have you considered applying the “transpose” operation to the labels and logits before passing them to the cost function?
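To make the transpose idea concrete, here is a minimal sketch with made-up toy tensors (the shapes mirror the assignment's layout, but the numbers are invented for illustration): logits and labels start out as (number of classes, number of examples), so both get transposed before being handed to the loss function.

```python
import tensorflow as tf

# Toy data: 4 classes, 2 examples, laid out (num_classes, num_examples)
# the way Z3 and Y come out in the assignment. Values are arbitrary.
logits = tf.constant([[2.0, 1.0],
                      [1.0, 0.5],
                      [0.1, 3.0],
                      [0.3, 0.2]])   # shape (4, 2)
labels = tf.constant([[1.0, 0.0],
                      [0.0, 0.0],
                      [0.0, 1.0],
                      [0.0, 0.0]])   # shape (4, 2), one-hot per column

# categorical_crossentropy expects (num_examples, num_classes),
# so transpose both tensors first.
losses = tf.keras.losses.categorical_crossentropy(
    tf.transpose(labels), tf.transpose(logits), from_logits=True)

# One loss value per example; reduce to a scalar cost.
cost = tf.reduce_sum(losses)
print(losses.shape)   # (2,)
```

With the transpose in place, the loss function sees one example per row, which is exactly the layout its documentation describes.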
Also note that the Z3 outputs are "logits" and not softmax output values. So you have two choices: 1) apply softmax yourself, or 2) use the from_logits argument to tell the cost function to do that computation internally. Option 2) is the better choice, which is why the given code does not include the softmax logic.
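The two options can be compared side by side. This is just an illustrative sketch with invented values, using the (number of examples, number of classes) layout the loss function expects; both paths should agree numerically, but option 2 is more numerically stable because the softmax and the log are fused internally.

```python
import tensorflow as tf

# Arbitrary toy logits/labels, already in (num_examples, num_classes) layout.
logits = tf.constant([[2.0, 1.0, 0.1],
                      [0.5, 2.5, 0.3]])
labels = tf.constant([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])

# Option 1: apply softmax yourself, then pass probabilities.
probs = tf.nn.softmax(logits)
loss1 = tf.keras.losses.categorical_crossentropy(labels, probs)

# Option 2 (preferred): pass raw logits and let the loss
# apply softmax internally.
loss2 = tf.keras.losses.categorical_crossentropy(
    labels, logits, from_logits=True)

# The per-example losses match up to floating-point tolerance.
diff = tf.reduce_max(tf.abs(loss1 - loss2))
```

If you forget from_logits=True while passing raw logits (or apply softmax and then also set from_logits=True), the computed cost will be wrong, which is a common reason for getting a value like 0.88 instead of the expected 0.81.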