This function computes the sum of the costs, not the average. The other thing to check is that you used the from_logits parameter correctly:
we are passing the “logits” here and not the softmax output values, right? Here’s a thread which talks about that and why it is done that way.
Here’s a thread which talks about why it’s the sum, not the average.
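
To make both points concrete, here is a minimal pure-Python sketch (not the assignment code). The helper names and the toy `labels` and `logits` values are made up for illustration. It mimics what a cross-entropy call with `from_logits=True` does, namely applying the softmax internally, and shows how the summed cost differs from the averaged one.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one example's logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_logits(one_hot, logits):
    # What from_logits=True amounts to: softmax is folded into the loss
    # via the log-sum-exp trick, so the inputs must be raw logits.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(y * (z - log_sum_exp) for y, z in zip(one_hot, logits))

def cross_entropy_from_probs(one_hot, probs):
    # What from_logits=False amounts to: inputs are already probabilities.
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y > 0)

# Hypothetical toy batch: 2 examples, 3 classes.
labels = [[1, 0, 0], [0, 1, 0]]
logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]

per_example = [cross_entropy_from_logits(y, z) for y, z in zip(labels, logits)]
total_loss = sum(per_example)              # the sum, as in this function
mean_loss = total_loss / len(per_example)  # the average, for comparison

# Both routes agree when each function gets the input it expects.
via_probs = [cross_entropy_from_probs(y, softmax(z)) for y, z in zip(labels, logits)]

# The mistake to avoid: feeding softmax outputs where logits are expected,
# which applies softmax twice and changes the loss.
double_softmax = [cross_entropy_from_logits(y, softmax(z)) for y, z in zip(labels, logits)]

print(total_loss, mean_loss)
print(per_example, via_probs, double_softmax)
```

If your value is off by a factor equal to the number of examples, you are probably averaging instead of summing. If it is off by some other amount, check whether softmax is being applied twice (or not at all).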