Your total loss value looks correct, but you then averaged it across the samples. Here’s a thread which explains why that is not what is intended here.
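To make the difference concrete, here is a minimal sketch in plain Python. The per-sample loss values and the helper names `total_loss` and `mean_loss` are made up for illustration and are not taken from your code or the assignment. It only shows that the two quantities differ by a factor of the sample count.

```python
# Hypothetical per-sample losses; the values are made up for illustration.
per_sample_losses = [0.25, 0.5, 1.25]


def total_loss(losses):
    """Sum of the per-sample losses, with no division by the sample count."""
    return sum(losses)


def mean_loss(losses):
    """Average of the per-sample losses: the total divided by the sample count."""
    return sum(losses) / len(losses)


total = total_loss(per_sample_losses)
mean = mean_loss(per_sample_losses)

# The two values differ by a factor of len(per_sample_losses).
print("total:", total)
print("mean:", mean)
```

If your code divides by the number of samples after summing, dropping that division should give you the total again.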