You’re right that the cost should never be negative, so there must be something fundamentally wrong. If all your previous functions pass their tests, the bug is most likely in train_step.
There are ways to get negative costs due to integer rounding errors. Here’s a thread that shows examples of those issues, although I think the current version of TF no longer suffers from them.
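To give a feel for how integer arithmetic can flip a sign, here is a minimal sketch in plain NumPy rather than TF. It shows 32-bit wraparound, which is one possible mechanism and not necessarily the one described in that thread. The dimensions `n_H` and `n_W` are made-up values, chosen only so that the product no longer fits in an int32.

```python
import numpy as np

# Hypothetical layer dimensions (assumption, not taken from the assignment).
n_H, n_W = 400, 400

# 160000 fits comfortably in a 32-bit integer.
pixels = np.array([n_H * n_W], dtype=np.int32)

# int32 * int32 stays int32, so the true result (2.56e10) wraps around silently.
wrapped = pixels * pixels

# Casting to float before multiplying avoids the wraparound.
safe = pixels.astype(np.float64) ** 2

print("int32 result is negative:", bool(wrapped[0] < 0))
print("float64 result is positive:", bool(safe[0] > 0))
```

If a normalizing factor like this ends up negative, a cost that should be a sum of squares comes out negative too, so it is worth checking the dtypes of any dimensions you multiply together.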