Here are the things to check:
- Make sure you did the transpose on the inputs.
- Make sure you use the `from_logits` option to tell the cost function that you are giving it logits and not activation values.
- Make sure you use `reduce_sum` and not `reduce_mean` to get the final scalar value.
- Make sure you use the loss function specified in the instructions.
- Make sure you specify the positional arguments to the loss function in the correct order.
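
If it helps to see why each of these checks matters, here is a rough pure-Python sketch of the same steps. It assumes the specified loss is a categorical cross-entropy and that the data arrives in a (classes, examples) layout; the helper names and the numbers are made up for illustration and are not the assignment's code. In TensorFlow the corresponding pieces would presumably be `tf.transpose`, the `from_logits=True` argument, and `tf.reduce_sum`.

```python
import math

def softmax(row):
    # Subtract the max for numerical stability before exponentiating.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(matrix):
    # Turn a (classes, examples) layout into (examples, classes).
    return [list(col) for col in zip(*matrix)]

def cross_entropy(y_true, y_pred, from_logits=False):
    # Per-example categorical cross-entropy; labels first, predictions second.
    losses = []
    for t_row, p_row in zip(y_true, y_pred):
        # With from_logits=True the softmax is applied here, inside the loss.
        probs = softmax(p_row) if from_logits else p_row
        losses.append(-sum(t * math.log(p) for t, p in zip(t_row, probs)))
    return losses

# Hypothetical data stored as (classes, examples): 3 classes, 2 examples.
logits = [[2.0, 0.5],
          [1.0, 2.5],
          [0.1, 0.3]]
labels = [[1.0, 0.0],
          [0.0, 1.0],
          [0.0, 0.0]]

# Transpose both inputs so that each row is one example.
per_example = cross_entropy(transpose(labels), transpose(logits), from_logits=True)

total_loss = sum(per_example)              # what a sum reduction returns
mean_loss = total_loss / len(per_example)  # what a mean reduction returns
```

With two examples, `total_loss` is exactly twice `mean_loss`, so using the wrong reduction gives a value that is off by the batch size. Skipping the transpose makes the loss iterate over three class rows instead of two examples, and swapping the two arguments takes the log of the labels rather than of the predictions.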