Here are the things to check:
- Make sure you transpose the inputs.
- Make sure you use the `from_logits` option to tell the cost function that you are giving it logits and not activation values.
- Make sure you use `reduce_sum` and not `reduce_mean` to get the final scalar value.
- Make sure you use the loss function specified in the instructions.
- Make sure you specify the positional arguments to the loss function in the correct order.
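
As a rough illustration, the checklist above could translate into something like the sketch below. It rests on several assumptions that are not in your instructions: `tf.keras.losses.categorical_crossentropy` stands in for whatever loss function the assignment actually specifies, the inputs are assumed to have shape (num_classes, num_examples), and the function name `total_loss_sketch` and the sample values are made up for the example.

```python
import tensorflow as tf


def total_loss_sketch(logits, labels):
    """Hypothetical helper: summed cross-entropy over all examples.

    Assumes both inputs have shape (num_classes, num_examples).
    """
    # Transpose so each row is one example: (num_examples, num_classes).
    y_true = tf.transpose(labels)
    y_pred = tf.transpose(logits)

    # Keras losses take (y_true, y_pred) in that order.
    # from_logits=True says y_pred holds raw logits, not softmax outputs.
    per_example = tf.keras.losses.categorical_crossentropy(
        y_true, y_pred, from_logits=True
    )

    # Sum (not mean) over examples to get a single scalar.
    return tf.reduce_sum(per_example)


# Made-up data: 3 classes, 2 examples, one-hot labels.
logits = tf.constant([[2.0, 0.5], [1.0, 1.5], [0.1, 3.0]])
labels = tf.constant([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
total_loss = total_loss_sketch(logits, labels)
print(float(total_loss))
```

If your result is off by a factor equal to the number of examples, that usually points to `reduce_mean`. If swapping the two arguments changes the value, the positional order was wrong.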