I answered on your other thread about this. Once you have fixed the order of the arguments, there are two remaining problems:
You do actually need those transposes.
You also need to tell the cost function that the “logits” input is raw linear outputs and not softmax outputs. That is done by using the from_logits argument to the cost function. The default value of that optional parameter is False, but that is not the appropriate value in this case.
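To make both points concrete, here is a minimal sketch. It assumes the cost function is tf.keras.losses.categorical_crossentropy and that your tensors are laid out as (num_classes, num_examples), which is why the transposes are needed. The toy values and variable names are made up for illustration and are not taken from your code.

```python
import tensorflow as tf

# Hypothetical toy data, laid out as (num_classes, num_examples).
logits = tf.constant([[2.0, 0.5],
                      [1.0, 2.5],
                      [0.1, 0.3]])   # raw linear outputs, no softmax applied
labels = tf.constant([[1.0, 0.0],
                      [0.0, 1.0],
                      [0.0, 0.0]])   # one-hot labels

# The loss expects (num_examples, num_classes), hence the transposes.
# Argument order is (y_true, y_pred). from_logits=True tells the loss
# to apply softmax internally instead of treating y_pred as probabilities.
per_example = tf.keras.losses.categorical_crossentropy(
    tf.transpose(labels),
    tf.transpose(logits),
    from_logits=True,
)

print(per_example.shape)  # one loss value per example
```

How you reduce per_example to a single cost (sum or mean) depends on what the exercise asks for, so I have left that step out.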