Can I just check why there is a need to use the transpose? And can I also clarify: from_logits is set to True in the loss function so that softmax is applied to the logits input, right?
Does this help?
It helps with the understanding, but I still don't really get the transpose part.
If you look at how the forward propagation logic is set up, the format of the data has the “samples” dimension as the second dimension. That’s the way we formatted our data in Course 1 as well and up to this point in Course 2. But it turns out that if you read the TensorFlow documentation for the categorical loss function, you’ll see that it expects the “samples” dimension to be the first dimension. That is why we need the transpose.
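To make both points concrete, here is a minimal sketch (the class count, sample count, and logit values are made up for illustration). It keeps the data in the course's (classes, samples) layout, transposes to the (samples, classes) layout TensorFlow's loss expects, and passes from_logits=True so the loss applies softmax internally:

```python
import tensorflow as tf

# Course-style layout: shape (n_classes, m_samples) -- samples are the SECOND dimension.
# 3 classes, 2 samples (hypothetical values).
logits = tf.constant([[2.0, 0.1],
                      [0.5, 3.0],
                      [0.1, 0.2]])
# One-hot labels in the same (3, 2) layout: sample 1 is class 0, sample 2 is class 1.
labels = tf.one_hot([0, 1], depth=3, axis=0)

# TensorFlow's categorical loss expects samples as the FIRST dimension,
# so transpose both tensors to shape (2, 3) before calling it.
# from_logits=True means the loss applies softmax to the raw logits itself,
# so the model's output layer should be linear (no softmax activation).
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
loss = loss_fn(tf.transpose(labels), tf.transpose(logits))
print(float(loss))
```

Equivalently, you could apply tf.nn.softmax to the logits yourself and call the loss with from_logits=False; letting the loss handle it is preferred because it is more numerically stable.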
I see, so it's because we are using TensorFlow that we need the transpose to meet its requirements. Am I right to say so?
Your understanding is correct.
Please also look at the documentation in the link provided here by @ai_curious.