Deep learning specialization, Course 2 week 3, coding exercise compute_total_loss

so for the coding exercise where we have to compute total_loss with reduce_sum and categorical_crossentropy, why is that we have to transpose the labels and logits? I just wanted some insight.

It’s because of how that function works. No other reason, it’s an implementation detail.

1 Like

Here’s a thread which discusses that.