That's because the output of forward propagation has dimensions features x samples, while the TF loss function requires samples as the first dimension. The instructions do mention that you need to be aware of this. It's also never a bad idea to read the documentation for the TF functions they advise you to use. This is the intro to TF, so there is a lot to learn.
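To make the orientation point concrete, here is a minimal sketch rather than the assignment's solution. The toy values, the variable names, and the choice of `tf.keras.losses.categorical_crossentropy` with `from_logits=True` are my assumptions for illustration; check the assignment instructions for the function and arguments it actually asks for.

```python
import tensorflow as tf

# Hypothetical toy data laid out the way forward propagation returns it:
# rows are classes (features), columns are samples -> shape (3, 4).
logits = tf.constant([[2.0, 0.5, 0.1, 1.0],
                      [0.3, 1.5, 0.2, 0.4],
                      [0.1, 0.2, 3.0, 0.6]])
labels = tf.constant([[1.0, 0.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 0.0]])

# The loss treats the last axis as the class axis, so transpose both
# tensors to (samples, classes) = (4, 3) before calling it.
# from_logits=True is an assumption: it says the inputs are raw scores,
# not softmax probabilities.
per_sample_loss = tf.keras.losses.categorical_crossentropy(
    tf.transpose(labels), tf.transpose(logits), from_logits=True)

# One loss value per sample, then summed over the batch.
total_loss = tf.reduce_sum(per_sample_loss)
print(per_sample_loss.shape, total_loss.shape)
```

If you skip the transposes here, the call still runs, but it treats each of the 3 rows as a "sample" with 4 "classes", so you get 3 loss values instead of 4 and a wrong total with no error message.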
Here’s an earlier thread that discusses this same point.