Hi, in C5, W1, A3

The instruction in ex. 2 asks you to find the indices with the maximum probabilities with **tf.math.argmax** and convert them to one-hot vectors with **tf.one_hot**.

But ex. 3 asks you to find the indices with the maximum probabilities with **np.argmax** and then use **to_categorical** to convert them to one-hot vectors.
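For concreteness, here is a minimal sketch of the two routes side by side, on a made-up batch of probabilities (the values and shapes are illustrative, not taken from the assignment). Both produce the same one-hot matrix:

```python
import numpy as np
import tensorflow as tf

# A toy batch of "probability" rows: 2 examples, 3 classes (made-up values).
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1]], dtype=np.float32)

# TF route (as in ex. 2): tf.math.argmax + tf.one_hot.
tf_idx = tf.math.argmax(probs, axis=-1)           # indices [1, 0]
tf_onehot = tf.one_hot(tf_idx, depth=3)

# NumPy/Keras route (as in ex. 3): np.argmax + to_categorical.
np_idx = np.argmax(probs, axis=-1)                # indices [1, 0]
np_onehot = tf.keras.utils.to_categorical(np_idx, num_classes=3)

print(np.allclose(tf_onehot.numpy(), np_onehot))  # True
```

Numerically they agree; the difference only matters for what TF can differentiate through, as explained below.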

OK, I did it accordingly for the sake of the assignment. But in real life, which should we choose?

In general it doesn’t really matter, except when the functions involved are part of the compute graph for a tensor whose gradients you need. TF computes gradients for you automatically via automatic differentiation (“autodiff”), and only TF operations carry that logic: if you insert a numpy function into the compute graph, training will fail with an error message about gradients not being available.
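You can see the mechanism directly with `tf.GradientTape`. This is a minimal sketch (the variable and the squared-sum objective are made up for illustration): when the whole chain is TF ops the tape can trace back to the variable, but routing the value through numpy detaches it, and the gradient comes back as `None`:

```python
import numpy as np
import tensorflow as tf

x = tf.Variable([[1.0, 2.0], [3.0, 4.0]])

# All-TF path: every op is traced, so gradients flow back to x.
with tf.GradientTape() as tape:
    y = tf.reduce_sum(tf.transpose(x) ** 2)
grad = tape.gradient(y, x)
print(grad is not None)   # True (the gradient is 2*x)

# NumPy in the middle: .numpy()/np.transpose leave the compute graph,
# so the tape has no path from the result back to x.
with tf.GradientTape() as tape:
    y_np = np.transpose(x.numpy()) ** 2
    y = tf.reduce_sum(tf.constant(y_np))
grad = tape.gradient(y, x)
print(grad is None)       # True — no gradient available
```

In eager mode the broken path just yields `None`; inside a compiled training step you instead get the runtime error described above.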

To see an example of this in action, go back to DLS C2 W3, the TensorFlow Introduction assignment. In the `compute_total_loss` function, you need to transpose the logits and labels using `tf.transpose`. Try rewriting the code using `.T` or `np.transpose` and watch what happens: you’ll pass the test case for `compute_total_loss`, but it will throw the error I described above when you run the training in the later cell.
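From memory, the pattern in that function looks roughly like the sketch below (the logits/labels values here are made up, and the exact assignment code may differ in details). The point is that the transpose is done with `tf.transpose`, a TF op that stays inside the compute graph:

```python
import tensorflow as tf

def compute_total_loss(logits, labels):
    # Inputs arrive as (classes, batch), so both tensors are transposed
    # with tf.transpose before computing the cross-entropy loss.
    return tf.reduce_sum(
        tf.keras.losses.categorical_crossentropy(
            tf.transpose(labels), tf.transpose(logits), from_logits=True))

# Made-up example: 3 classes, 2 examples, shape (classes, batch).
logits = tf.constant([[2.0, -1.0], [0.5, 3.0], [-0.5, 0.2]])
labels = tf.constant([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
print(compute_total_loss(logits, labels).numpy() > 0)  # True — a positive scalar
```

Swapping `tf.transpose` for `np.transpose` still returns the right number here, which is exactly why the unit test passes and the failure only surfaces later, during training.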

I’m sure you can find ways to reproduce that behavior here in the C5 assignments. But from an experimental standpoint you can just try using the numpy functions and see what happens. If it “just works”, then you’re ok, meaning that the computations in question were not part of the compute graph for any gradients.