The line of text in the documentation cell, "tf.reduce_mean basically does the summation over the examples", misled me into thinking that I needed to divide the result of the reduce_mean() function by the number of examples. That was when I got an error caused by not setting from_logits=True. In the function name, "mean" really does refer to the mean, not a summation. Hope this helps others.
Thanks for pointing that out. Yes, the way they name those functions in TF is maybe not as intuitive as you might wish. But if you read the documentation, they do explain it. There are a number of reduce_* functions, e.g.:
reduce_mean
reduce_sum
reduce_min
reduce_max
Anyone who has questions about how and when to use those should have a look at the doc links I gave above.
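For a quick feel of how these differ, here is a minimal sketch on a made-up 2x3 tensor (the values and variable names are just for illustration, not from the assignment). With no `axis` argument each function reduces over all elements. With `axis=0` it reduces down the rows.

```python
import tensorflow as tf

# Toy tensor, invented for this example: 2 rows, 3 columns.
x = tf.constant([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

total = tf.reduce_sum(x)     # sum of all six elements
mean = tf.reduce_mean(x)     # sum divided by the number of elements
smallest = tf.reduce_min(x)  # minimum over all elements
largest = tf.reduce_max(x)   # maximum over all elements

# Reducing along axis 0 averages each column separately.
col_mean = tf.reduce_mean(x, axis=0)

print(float(total), float(mean), float(smallest), float(largest))
print(col_mean.numpy())
```

Note that reduce_mean already does the division by the element count, so there is no need to divide its result again.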
The other key points on compute_cost are the need for a transpose and the fact that the inputs are logits, not activation values, which requires setting the from_logits parameter accordingly, as you pointed out. Unfortunately (if my memory serves) they don’t really call that out in the instructions, but just leave it “implicit”.
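To make those two points concrete, here is a rough sketch, not the official solution. It assumes the logits and labels arrive with shape (num_classes, num_examples), which is my guess at why the transpose is needed, and it uses `tf.keras.losses.categorical_crossentropy`, which expects (num_examples, num_classes). The toy data is invented.

```python
import tensorflow as tf

def compute_cost(logits, labels):
    # Assumption: both inputs have shape (num_classes, num_examples).
    # The Keras loss expects (num_examples, num_classes), hence the transposes.
    per_example = tf.keras.losses.categorical_crossentropy(
        tf.transpose(labels),   # y_true, one-hot
        tf.transpose(logits),   # y_pred, raw logits (no softmax applied)
        from_logits=True,       # tells the loss to apply softmax internally
    )
    # reduce_mean averages the per-example losses; no extra division needed.
    return tf.reduce_mean(per_example)

# Toy data: 3 classes, 2 examples, stored column-per-example.
logits = tf.constant([[2.0, 0.0],
                      [0.0, 2.0],
                      [0.0, 0.0]])
labels = tf.constant([[1.0, 0.0],
                      [0.0, 1.0],
                      [0.0, 0.0]])

cost = compute_cost(logits, labels)
print(float(cost))
```

If from_logits is left at its default of False, the loss treats the raw logits as if they were already probabilities, so the resulting value is not the intended cross-entropy.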