They don’t use higher precision; it turns out that different algorithms have different properties with respect to how rounding errors propagate. Of course we are dealing with exponentials for the softmax or sigmoid and logarithms for the cross-entropy loss. One concrete example: with either sigmoid or softmax, the values can “saturate” and round to exactly 0.0 or 1.0, and then the cost comes out as Inf or NaN because log(0) is -infinity. When the two computations are done together, the implementation can catch that case, e.g. by rearranging the algebra or using a value just shy of exactly 0.0 or 1.0, so that the cost is an actual finite number.
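Here’s a quick NumPy sketch of what I mean. The numbers and the max(x, 0) - x*z + log1p(exp(-|x|)) rearrangement are just my illustration of the idea, not a claim about what TF’s source actually does:

```python
import numpy as np

# Hypothetical setup: large-magnitude logits that saturate the sigmoid.
logits = np.array([40.0, -40.0])
labels = np.array([1.0, 1.0])

# Naive two-step version: sigmoid(40.0) rounds to exactly 1.0 in float64,
# so log(1 - p) is log(0) = -inf, and the 0 * -inf term turns the cost into NaN.
p = 1.0 / (1.0 + np.exp(-logits))
naive = -(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p))

# Fused "with logits" style version, using the standard stable rearrangement
#   max(x, 0) - x*z + log(1 + exp(-|x|))
# which never takes the log of 0, so it stays finite for any logit magnitude.
stable = np.maximum(logits, 0.0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))

print(naive)   # first entry is nan (NumPy also emits divide/invalid warnings)
print(stable)  # roughly [0., 40.] -- finite for both examples
```

Same cost on paper, but only the fused form survives the saturated case.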
This is real math. Google “numerical analysis” and once you find a good site, read the section about “error propagation”. Or if you want a concrete example, do the experiment described earlier in this thread and you’ll see that the answers really do differ in the 7th decimal place.
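If you want to see that effect without the full experiment, here’s a rough float32 sketch of the same idea. This is my own toy setup, not the exact experiment from the thread: the identical cross entropy computed along two algebraically equivalent paths.

```python
import numpy as np

# Hypothetical data: a handful of random logits and one-hot labels in float32.
rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 10)).astype(np.float32)
labels = np.eye(10, dtype=np.float32)[rng.integers(0, 10, size=5)]

# Path A: compute the softmax probabilities first, then take the log.
shifted = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
path_a = -(labels * np.log(probs)).sum(axis=1)

# Path B: fused form -- go straight from logits to log-probabilities
# via log-sum-exp, never materializing the probabilities.
log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
path_b = -(labels * log_probs).sum(axis=1)

# Same math on paper, but the rounding errors propagate differently, so the
# two answers typically disagree around the 7th significant digit in float32.
print(path_a - path_b)
```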
Or you could look at the actual TF code to see what they do. I’ve never actually had the guts to do that, but it is Open Source, right?