Back propagation of last sigmoid layer

Yes, you can run into problems if the sigmoid values round to exactly 0 or exactly 1. Mathematically, the sigmoid never equals 0 or 1 exactly, but in floating point you can run out of resolution: once |z| is large enough, the result can no longer be distinguished from 0 or 1 and gets rounded to it.
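To make the rounding concrete, here is a minimal sketch in plain Python (64-bit floats). The `sigmoid` helper and the input values are my own choices for illustration, not something from your code.

```python
import math

def sigmoid(z):
    # Naive logistic function; adequate for the moderate inputs used here.
    return 1.0 / (1.0 + math.exp(-z))

a_moderate = sigmoid(10.0)   # still strictly below 1
a_saturated = sigmoid(40.0)  # exp(-40) is too small to change 1.0 when added to it

print(a_moderate < 1.0)
print(a_saturated == 1.0)
```

Once the activation is exactly 1.0, any later step that computes `log(1 - a)` or divides by `1 - a` is working with an exact zero, which is where the trouble starts.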

Here’s a thread that discusses the strategy you also mentioned: perturbing the values slightly so that you avoid the exactly-0 and exactly-1 cases.
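One way that perturbation could look is sketched below. This assumes the binary cross-entropy cost is where the exact 0 or 1 causes the failure; the `clamp` helper and the margin `EPS` are hypothetical names and values chosen for illustration, not taken from the thread.

```python
import math

EPS = 1e-12  # hypothetical margin; pick a value suited to your float precision

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def clamp(a, eps=EPS):
    # Keep the activation strictly inside (0, 1).
    return min(max(a, eps), 1.0 - eps)

def cross_entropy(a, y):
    # Binary cross-entropy for a single example; requires 0 < a < 1.
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))

a = clamp(sigmoid(40.0))    # the raw sigmoid value here is exactly 1.0
loss = cross_entropy(a, 0)  # large but finite, because 1 - a is no longer 0

print(a < 1.0, math.isfinite(loss))
```

With NumPy arrays the same clamp can be written as `np.clip(A, eps, 1 - eps)`. Note that clamping changes the cost very slightly, so the margin should be small relative to the values you care about.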
