Week 3 - Assignment - compute_total_loss - try to set from_logits=False

It’s an interesting point and a good experiment to run! We are operating in floating point here, so there are at most 2^{32} or 2^{64} distinct numbers we can represent between -\infty and +\infty, depending on whether we use 32 bit or 64 bit floats. That’s pretty pathetic compared to the abstract beauty of \mathbb{R}. When we operate in a finite space like that, we have to deal with the issue of “numerical stability”: there can be different ways to express the same computation that are mathematically equivalent, but behave differently with respect to how rounding errors propagate in any finite representation like floating point. The reason the from_logits = True mode is used is that it is more numerically stable, meaning it gives results closer to the exact answers we would get if we could compute in \mathbb{R}. It’s also less code to write, so that’s the way Prof Ng will always do it when we’re using TF loss functions: the output layer omits the activation, and the loss function computes both the activation (sigmoid or softmax) and the cross entropy loss as one fused computation.
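To make that pattern concrete, here is a minimal sketch in Keras (hypothetical layer sizes, not the assignment’s exact code): the output Dense layer has no activation, and the loss object is told it will receive raw logits.

```python
import tensorflow as tf

# Sketch of the "fused" pattern: no softmax on the output layer,
# and the loss computes softmax + cross entropy internally.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation="relu", input_shape=(12288,)),
    tf.keras.layers.Dense(6),   # linear output layer: no softmax here
])

model.compile(
    optimizer="adam",
    # from_logits=True: activation and cross entropy are computed
    # together inside the loss, which is the numerically stable route.
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
)
```

The mathematically equivalent alternative would be to put a softmax activation on the last layer and use CategoricalCrossentropy(from_logits=False), which is exactly the less stable variant being discussed here.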

BTW numerical stability may sound like a bunch of hand-waving, but it’s actually not. In the subfield of math called Numerical Analysis, there is a way to reason precisely about the error propagation properties of different computations.
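Here is a toy illustration (binary/sigmoid case, made-up numbers, nothing to do with the assignment’s tensors) of what “more numerically stable” means in practice: the naive compute-the-activation-first route blows up, while the standard fused rewrite of the same formula, which is what a from_logits=True style computation does internally, gives the right answer.

```python
import numpy as np

y = np.float32(1.0)      # true label (hypothetical)
z = np.float32(-100.0)   # a large-magnitude logit (hypothetical)

# Naive route: compute sigmoid(z) first, then the cross entropy.
# exp(100) overflows float32, so the sigmoid rounds all the way to 0.0
# and the log turns that into inf. (numpy will print overflow warnings.)
a = np.float32(1.0) / (np.float32(1.0) + np.exp(-z))
naive_loss = -y * np.log(a)

# Fused route, the standard stable rewrite of the same formula:
#   -y*log(sigmoid(z)) - (1-y)*log(1-sigmoid(z))
#     = max(z, 0) - y*z + log(1 + exp(-|z|))
stable_loss = max(z, np.float32(0.0)) - y * z + np.log1p(np.exp(-abs(z)))

print(naive_loss)    # inf   -- the rounding error destroyed the answer
print(stable_loss)   # 100.0 -- very close to the true value log(1 + e^100)
```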

They only show the expected value to 6 decimal places and your answer rounds to the same value, but notice that they use 10^{-7} as the error threshold in the test. Try it again with the from_logits = True mode: it must be the case that the two answers differ in the 7th decimal place. You can print your loss value at a higher resolution than the default 6 decimal places to confirm this theory:

print("total_loss = {:0.10f}".format(total_loss))
