Course 2, Week 3, compute_total_loss, failed with really close result

Wenjie_Zheng · October 21, 2022, 4:00pm

My result is tf.Tensor(0.81028694, shape=(), dtype=float32)
The expected one is tf.Tensor(0.810287, shape=(), dtype=float32)
They are really close but different enough to not pass the test.

I used tf.transpose to transpose both tensors.
I used tf.nn.softmax to compute y_pred.
I used tf.keras.losses.categorical_crossentropy and tf.reduce_sum to compute the cost function.

paulinpaloalto · October 21, 2022, 4:24pm

Instead of applying softmax yourself, try passing from_logits=True so that the cost function applies it internally. That is more efficient, and it is the reason forward propagation was defined without the softmax step in the first place.
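To make the two routes concrete, here is a minimal sketch on made-up data. The `labels` and `logits` values are hypothetical and are not the assignment's test inputs; both tensors are laid out as (examples, classes), which is the layout the Keras loss works on by default.

```python
import tensorflow as tf

# Hypothetical data: 2 examples, 3 classes, shape (examples, classes).
labels = tf.constant([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])
logits = tf.constant([[2.0, 1.0, 0.1],
                      [0.5, 2.5, 0.3]])

# Manual route: apply softmax first, then pass probabilities to the loss.
probs = tf.nn.softmax(logits)
loss_manual = tf.reduce_sum(
    tf.keras.losses.categorical_crossentropy(labels, probs))

# Suggested route: pass the raw logits and let the loss handle softmax.
loss_from_logits = tf.reduce_sum(
    tf.keras.losses.categorical_crossentropy(labels, logits, from_logits=True))

# The two values should agree to several digits but need not match bit for bit.
print(loss_manual.numpy(), loss_from_logits.numpy())
```

On data like this the two results should agree closely, yet the last printed digit can differ, which is the kind of mismatch reported at the top of this thread.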

Wenjie_Zheng · October 26, 2022, 8:45am

This answer can be improved by explaining the numeric difference between the from_logits keyword and tf.nn.softmax.

paulinpaloalto · October 26, 2022, 4:27pm

There should be no difference in principle, if we were doing pure math here. But the problem is we are living in the pathetically limited world of finite-precision floating point. A 64 bit float can represent at most 2^{64} distinct values (and the float32 tensors shown above at most 2^{32}), as opposed to the abstract beauty of \mathbb{R}, in which you have uncountably many values, whether on the whole number line or just between 0.0001 and 0.0002. As a result, different ways of implementing a computation, or even doing it in a slightly different order, can give different rounding behavior. The grader test case must have done an exact equality comparison, or used too small an error threshold if it used numpy allclose.
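A small pure-Python illustration of both points, order-dependent rounding and exact versus tolerant comparison. The tolerances are arbitrary choices for the example, not what the grader uses.

```python
import math

# Same three numbers, added in a different order.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)

print(a == b)                            # False: the roundings differ
print(math.isclose(a, b, rel_tol=1e-9))  # True: equal within a tolerance

# The two values reported at the top of this thread.
mine, expected = 0.81028694, 0.810287
print(mine == expected)                            # False
print(math.isclose(mine, expected, rel_tol=1e-6))  # True
```

The two losses in this thread differ by about 6e-8, which is roughly one unit in the last place for a float32 near 0.81.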

The high-level way to state this: the from_logits=True method is preferred because it is more numerically stable, meaning that it behaves better with respect to the propagation of rounding errors.
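To show what "more numerically stable" can mean in practice, here is a sketch of the standard log-sum-exp trick in plain Python. I am not claiming this is how TensorFlow implements from_logits=True internally; the function names are hypothetical, and the point is only that working from the logits directly allows a formulation that softmax-then-log does not.

```python
import math

def naive_cross_entropy(logits, true_index):
    # Softmax first, then the negative log of the true-class probability.
    exps = [math.exp(z) for z in logits]
    return -math.log(exps[true_index] / sum(exps))

def stable_cross_entropy(logits, true_index):
    # Log-sum-exp trick: subtract the max logit before exponentiating,
    # so no exponential can overflow.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[true_index]

# Moderate logits: both formulations agree closely.
small = [2.0, 1.0, 0.1]
print(naive_cross_entropy(small, 0), stable_cross_entropy(small, 0))

# Extreme logits: the naive version overflows, the stable one does not.
large = [1000.0, 0.0, -1000.0]
try:
    naive_cross_entropy(large, 0)
except OverflowError as err:
    print("naive version failed:", err)
print(stable_cross_entropy(large, 0))
```

The extreme logits are deliberately exaggerated to make the failure visible. With ordinary values the two versions differ only in the last digits, as in the original question.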

Sazia_Afreen · January 29, 2024, 4:08pm

I am getting
Test 1: tf.Tensor(0.17102128, shape=(), dtype=float32)
instead of the expected output
Test 1: tf.Tensor(0.810287, shape=(), dtype=float32)

paulinpaloalto · February 1, 2024, 7:26pm

Here is a checklist of the most common problems on that function. I think the incorrect value you show is the result of the first mistake on that list.
