Test loss in W4 L2/L3

Hi there.

I'm a bit confused about the test losses in these two labs.

In both labs, test_loss is created as a tf.keras.metrics.Mean object within the strategy scope, and its internal state is updated within the test_step function.
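For reference, tf.keras.metrics.Mean basically just keeps a running total and a count, with result() returning total / count. A minimal pure-Python stand-in of that bookkeeping (not the real TF class, just the arithmetic):

```python
class MeanMetric:
    """Pure-Python stand-in for tf.keras.metrics.Mean's bookkeeping."""
    def __init__(self):
        self.total = 0.0   # running sum of values passed to update_state
        self.count = 0     # number of update_state calls (unweighted case)

    def update_state(self, value):
        self.total += value
        self.count += 1

    def result(self):
        return self.total / self.count

test_loss = MeanMetric()
for batch_loss in [0.9, 0.7, 0.5]:   # one update per test_step call
    test_loss.update_state(batch_loss)
print(round(test_loss.result(), 6))  # plain average of the values seen
```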

The confusion arises when printing the resulting loss in the training loop: in lab 2 (and in the accompanying video) it is simply the result of test_loss, whereas in lab 3 it is the result of test_loss divided by the number of replicas in sync.

I am struggling to see why this extra division is done in lab 3. Surely each call to update_state increments the count by 1, so the division shouldn't be needed? On the other hand, without it, the test loss comes out significantly higher than one would expect.

Is it the case that within the Mean object, for each update to count, there are num_replicas_in_sync additions to the total? This would explain the extra division factor required in lab 3, but doesn’t explain the lack of division in lab 2.
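I don't know the internals well enough to say, but the two readings can be checked numerically with a toy stand-in (plain Python, not TF; 2 pretend replicas, losses chosen to be exact in binary floating point):

```python
num_replicas = 2
# per-replica losses for one test step (arbitrary toy numbers)
replica_losses = [0.75, 0.25]

# Reading 1: each replica's update_state adds its value once and bumps
# count by 1 -> result() is already the mean across replicas.
total = sum(replica_losses)
count = len(replica_losses)
print(total / count)                          # mean; no extra division needed

# Reading 2 (the hypothesis): every bump of count comes with
# num_replicas_in_sync additions to total -> result() is too high by
# exactly that factor, and dividing by num_replicas recovers the mean.
total_hyp = sum(replica_losses)               # all replicas' values...
count_hyp = 1                                 # ...against a single count
print(total_hyp / count_hyp)                  # num_replicas x too high
print(total_hyp / count_hyp / num_replicas)   # back to the mean
```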

Any help understanding the reasoning behind the extra division would be appreciated.

Thanks in advance!

In Lab 2, the global batch size is set before compute_loss and is batch_size times the number of replicas (see the Prepare data section), but in Lab 3 the division by the global batch size is done inside the compute_loss function, at the return statement.
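If that's the difference, the factor of num_replicas falls out of where the division happens. A toy check (plain Python, 2 pretend replicas, made-up per-example losses) of the two placements:

```python
num_replicas = 2
batch_size = 4                       # per-replica batch
GLOBAL_BATCH_SIZE = batch_size * num_replicas

# made-up per-example losses, one list per replica
per_replica = [[1.0, 2.0, 3.0, 2.0], [2.0, 1.0, 1.0, 4.0]]
true_mean = sum(sum(r) for r in per_replica) / GLOBAL_BATCH_SIZE

# Placement 1: divide by GLOBAL_BATCH_SIZE inside each replica's loss
# -> the per-replica values already sum to the true global mean.
global_div_vals = [sum(r) / GLOBAL_BATCH_SIZE for r in per_replica]
assert sum(global_div_vals) == true_mean

# Placement 2: each replica averages over its own local batch only
# -> the combined value is num_replicas x too large, and an extra
# division by num_replicas is needed to recover the true mean.
local_div_vals = [sum(r) / batch_size for r in per_replica]
assert sum(local_div_vals) / num_replicas == true_mean
```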