Hi Paulin,
Thank you very much for this comment! I am getting the apparently correct output tf.Tensor(0.4051435, shape=(), dtype=float32). However, my notebook still tells me the expected output is tf.Tensor(0.810287, shape=(), dtype=float32).
How can I get the new version of the notebook?
Thank you very much in advance!
Janine
Hi, Janine.
If the “expected value” is 0.81xxx, then that is the new version of the notebook. They recently (early September 2022) changed the definition of the compute_cost function in this assignment to be the sum of the costs across all the samples, rather than the mean. The reason they did that is to make it consistent with how the compute_cost function worked in the minibatch logic of the Optimization Assignment in C2 W2. When we are doing Minibatch Gradient Descent, it works better to keep a running sum of the costs across all the samples in all the minibatches and then compute the average cost only when we finish each epoch (a full pass through all the minibatches). The reason is that the minibatches will not all be the same size if the batch size does not evenly divide the total training set size, right? We saw an example of that in the minibatch assignment I mentioned above, so an average of the per-minibatch averages would not give the true mean in that case. To get the detailed picture, please have a look at how that logic worked in the previous exercise.
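To make that concrete, here is a minimal, self-contained sketch of that pattern with toy random data (none of these variable names come from the assignment): keep a running sum of the per-sample losses over minibatches of unequal size, then divide by the total sample count once at the end of the epoch.

```python
import tensorflow as tf

# Toy setup: 10 samples, 4 classes, batch size 3, so the last
# minibatch has only 1 sample (3 + 3 + 3 + 1 = 10).
num_samples, num_classes, batch_size = 10, 4, 3
labels = tf.one_hot(
    tf.random.uniform((num_samples,), 0, num_classes, dtype=tf.int32),
    num_classes)
logits = tf.random.normal((num_samples, num_classes))

dataset = tf.data.Dataset.from_tensor_slices((logits, labels)).batch(batch_size)

total_loss = 0.0
for batch_logits, batch_labels in dataset:   # batches are NOT all the same size
    per_sample = tf.keras.losses.categorical_crossentropy(
        batch_labels, batch_logits, from_logits=True)
    total_loss += tf.reduce_sum(per_sample)  # running SUM across all samples
epoch_cost = total_loss / num_samples        # average exactly once, per epoch
print(epoch_cost)
```

Averaging the four per-batch means instead would weight the final 1-sample batch as heavily as each 3-sample batch, which is exactly the skew described above.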
Please check that you used reduce_sum and not reduce_mean in your compute_cost logic.
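As a quick sanity check of the difference on toy numbers (not the assignment's data):

```python
import tensorflow as tf

per_sample = tf.constant([0.2, 0.4, 0.6])  # toy per-sample losses
print(tf.reduce_mean(per_sample))          # ≈ 0.4 -- the old (mean) behavior
print(tf.reduce_sum(per_sample))           # ≈ 1.2 -- the new (sum) behavior
```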
Also, just as a general note: any time you want a clean copy of the latest version of an assignment, there is a topic about how to get one on the DLS FAQ Thread (the very first topic there).
Regards,
Paul
Hi Paul,
Thank you very much for your fast reply and your detailed explanation!
You are right, I changed reduce_mean to reduce_sum and now everything is fine.
Best regards,
Janine
Hi Paulin,
I think I have the new version, since the expected value is the same as Janine’s. However, I am getting an output of 0.88x. Both logits and labels have the shape (6, 2), and I am sure the order of the arguments to tf.keras.losses.categorical_crossentropy() is correct (the other order gives NaN).
My code is as follows:
{moderator edit - solution code removed}
What could I possibly miss?
Foo
You are missing the fact that the inputs are logits and not activation outputs. Search the forums for from_logits for more info. I think the shape is also not correct: the loss function expects the samples dimension to be first, right? So transposes are required in order to achieve that.
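As a generic illustration of both points, with toy data shaped (num_classes, num_samples) = (6, 2) to match the shapes in your post (this is plain tf.keras.losses.categorical_crossentropy usage, not the assignment’s code):

```python
import tensorflow as tf

# Toy data laid out as (num_classes, num_samples) = (6, 2)
labels = tf.constant([[1., 0.],
                      [0., 1.],
                      [0., 0.],
                      [0., 0.],
                      [0., 0.],
                      [0., 0.]])
logits = tf.random.normal((6, 2))

# Transpose both so the samples dimension comes first (shape (2, 6)),
# and pass from_logits=True because these are raw linear outputs,
# not softmax activations.
per_sample = tf.keras.losses.categorical_crossentropy(
    tf.transpose(labels), tf.transpose(logits), from_logits=True)
print(per_sample.shape)  # (2,) -- one loss value per sample
```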
Thank you so much, Paul! It is not straightforward from the docs which dimension comes first in the input arguments. It never occurred to me to transpose the data.
Best,
Foo
AssertionError                            Traceback (most recent call last)
<ipython-input> in <module>
     25 print("\033[92mAll test passed")
     26
---> 27 compute_total_loss_test(compute_total_loss, new_y_train)

<ipython-input> in compute_total_loss_test(target, Y)
     13     print("Test 1: ", result)
     14     assert(type(result) == EagerTensor), "Use the TensorFlow API"
---> 15     assert (np.abs(result - (0.50722074 + 1.1133534) / 2.0) < 1e-7), "Test 1 does not match. Did you get the reduce sum of your loss functions?"
     16
     17     ### Test 2

AssertionError: Test 1 does not match. Did you get the reduce sum of your loss functions?
This is your third post on this subject that I’ve seen so far. In this one you don’t actually show the result you get, but in one of the others it looks like you got the correct answer and then divided by 2 to get the average. But the point is that the goal here is not to compute the average: it is the sum across each minibatch. Here’s a thread which explains why that is the case.