This third week's notebook is a real mess.
There is no way to get compute_cost to equal the given answer.
All dimensions are correct.
Specifying from_logits = False gives cost = 0.88275003.
Specifying from_logits = True gives cost = 0.17102128.
The given answer is 0.810287 (how?).
Can anyone please suggest how to obtain the correct result for this test?
Otherwise, I doubt whether the given answer is correct.
Thanks
Hi,
Instead of passing logits and labels directly to cross_entropy, pass the values returned by tf.transpose(logits) and tf.transpose(labels) respectively. That worked for me, and it has been mentioned by @muhammadahmad above.
Apart from @Ethan1312's suggestion, I also had to pass the argument from_logits = True.
They really should mention that in the notebook; I spent almost an hour trying to figure out what was wrong with my code.
I'm using this code:
total_loss = tf.reduce_sum(tf.keras.losses.categorical_crossentropy(y_true=tf.transpose(labels), y_pred=tf.transpose(logits), from_logits=True))
but it is showing me this error:
ValueError: Shapes (2, 4) and (2, 6) are incompatible
and it makes sense, since the new_y_train we calculated is (4, 2) and the pred matrix is (6, 2).
Is something up with the test case? I'm not sure.
Hello @Anubhav_Anand2,
Are you talking about the compute_total_loss_test?

new_y_train is a Dataset whose elements have a shape of (6,). Note that new_y_train is NOT a tensor, and it does not carry a shape of (4, 2). Since we apply .batch(2) on it, each minibatch will be a tensor of shape (2, 6), and after transposing, it matches the shape of pred, which is (6, 2).
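For illustration, here is a minimal sketch of those shapes (the six-class elements and the batch size of 2 come from this assignment; the identity-matrix data is a hypothetical stand-in):

import tensorflow as tf

# Hypothetical stand-in for new_y_train: a Dataset of six (6,)-shaped one-hot vectors.
new_y_train = tf.data.Dataset.from_tensor_slices(tf.eye(6))

for minibatch in new_y_train.batch(2):
    print(minibatch.shape)                 # (2, 6) -- a batch of 2 one-hot labels
    print(tf.transpose(minibatch).shape)   # (6, 2) -- matches the shape of pred
    break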
You might want to check what went wrong in your code.
Cheers,
Raymond
In my assignment, the size of new_y_train is (4, 2).
So here we are calculating logits of shape (6, 2) while we have labels of shape (4, 2).
I have also tried appending ([0, 0], [0, 0]) to the labels to make them a (6, 2) matrix, which makes the code run, but the calculated loss does not match the test case.
Hello @Anubhav_Anand2,
Since we have different new_y_train values, I suggest you check where you have assigned anything to new_y_train:
- On the notebook, press “Ctrl + F”
- Type new_y_train in the search box.
- Go through all the search results.
This is the only assignment of new_y_train in my notebook:

y_train is a Dataset object, and y_train.map(...) will also return a Dataset object. If you print new_y_train, this is what it is going to say:
Now, how many assignments of new_y_train are there in your notebook? Only one, just like mine? If it is just one, is it the same assignment as mine? If you have more than one assignment, then you have changed new_y_train in a way it is not supposed to be changed. Can you revert all those changes?
Cheers,
Raymond
Yes, in my notebook I'm assigning it once only… I think the problem is with the one_hot_matrix method, as I was asked to make a method that carries out one-hot encoding for the classes only.
As you can see, it is passing all test cases for shape (4,)…
But later on, the compute_cost method asks for shape (6, 2).
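For what it's worth, here is a sketch of what such a helper might look like, using the real tf.one_hot and tf.reshape APIs (the exact solution in the notebook may differ):

import tensorflow as tf

def one_hot_matrix(label, depth=6):
    # A common bug here is hard-coding the depth (e.g. the 4 used by the
    # unit test) instead of using the depth argument; that makes every
    # element of new_y_train come out as a (4,) vector instead of (6,).
    return tf.reshape(tf.one_hot(label, depth, axis=0), shape=[depth])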
Transpose logits and labels.
This is the more detailed explanation.
Using tf.transpose() works in this case!
Hello!
I have a problem with the computation of the cost in the # GRADED FUNCTION: compute_cost function of the last assignment of Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization.
I was sure I had made no mistake, but the expected cost is 0.810287 and my function does not pass the tests. Can you please verify that there are no mistakes in the verification part? Also, the explanation according to which we have to implement the function is confusing. Can I have a discussion with the instructor to clarify this? Thank you!
ValueError: Shapes (6, 2) and (1, 6, 2) are incompatible
My code is:
# YOUR CODE STARTS HERE
total_loss = tf.reduce_sum(tf.keras.losses.categorical_crossentropy(logits, labels, from_logits=True))
return total_loss
Thanks for opening the topic. I stumbled across the funny fact that the linked TensorFlow documentation for categorical_crossentropy claims there is an 'axis' parameter. So in theory, providing axis=0 to categorical_crossentropy should make it work without the transpose operations. In practice, however, TensorFlow raises a TypeError from its dispatch mechanism saying that the axis parameter is NOT supported, thus requiring the workaround of manually transposing the labels and logits.
Note that, alternatively, tf.nn.softmax_cross_entropy_with_logits can be used with axis=0, without the need for transposing.
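For illustration, a minimal sketch of that alternative (the (6, 2) class-first shapes are from this thread; the tensor values are hypothetical):

import tensorflow as tf

# Hypothetical (classes, examples) = (6, 2) tensors, class dimension first.
labels = tf.one_hot([0, 3], depth=6, axis=0)   # shape (6, 2)
logits = tf.random.normal((6, 2))

# axis=0 tells the op that the class dimension is the first one,
# so no transpose is needed.
per_example = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits, axis=0)
total_loss = tf.reduce_sum(per_example)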
You need to transpose the logits and labels inputs before using them in categorical_crossentropy; you should use the tf.transpose method for that. Another issue is that you are not using the correct order of the arguments (labels must come first, then logits).
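In other words, a corrected version of that call would look something like this (a sketch of the fix just described, not necessarily the official solution):

total_loss = tf.reduce_sum(
    tf.keras.losses.categorical_crossentropy(
        y_true=tf.transpose(labels),   # labels first (y_true)
        y_pred=tf.transpose(logits),   # logits second (y_pred)
        from_logits=True))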
Loss per example: [0.25361034 0.5566767 ]
Total loss: 0.810287
Average loss: 0.4051435
Test 1: tf.Tensor(0.4051435, shape=(), dtype=float32)
AssertionError                            Traceback (most recent call last)
in
     25         print("\033[92mAll test passed")
     26
---> 27 compute_total_loss_test(compute_total_loss, new_y_train)

in compute_total_loss_test(target, Y)
     13     print("Test 1: ", result)
     14     assert(type(result) == EagerTensor), "Use the TensorFlow API"
---> 15     assert (np.abs(result - (0.50722074 + 1.1133534) / 2.0) < 1e-7), "Test 1 does not match. Did you get the reduce sum of your loss functions?"
     16
     17     ### Test 2

AssertionError: Test 1 does not match. Did you get the reduce sum of your loss functions?
Here is a list of common mistakes many learners make in this exercise.
Actually, your total loss value looks correct, but you then computed the average across the samples. Here's a thread which explains why that is not what is intended here.
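To make the difference concrete, here is a small sketch using the per-example losses printed above:

import tensorflow as tf

# Per-example losses from the printout above.
per_example = tf.constant([0.25361034, 0.5566767])

total = tf.reduce_sum(per_example)    # 0.810287  -- what the test expects
mean = tf.reduce_mean(per_example)    # 0.4051435 -- the averaged value shown as "Test 1"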
I transposed both labels and logits and it worked, but I don't understand why. The documentation on categorical_crossentropy doesn't discuss the shape of the inputs.