DLS 2 week 3 exercise 6 compute_cost

This third week’s notebook is a real mess :triumph:

3 Likes

There is no way to get compute_cost to equal the given answer.

All dimensions are correct.

With from_logits = False: cost = 0.88275003
With from_logits = True: cost = 0.17102128

The given answer is 0.810287 (how?).

Can anyone please suggest how to obtain the correct result for this test?
Otherwise, I'm starting to doubt whether the given answer is correct.
Thanks :slight_smile:

1 Like

Hi,
Instead of passing logits and labels to categorical_crossentropy directly, pass the values returned by tf.transpose(logits) and tf.transpose(labels) respectively. That worked for me, and it was mentioned by @muhammadahmad above.

6 Likes

Apart from @Ethan1312's suggestion, I also had to pass the argument from_logits = True.
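
Putting the two suggestions together, a minimal sketch of the fixed call (assuming logits and labels are the (classes, batch) tensors the notebook passes in):

import tensorflow as tf

def compute_total_loss(logits, labels):
    # Both inputs arrive as (classes, batch); transpose to (batch, classes)
    # and tell Keras the predictions are raw logits, not probabilities.
    return tf.reduce_sum(
        tf.keras.losses.categorical_crossentropy(
            y_true=tf.transpose(labels),
            y_pred=tf.transpose(logits),
            from_logits=True))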

3 Likes

They really should mention that in the notebook; I spent almost an hour trying to figure out what was wrong with my code.

1 Like

I'm using this code:
total_loss = tf.reduce_sum(tf.keras.losses.categorical_crossentropy(y_true=tf.transpose(labels), y_pred=tf.transpose(logits), from_logits=True))

but it shows me this error:
ValueError: Shapes (2, 4) and (2, 6) are incompatible

and it makes sense, since the new_y_train we calculated is (4, 2) and the pred matrix is (6, 2).
Is something up with the test case? I'm not sure.

1 Like

Hello @Anubhav_Anand2,

Are you talking about the compute_total_loss_test?


new_y_train is a Dataset whose elements have a shape of (6,). Note that new_y_train is NOT a tensor, and it does not carry a shape of (4, 2).

Since we apply .batch(2) to it, each minibatch will be a tensor of shape (2, 6), and after transposing, it matches the shape of pred, which is (6, 2).
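
A quick sketch of those shapes (the label values 1 and 4 here are hypothetical, just to show the mechanics):

import tensorflow as tf

# A toy stand-in for new_y_train: a Dataset of one-hot labels, each of shape (6,).
labels_ds = tf.data.Dataset.from_tensor_slices(tf.one_hot([1, 4], depth=6))

for minibatch in labels_ds.batch(2):
    print(minibatch.shape)                 # (2, 6) after batching
    print(tf.transpose(minibatch).shape)   # (6, 2), matching pred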

You might want to check what went wrong in your code.

Cheers,
Raymond

1 Like

In my assignment, the size of new_y_train is (4,2).

So here we are calculating logits of shape (6,2) and we have labels of (4,2).

I have also tried appending ([0, 0], [0, 0]) to the labels to make them a (6, 2) matrix, which makes the code run, but the calculated loss does not match the test case.

1 Like

Hello @Anubhav_Anand2,

Since we have different values for new_y_train, I suggest you check where you have assigned anything to new_y_train.

  1. On the notebook, press “Ctrl + F”
  2. Type new_y_train in the search box.
  3. Go through all the search results.

This is the only assignment of new_y_train in my notebook:
new_y_train = y_train.map(one_hot_matrix)

y_train is a Dataset object, and y_train.map(...) will also return a Dataset object. If you print new_y_train, this is what it is going to say:
[printout of new_y_train, showing a Dataset with element shape (6,)]

Now, how many assignments of new_y_train are there in your notebook? Only one, just like mine? If there is just one, is it the same assignment as mine? If you have more than one assignment, then you have changed new_y_train in a way that is not intended; can you revert those changes?

Cheers,
Raymond

1 Like


Yes, in my notebook I assign it only once… I think the problem is with the one_hot_matrix method, as I was asked to make a method that carries out one-hot encoding of the classes only.


As you can see, it passes all the test cases for shape (4,)…

but later on, the compute_cost method asks for shape (6, 2).
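
For reference, a minimal sketch of what one_hot_matrix should do in this assignment, assuming the core call is tf.one_hot with depth=6 (each scalar label becomes a (6,) vector, so new_y_train ends up with elements of shape (6,), not (4,)):

import tensorflow as tf

def one_hot_matrix(label, depth=6):
    # Map a single class label to a one-hot vector of length `depth`.
    return tf.reshape(tf.one_hot(label, depth, axis=0), shape=(depth,))

print(one_hot_matrix(tf.constant(1)))
# tf.Tensor([0. 1. 0. 0. 0. 0.], shape=(6,), dtype=float32)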

1 Like

Transpose logits and labels.

1 Like

Here is some more explanation.

Using tf.transpose() works in this case!

2 Likes

Hello!

I have a problem with the computation of the cost in the function # GRADED FUNCTION: compute_cost of the last assignment of Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization.

I was sure I had made no mistake, but the expected cost is 0.810287 and my function does not pass the tests. Can you please verify that there are no mistakes in the verification part? Also, the explanation according to which we have to implement the function is confusing. Can I have a discussion with the instructor to clarify? Thank you!

ValueError: Shapes (6, 2) and (1, 6, 2) are incompatible

MY CODE IS:

# YOUR CODE STARTS HERE
total_loss = tf.reduce_sum(tf.keras.losses.categorical_crossentropy(logits, labels, from_logits=True))
return total_loss

1 Like

Thanks for opening the topic. I stumbled across the funny thing that the linked TensorFlow documentation for categorical_crossentropy claims there is an ‘axis’ parameter: so in theory, by providing axis=0 to categorical_crossentropy, it should work without the transpose operations. In practice, however, TensorFlow raises a TypeError from its dispatch mechanism saying the axis parameter is NOT supported, thus requiring the workaround of manually transposing the labels and logits :man_shrugging:t3:.

Note that alternatively, tf.nn.softmax_cross_entropy_with_logits can be used with axis=0 without the need for transposing.
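
A minimal sketch of that alternative (shapes as in the notebook's (6, 2) case; the tensor values here are hypothetical):

import tensorflow as tf

logits = tf.random.normal((6, 2))              # (classes, batch)
labels = tf.one_hot([1, 4], depth=6, axis=0)   # (6, 2): classes along axis 0

# axis=0 tells TF that the class dimension comes first, so no transpose is needed.
per_example = tf.nn.softmax_cross_entropy_with_logits(
    labels=labels, logits=logits, axis=0)      # shape (2,): one loss per example
total_loss = tf.reduce_sum(per_example)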

2 Likes

You need to transpose the logits and labels inputs before using them in categorical_crossentropy. You should use the tf.transpose method for that. Another thing is that you are not using the correct order of the arguments (labels comes first, then logits).

2 Likes

Loss per example: [0.25361034 0.5566767 ]
Total loss: 0.810287
Average loss: 0.4051435
Test 1: tf.Tensor(0.4051435, shape=(), dtype=float32)

AssertionError                            Traceback (most recent call last)
in
     25         print("\033[92mAll test passed")
     26
---> 27 compute_total_loss_test(compute_total_loss, new_y_train)

in compute_total_loss_test(target, Y)
     13     print("Test 1: ", result)
     14     assert(type(result) == EagerTensor), "Use the TensorFlow API"
---> 15     assert (np.abs(result - (0.50722074 + 1.1133534) / 2.0) < 1e-7), "Test 1 does not match. Did you get the reduce sum of your loss functions?"
     16
     17     ### Test 2

AssertionError: Test 1 does not match. Did you get the reduce sum of your loss functions?

Here is a list of common mistakes many learners make in this exercise.

Actually, your total loss value looks correct, but you then computed the average across the samples. Here's a thread which explains why that is not what is intended here.
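
To see the difference concretely, using the per-example losses from the printout above:

import tensorflow as tf

per_example = tf.constant([0.25361034, 0.5566767])   # "Loss per example" above

print(tf.reduce_sum(per_example).numpy())    # 0.810287  -- the expected total loss
print(tf.reduce_mean(per_example).numpy())   # 0.4051435 -- averaging, which fails Test 1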

I transposed both labels and logits and it worked, but I don't understand why. The documentation for categorical_crossentropy doesn't discuss the shape of the inputs.
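
For anyone else wondering: tf.keras.losses.categorical_crossentropy reduces over the last axis by default, i.e. it treats the last axis as the class axis. A small sketch of what goes wrong with (classes, batch) inputs (shapes as in the notebook, values hypothetical):

import tensorflow as tf

labels = tf.one_hot([1, 4], depth=6, axis=0)   # (6, 2): classes along axis 0
logits = tf.random.normal((6, 2))

# Without transposing, the loss is reduced over the wrong axis:
wrong = tf.keras.losses.categorical_crossentropy(labels, logits, from_logits=True)
print(wrong.shape)   # (6,) -- one value per class row, not per example

# After transposing to (batch, classes), we get one loss per example:
right = tf.keras.losses.categorical_crossentropy(
    tf.transpose(labels), tf.transpose(logits), from_logits=True)
print(right.shape)   # (2,)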