In the exercise it is mentioned that "It's important to note that the `y_pred` and `y_true` inputs of `tf.keras.losses.categorical_crossentropy` are expected to be of shape (number of examples, num_classes)."
Do I understand it correctly that I have to reshape the logits (being the `y_pred`), i.e. the output of forward propagation (the output of the last LINEAR unit), of shape (6, num_examples), as well as the labels, into matrices of shape (num_examples, 6) inside `tf.keras.losses.categorical_crossentropy()`?
If I pass the labels (`y_true`) and logits (`y_pred`) as they are, I get the following error (see the attached screenshot). Note that I did apply `tf.reduce_sum` to the cost function.
Yes, if you study how they defined the forward propagation logic, you can see that the input data is arranged the way that Prof Ng has used up to this point: the first dimension is "features" and the second dimension is "samples". That means by the output layer, we get "classes" as the first dimension and "samples" as the second. So you need to fix that, but using "reshape" is not the way to do that. You should use "transpose". Those are two different things. You can end up with the correct shape using "reshape", but the contents will not be correct. The definition of transpose is to flip the matrix about the main diagonal, which is what you need in this case.
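To make the difference concrete, here is a tiny sketch with made-up numbers (not the exercise data):

```python
import tensorflow as tf

# A (classes, samples) matrix: 2 classes, 3 samples.
logits = tf.constant([[1., 2., 3.],
                      [4., 5., 6.]])

# transpose flips about the main diagonal, so each row of the
# result is one sample's logits:
print(tf.transpose(logits))
# [[1. 4.]
#  [2. 5.]
#  [3. 6.]]

# reshape just re-reads the same values in row-major order into the
# new shape, scrambling which logit belongs to which sample:
print(tf.reshape(logits, (3, 2)))
# [[1. 2.]
#  [3. 4.]
#  [5. 6.]]
```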
Thanks, @paulinpaloalto! I am still getting a wrong answer (see the screenshot below). I did tf.math.reduce_sum( … ). I left the default axis=None. I can’t figure out where the bug is!?
Did you do the transpose on both labels and logits? Are you sure you did it using `tf.transpose` instead of `tf.reshape`? Did you include `from_logits=True` to take account of the fact that the output of the final layer is linear activation?
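For reference, a minimal sketch of how those pieces fit together; `logits` and `labels` here stand in for your forward-prop output and the one-hot labels, both of shape (6, num_examples):

```python
import tensorflow as tf

def compute_total_cost(logits, labels):
    """Sketch only: logits and labels arrive as (num_classes, num_examples)."""
    # Transpose so each row is one example, as the loss function expects.
    cost = tf.reduce_sum(
        tf.keras.losses.categorical_crossentropy(
            tf.transpose(labels),   # y_true
            tf.transpose(logits),   # y_pred
            from_logits=True))      # softmax is applied internally
    return cost
```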
Yes, I did `tf.linalg.matrix_transpose()` on both. I had missed the `from_logits=True`. It works now. Thanks a lot for your help!
I too am facing difficulty in passing this.
I used `tf.nn.softmax` on the logits → transpose → crossentropy with `from_logits=True`.
I get the following output:
tf.Tensor(10.425234, shape=(), dtype=float32)
I'm not sure where I am going wrong.
The point of `from_logits=True` is that it tells the cost function to do the softmax internally. So you've done softmax twice, which is why it doesn't work. Actually it's even a little worse than that: if you did the softmax before the transpose, then the softmax was also computed on the wrong axis. It matters, right?
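A quick toy check (assumed numbers) shows both effects:

```python
import tensorflow as tf

z = tf.constant([[1., 2., 3.],
                 [4., 5., 6.]])   # (classes, samples) layout

# Default softmax normalizes along the last axis, i.e. across
# samples here -- the wrong axis for this layout:
print(tf.reduce_sum(tf.nn.softmax(z), axis=-1))         # rows sum to 1
print(tf.reduce_sum(tf.nn.softmax(z, axis=0), axis=0))  # columns sum to 1

# Applying softmax twice also changes the values:
p = tf.nn.softmax(z, axis=0)
print(p)                         # a valid distribution per column
print(tf.nn.softmax(p, axis=0))  # not equal to p
```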
I tried without softmax as well.
The problem was that I was passing the predictions and labels in reverse order. Interchanging them made it work.
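For anyone else who hits this: `y_true` (the labels) is the first positional argument, so the call should look something like this (toy values, just to show the order):

```python
import tensorflow as tf

y_true = tf.constant([[0., 1.], [1., 0.]])       # one-hot labels, (m, classes)
y_pred = tf.constant([[0.2, 1.5], [2.0, -1.0]])  # logits, (m, classes)

# y_true (labels) comes first, y_pred (logits) second:
loss = tf.keras.losses.categorical_crossentropy(y_true, y_pred,
                                                from_logits=True)
print(loss)  # per-example losses, shape (m,)
```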
Thanks for helping me out.