@paulinpaloalto Can you please help me here
There is at least one problem with your code, but I don't see how it could cause that error. Remember that the forward propagation routine specifically does not include the activation function on the output layer. That means you need to add the argument from_logits = True to the loss function to tell it to do the softmax calculation internally.
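Here is a minimal sketch (my own illustration, not the graded solution) of what from_logits=True does: the loss applies softmax to the raw logits internally, so you get the same answer as if you had applied softmax yourself first.

```python
import tensorflow as tf

# Illustration only: raw logits go in, and from_logits=True makes the loss
# apply softmax internally before computing the cross-entropy.
logits = tf.constant([[2.0, 1.0, 0.1]])   # output of the linear layer, no softmax
labels = tf.constant([[1.0, 0.0, 0.0]])   # one-hot ground truth
loss = tf.keras.losses.categorical_crossentropy(
    y_true=labels, y_pred=logits, from_logits=True)
print(loss)  # ~0.417, identical to applying softmax and using from_logits=False
```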
The mismatching shapes must mean that either the code you are showing is not what you actually ran (e.g. you had not done "Shift-Enter" to run the cell since the last time you changed the code) or you have hacked on the test cell to change one of the parameters so that they don't match.
I don't think I have changed any of the parameters, and I even implemented it using from_logits = True, but that doesn't work out for me.
{moderator edit - solution code removed}
I am still seeing the same error.
I am unable to proceed further because of it.
@paulinpaloalto All the above test cases are passed successfully.
So you must have modified the inputs to the test case, right?
No, I haven't done that; I am not even able to edit the cell containing the definition of compute_cost_test.
The only change I have made is writing the three lines of code.
Ah, that's a good point. Well, notice that the "labels" input is a pre-defined variable new_y_train that was created by a much earlier cell in the notebook. Maybe there is something wrong with the logic in your notebook that creates new_y_train.
I added a cell right before the compute_cost test cell:
print(new_y_train)
mbs = new_y_train.batch(2)
for mb in mbs:
    print(mb)
    break
When I run that, here's what I see:
<MapDataset shapes: (6,), types: tf.float32>
tf.Tensor(
[[0. 0. 0. 0. 0. 1.]
[1. 0. 0. 0. 0. 0.]], shape=(2, 6), dtype=float32)
Please try that and see if you end up with an output that is 2 x 4 instead.
Ok, that's wrong. So why is it wrong? You have to examine the earlier logic that creates new_y_train. This is called "debugging", right? You reason from the evidence you see and work backwards. Things don't just happen for mysterious reasons: your job is to follow the evidence to understand why this happened.
Look at the logic of your "one hot" routine, since new_y_train is the output of that function. You probably hard-coded the number of classes, instead of using the depth parameter.
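To make the point concrete, here is a hedged sketch (the name one_hot_matrix and its signature are my assumption about the assignment's routine): the class count must flow from the depth parameter, never from a hard-coded literal like 4.

```python
import tensorflow as tf

# Sketch only: depth comes from the parameter, not a hard-coded literal.
def one_hot_matrix(label, depth=6):
    # tf.one_hot builds the one-hot vector; reshape flattens it to shape (depth,)
    return tf.reshape(tf.one_hot(label, depth, axis=0), shape=[depth])

print(one_hot_matrix(tf.constant(1), depth=6))
# tf.Tensor([0. 1. 0. 0. 0. 0.], shape=(6,), dtype=float32)
```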
Yes, I got it now. Thanks for all your support @paulinpaloalto
I had hard-coded the value as 4; I have changed it to 6.
I still have errors here after I read through and tried to debug my code.
I transposed the two inputs and added from_logits=True as follows:
cost = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(y_true=tf.transpose(labels), y_pred=tf.transpose(logits)，from_logits=True))
but the result I got is:
File "", line 20
cost = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(y_true=labels, y_pred=logits，from_logits=True))
^
SyntaxError: invalid character in identifier
My guess is that there is an unprintable character somewhere adjacent to one of the variable names, either on that line or perhaps on the line before. Try backspacing over all the names and retyping them and see if that helps. If that doesn't work, you can also try getting a clean copy of the notebook and then carefully "copy/pasting" over your completed code.
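One way (my own suggestion, not from the course) to hunt down such a character is to paste the suspect line as a string in a scratch cell and print any non-ASCII characters it contains:

```python
# Paste the suspect line between the quotes and run this cell.
line = "y_pred=logits，from_logits=True"  # example containing a fullwidth comma
for ch in line:
    if ord(ch) > 127:
        print(repr(ch), hex(ord(ch)))  # prints '，' 0xff0c for this example
```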
Hi @paulinpaloalto and @joduss
Would you please explain why we need to transpose BOTH labels and logits when using tf.keras.losses.categorical_crossentropy? What is the logic (math) behind this?
Thanks in advance!
It's not a matter of logic or math: it is the definition of the API. I also don't understand why you emphasize "both": why wouldn't you want the two tensors to be oriented the same way? Either transposed or not …
If I may continue @paulinpaloalto's reasoning:
I have seen similar posts like these, and I think the confusion arises because Prof Andrew Ng uses [features, batch_size], whereas TensorFlow prefers [batch_size, features]. However, the order is just a matter of taste, similar to whether num_channels should be the first or the last dimension for convolutions.
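A small sketch of that layout difference (the shapes are illustrative, assuming 6 classes and a batch of 2):

```python
import tensorflow as tf

# Course convention: (classes, batch); TF losses expect (batch, classes),
# so both tensors are transposed the same way before the call.
logits = tf.random.normal((6, 2))              # (classes, batch)
labels = tf.one_hot([5, 0], depth=6, axis=0)   # also (classes, batch)

loss = tf.keras.losses.categorical_crossentropy(
    y_true=tf.transpose(labels),               # now (batch, classes)
    y_pred=tf.transpose(logits),
    from_logits=True)
print(loss.shape)  # (2,): one loss per example, averaged later by reduce_mean
```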
Thank you so much @paulinpaloalto.
The reason I emphasize both is that I was thinking about the vectorized implementation logic, in which the shapes of the two matrices have to be aligned to get a proper result. If this is the case, I was wondering why we didn't just transpose one tensor (instead of both) to satisfy the logic?
In the API documentation link, categorical_crossentropy, it doesn't mention that we need to transpose y_true and y_pred. What am I missing?
@amin.pahlavani: What you're missing is the point that @jonaslalin made in his earlier reply: the TF APIs are defined to expect a particular ordering of the dimensions of the tensors and Prof Ng has chosen to use a different ordering, so we frequently need to adjust things.
Hi Paul,
I am currently having trouble with the compute cost function.
This is what I currently have. I believe it is consistent with what was said above (transpose both inputs and include from_logits=True), but I am still getting an error saying that it does not match the test case.
[removed code]
Oh hey sorry, just realized that I had binary_crossentropy in place instead of categorical_crossentropy. I can get rid of the code I put above if you'd like.
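For anyone else who hits this: both losses run without error on one-hot labels, but they compute different things, which is why the test value differs. A quick illustration (my own numbers, not the assignment's test):

```python
import tensorflow as tf

# binary_crossentropy scores each of the 6 outputs as an independent yes/no
# decision, while categorical_crossentropy scores one softmax over all 6,
# so the two return different values for the same inputs.
labels = tf.constant([[0., 0., 0., 0., 0., 1.]])
logits = tf.constant([[0.1, 0.2, 0.3, 0.4, 0.5, 3.0]])

print(tf.keras.losses.categorical_crossentropy(labels, logits, from_logits=True))
print(tf.keras.losses.binary_crossentropy(labels, logits, from_logits=True))
```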
edit: removed
Hi, Kenny.
Itâs great that you figured out the solution under your own power! Thanks for confirming and for removing the source code.
Cheers!
Paul