Jiajun
June 12, 2021, 2:45am
1
I tried to compute the cost; here is the code I wrote:
cost = tf.reduce_mean(tf.math.maximum(logits, 0) - logits * labels + tf.math.log(1 + tf.math.exp(-abs(logits))), axis=-1)
However, I got this error:
The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I'm getting the same error. Did you find a solution? If possible, could you also help me with the initialize-parameters code?
I think you should use the binary_crossentropy function as stated in section 3.2.
Please check if that helps.
FatHaz
June 14, 2021, 3:38pm
4
Hi kampamocha,
I’m struggling with Exercise 6 - compute_cost. Can you give me any guidance in light of the following?
I can see that the labels are 1 or 0, and that the logits seem to be probabilities (hence from_logits=False), but something seems to be amiss.
Regards, Harry
Hi @FatHaz ,
You need to compute the mean over the binary_crossentropy output; please check the example shown in the initial paragraph of section 3.2. Also note the value of the from_logits parameter, since it refers to y_pred rather than to the labels.
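A minimal sketch of what that looks like (names and shapes here are illustrative, not the graded code; assuming logits and labels are tensors of raw scores and 0/1 values):

import tensorflow as tf

def compute_cost(logits, labels):
    # from_logits=True tells Keras that logits are raw scores, so the
    # sigmoid is applied inside the loss for numerical stability.
    per_example_loss = tf.keras.losses.binary_crossentropy(
        y_true=labels, y_pred=logits, from_logits=True)
    # Average over the mini-batch to get a scalar cost.
    return tf.reduce_mean(per_example_loss)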
FatHaz
June 14, 2021, 7:11pm
6
Hi kampamocha,
Most obliged. I’d added the mean before I read your email, but the key was changing from False to True!
Thank you, Harry
Fredy
July 4, 2021, 5:40pm
7
You were missing the full call to the TF function, tf.keras.metrics.binary_crossentropy. By the way, here is another way I found that works:
cost = tf.reduce_mean(tf.reduce_mean(tf.maximum(logits, 0) - logits * labels + tf.math.log(1 + tf.math.exp(-abs(logits))), axis=-1))
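For what it's worth, that expression is the numerically stable form of sigmoid cross-entropy, max(x, 0) - x*z + log(1 + exp(-|x|)). As a quick sketch (the values are made up), you can check it against TensorFlow's built-in:

import tensorflow as tf

logits = tf.constant([[2.0, -1.0], [0.5, 3.0]])
labels = tf.constant([[1.0, 0.0], [0.0, 1.0]])

manual = tf.maximum(logits, 0) - logits * labels + tf.math.log(1 + tf.math.exp(-tf.abs(logits)))
builtin = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)

# The two should agree element-wise, up to floating-point error.
print(tf.reduce_max(tf.abs(manual - builtin)).numpy())  # ~0.0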
Can someone please explain why the code in the paragraph above the cost exercise, that is:
cost = tf.reduce_mean(tf.keras.losses.binary_crossentropy(y_true = logits, y_pred = labels, from_logits=True))
is not working.
Whereas this:
cost = tf.reduce_mean(tf.reduce_mean(tf.maximum(logits, 0) - logits * labels + tf.math.log(1 + tf.math.exp(-abs(logits))), axis=-1))
is working properly?
I'd appreciate it if someone could explain.
Thank you,
Saransh
Hi @Saransh_Jhunjhunwala,
Make sure you understand the difference between y_true and y_pred and let me know if you need any hints.
Check this post if you’re not sure where the second formula comes from.
Good luck with the assignment!
Understood my mistake.
Thank you!
cost = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(tf.transpose(labels), tf.transpose(logits), from_logits=True))
It works.
And regarding your question, notice that y_true = labels.
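Putting the pieces together, a sketch of that solution as a function (assuming labels and logits arrive with shape (n_classes, m), so both are transposed to (m, n_classes) before the loss):

import tensorflow as tf

def compute_cost(logits, labels):
    # Keras losses expect one example per row, so transpose from
    # (n_classes, m) to (m, n_classes) before computing the loss.
    cost = tf.reduce_mean(
        tf.keras.losses.categorical_crossentropy(
            y_true=tf.transpose(labels),
            y_pred=tf.transpose(logits),
            from_logits=True))
    return cost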
Hussam
August 17, 2021, 12:14pm
12
I think you should use tf.keras.metrics.categorical_crossentropy instead of the binary one.
Hussam
August 17, 2021, 12:15pm
13
I think tf.keras.metrics.categorical_crossentropy is the correct implementation.
Hi @Hussam,
You are right, we should use categorical_crossentropy. At the time of my previous comment the notebook specified that you should use binary_crossentropy; later that was revised and changed to what it is now, which makes more sense.
If you are interested, you can check some discussions about it:
That’s because they want you to use the Binary_crossentropy loss function with the from_logits = True argument, which causes the sigmoid calculation to be incorporated into the loss computation. That is preferred because numerical stability can be managed better when the two are done together, e.g. when dealing with problems with “saturated” sigmoid output values. Here’s the doc page for Binary_crossentropy.
They don’t really explain what the from_logits = True argument does in the assignment, but the…
Update: Please note that after this thread was created, the Course Staff made some significant updates to this assignment, which include switching to using CategoricalCrossEntropy for the loss function in this section.
Because we specify the from_logits = True argument, that means that the loss logic will apply either sigmoid or softmax to the logits to compute the actual \hat{y} values and then will compute the cross entropy loss between the predictions and the labels:
-y_true * log(sigmoid(y…
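To make the stability point concrete, here is a small illustration (the -1000.0 logit is an arbitrary extreme value chosen to force saturation):

import tensorflow as tf

logit = tf.constant([-1000.0])  # extreme negative score
label = tf.constant([1.0])      # true class is 1

# Two-step version: sigmoid(-1000) underflows to exactly 0.0 in float32,
# so -log(0) blows up to inf.
p = tf.sigmoid(logit)
naive = -label * tf.math.log(p)

# Fused version: computed directly from the logit, stays finite (~1000).
fused = tf.keras.losses.binary_crossentropy(label, logit, from_logits=True)

print(naive.numpy(), fused.numpy())  # inf vs. roughly 1000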
Why did we need to transpose the labels and logits?
Read this where Mentor Kin @Kic explains why we need to do that.
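In short, it's a shape question (a sketch with made-up dimensions: 6 classes, 2 examples, stored column-wise as in the assignment):

import tensorflow as tf

labels = tf.transpose(tf.one_hot([0, 3], depth=6))  # shape (6, 2): examples in columns
logits = tf.random.normal((6, 2))

# Keras losses treat axis 0 as the batch axis, so without the transpose
# each row (a class) would be scored as if it were an example.
loss = tf.keras.losses.categorical_crossentropy(
    tf.transpose(labels), tf.transpose(logits), from_logits=True)
print(loss.shape)  # (2,) -- one loss value per training example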
Best,
Saif.
Also note that this is an old thread and refers to the way this assignment used to work. The transpose and from_logits parts are still valid, but they changed it a while ago so that we sum the loss values instead of computing the mean. Here's a thread which explains why they made that change.
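So for the current notebook the sketch would look something like this (the name compute_total_loss and the shapes are my assumptions, not a guaranteed match for the graded code):

import tensorflow as tf

def compute_total_loss(logits, labels):
    # Sum, rather than average, the per-example losses; the division by
    # the number of examples happens later, in the training loop.
    return tf.reduce_sum(
        tf.keras.losses.categorical_crossentropy(
            tf.transpose(labels), tf.transpose(logits), from_logits=True))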