Not able to understand why I am getting this error

[code removed]

Error I am getting

tf.Tensor(0.028503546, shape=(), dtype=float32)

AssertionError Traceback (most recent call last)
17 print("\033[92mAll test passed")
---> 19 compute_cost_test(compute_cost, new_y_train )

in compute_cost_test(target, Y)
13 print(result)
14 assert(type(result) == EagerTensor), "Use the TensorFlow API"
---> 15 assert (np.abs(result - (0.25361037 + 0.5566767) / 2.0) < 1e-7), "Test does not match. Did you get the mean of your cost functions?"
17 print("\033[92mAll test passed")

AssertionError: Test does not match. Did you get the mean of your cost functions?

The logits and labels parameters in compute_cost have shape (num classes, num examples).

y_pred and y_true for tf.keras.losses.categorical_crossentropy should have shape (num examples, num classes), so both tensors need to be transposed before they are passed in.
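To make the shape difference concrete, here is a minimal pure-Python sketch. It does not use TensorFlow, and the data values and the helper names `transpose` and `softmax_cross_entropy` are made up for illustration. It starts from data laid out as (num classes, num examples), transposes it so that each row is one example, and then averages the per-example losses.

```python
import math

# Hypothetical data laid out as (num_classes, num_examples) = (3, 2):
# each COLUMN is one example.
logits = [[2.0, 0.5],
          [1.0, 2.5],
          [0.1, 0.3]]
labels = [[1.0, 0.0],
          [0.0, 1.0],
          [0.0, 0.0]]

def transpose(matrix):
    """Swap rows and columns: shape (a, b) becomes (b, a)."""
    return [list(row) for row in zip(*matrix)]

def softmax_cross_entropy(label_row, logit_row):
    """Cross-entropy of ONE example, computed directly from raw logits."""
    m = max(logit_row)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logit_row))
    return -sum(y * (z - log_sum_exp) for y, z in zip(label_row, logit_row))

# After transposing, each ROW is one example: shape (num_examples, num_classes).
logits_t = transpose(logits)
labels_t = transpose(labels)

per_example = [softmax_cross_entropy(y, z) for y, z in zip(labels_t, logits_t)]
mean_cost = sum(per_example) / len(per_example)

print("shape after transpose:", (len(logits_t), len(logits_t[0])))
print("per-example losses:", per_example)
print("mean cost:", mean_cost)
```

Without the transpose, the loop would pair up class rows instead of examples, and the result would be two-example "distributions" that mean nothing.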

Hope this helps.


I did transpose logits and labels, but I still got an error.

Here is my code:

[code removed by moderator]

Look at this link. Is there a parameter you should set since you are using logits?

Thanks! Finally, it's solved.
But I don't understand this sentence: "Whether y_pred is expected to be a logits tensor. By default, we assume that y_pred encodes a probability distribution."
What is a "logits tensor", and what does "probability distribution" mean here?

Here’s logit

I have the same question, and @balaji.ambresh’s link doesn’t really make it any clearer for me

A Dense layer has a linear activation when no activation is specified, i.e. its output is wx+b. That raw output is the logit. To get the output of a dense unit as a probability, the activation should be sigmoid.

Here’s the derivation:

Start with the logit definition.

Let $L = \text{logit}(p)$ be the raw linear output of the unit, and let $p$ be the probability that the output is 1. By definition,

$L = \ln \frac{p}{1-p}$

$\implies e^L = \frac{p}{1-p}$, after exponentiating both sides
$\implies (1 - p) \, e^L = p$
$\implies e^L - p \, e^L = p$
$\implies e^L = (e^L + 1) \, p$
$\implies p = \frac{e^L}{e^L + 1}$
$\implies p = \frac{1}{1+\frac{1}{e^L}} = \frac{1}{1 + e^{-L}}$, after dividing both numerator and denominator by $e^L$

This is exactly the sigmoid function applied to L.
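If it helps, the derivation can be checked numerically. This is a small sketch in plain Python; the function names `logit` and `sigmoid` are my own and not taken from any library. It converts a few probabilities to logits and back.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def sigmoid(L):
    """Inverse of logit: maps any real number back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-L))

for p in (0.1, 0.5, 0.8):
    L = logit(p)
    recovered = sigmoid(L)
    # The round trip should return the original probability.
    assert abs(recovered - p) < 1e-12
    print(f"p={p}  logit={L:.4f}  sigmoid(logit)={recovered:.4f}")
```

So a logit is just a probability expressed on the whole real line, and sigmoid maps it back.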

See also the documentation for tf.keras.losses.BinaryCrossentropy (TensorFlow Core v2.8.0), which contains this key passage:

  • y_pred (predicted value): This is the model’s prediction, i.e, a single floating-point value which either represents a logit, (i.e, value in [-inf, inf] when from_logits=True ) or a probability (i.e, value in [0., 1.] when from_logits=False ).

The network can output either a value in the range [-inf, inf] or a value in [0., 1.], depending on the activation used in the output layer, as @balaji.ambresh correctly shows above. The loss function needs to know which one it is receiving in order to interpret the forward prop outputs properly. from_logits is what keeps the network and the loss function in sync.
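Here is a rough sketch of why the flag matters, in plain Python rather than TensorFlow. The helper names `bce_from_probability` and `bce_from_logit` are hypothetical, and this is not how Keras implements the loss internally (Keras also clips probabilities, for example). It only shows that the two interpretations agree when used consistently and disagree when mixed up.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_probability(y, p):
    """Binary cross-entropy when the prediction p is already a probability."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def bce_from_logit(y, z):
    """Binary cross-entropy when the prediction z is a raw logit.

    Numerically stable form of bce_from_probability(y, sigmoid(z)).
    """
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

y = 1.0  # true label
z = 0.3  # raw network output (a logit) that happens to lie in [0, 1]

# Consistent: logit fed to the logit version, or sigmoid(logit) fed to the
# probability version. Both give the same loss.
loss_logit = bce_from_logit(y, z)
loss_prob = bce_from_probability(y, sigmoid(z))

# Inconsistent: the raw logit is wrongly treated as if it were a probability.
loss_mismatch = bce_from_probability(y, z)

print("logit path:      ", loss_logit)
print("probability path:", loss_prob)
print("mismatched path: ", loss_mismatch)
```

The first two numbers match and the third one does not. That is the kind of silent error you get when the output layer has no activation but the loss is left at its default of from_logits=False.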