Inside the function, it only computes up to Z3, not A3. In forward propagation in previous courses, we have always computed A3, so the function returned the activation, not the linear output.
Z3 = tf.matmul(W3, A2) + b3
Is this because of how TensorFlow handles forward and backward propagation?
Hello @Martinmin,
Where is that line of code from? Which assignment in which week?
Raymond
C3_W3, Tensorflow Introduction.
{code removed by mentor}
Hello @Martinmin,
Thanks for sharing the source, but I have to remove the code because we can’t share your work here. Thanks again for sharing the source.
This assignment is a multi-class classification problem. One way to handle it is to use softmax as the activation in the output layer, which outputs probabilities, and then use a loss function that accepts those probabilities. This is the approach we learnt in the lectures.
Now, there is another way, which is to skip the softmax and use no activation at all. Then, instead of a loss function that accepts probabilities, we use a loss function that accepts “logits”: the outputs before the softmax transformation are the logits. In other words, we have this relationship: logits → softmax → probabilities. You will need to implement such a loss function in exercise 6, so I have to leave this part for you to figure out.
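The logits → softmax → probabilities relationship can be sketched in plain Python. This is only an illustration of the math, not the assignment's TensorFlow code; the function names and the example values are made up here.

```python
import math

def softmax(logits):
    """Turn a list of raw scores (logits) into probabilities that sum to 1."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_probs(probs, true_index):
    """Loss when softmax has already been applied to the output layer."""
    return -math.log(probs[true_index])

def cross_entropy_from_logits(logits, true_index):
    """Same loss computed directly from logits using log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[true_index]

logits = [2.0, 1.0, 0.1]   # hypothetical linear outputs (like Z3) for one example
probs = softmax(logits)    # what a softmax output layer (like A3) would produce

loss_a = cross_entropy_from_probs(probs, 0)
loss_b = cross_entropy_from_logits(logits, 0)
print(probs, loss_a, loss_b)  # the two losses agree up to rounding
```

For ordinary values the two routes give the same loss. With extreme logits such as [1000.0, 0.0, 0.0] and true class 1, the probability of the true class rounds to 0 and the log fails, while the logits route still returns a finite loss, which is one common argument for working with logits.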
I cannot find the DLS lecture that talks about logits vs. probabilities right now, but I have bookmarked this link, which also covers it. Please watch it if you want some explanation from Andrew on these two choices and why we prefer logits.
Cheers,
Raymond
Thanks for the explanation, @rmwkwok.
So tf.keras.losses.categorical_crossentropy() is basically such a loss function: it takes logits as input rather than the activation output, and that’s why we do it this way in the TensorFlow assignment?
Hello @Martinmin,
99% correct, except that there is a from_logits argument in categorical_crossentropy to let you make a choice. Please check out the explanation of it in the doc.
Raymond
Yes, it accommodates both possibilities: logits, or activations (the default).
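To make the two options concrete, here is a minimal sketch of the from_logits argument on a single made-up example. The tensor values are assumptions for illustration only, and the inputs use the (batch, classes) layout rather than the assignment's own variables.

```python
import tensorflow as tf

# One-hot label and made-up linear outputs, both shaped (batch, classes).
y_true = tf.constant([[1.0, 0.0, 0.0]])
logits = tf.constant([[2.0, 1.0, 0.1]])

# Option 1: pass the logits directly and tell the loss they are logits.
loss_from_logits = tf.keras.losses.categorical_crossentropy(
    y_true, logits, from_logits=True)

# Option 2: apply softmax first, then rely on the default from_logits=False.
probs = tf.nn.softmax(logits, axis=-1)
loss_from_probs = tf.keras.losses.categorical_crossentropy(y_true, probs)

# Both are per-example losses of shape (batch,) and should match closely.
print(loss_from_logits.numpy(), loss_from_probs.numpy())
```

Passing logits while leaving from_logits at its default would treat the raw scores as if they were probabilities, so the argument has to match what the last layer actually outputs.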