C2_W2_SoftMax lab

Hey everyone, I have a little doubt.
Q1: Is the selected line of the above notebook correct? Let me also attach ChatGPT's answer.

I have one more question:
Q2: If we don't use softmax in the last layer for better computation, then how does our model evaluate the output each epoch and train the weights?

Yes, it is correct. This is discussed in the lectures.

I recommend you not use a chatbot for programming advice.

As written, TensorFlow will automatically apply the softmax activation internally, as part of the loss calculation.
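For concreteness, here is a minimal sketch of that setup (not the exact notebook code; the layer sizes are illustrative): a linear output layer producing raw logits, with the softmax folded into the loss via from_logits=True.

```python
import tensorflow as tf

# Minimal sketch of the "preferred" setup: no softmax in the last layer.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation="relu"),
    tf.keras.layers.Dense(15, activation="relu"),
    tf.keras.layers.Dense(10, activation="linear"),  # outputs raw logits
])

# from_logits=True tells the loss to apply the softmax internally,
# which is more numerically stable than a separate softmax layer.
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
)
```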


Hello @tarunsaxena1000,

Tarun, refer to the link below to understand why from_logits=True is significant for the loss with respect to the softmax activation, when softmax is not used in the last dense layer of the model architecture.

Feel free to ask any more doubts.

Regards,
DP

Also, I agree with Tom on the use of chatbots; you could instead explore the TensorFlow documentation:

https://www.tensorflow.org/api_docs/python/tf/keras/losses/SparseCategoricalCrossentropy
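As a quick check of what the docs describe, the two paths below compute the same loss value: passing raw logits with from_logits=True, or applying tf.nn.softmax yourself and using the default from_logits=False (the labels and logits here are made up for illustration).

```python
import tensorflow as tf

y_true = tf.constant([1, 2])                 # integer class labels
logits = tf.constant([[2.0, 1.0, 0.1],
                      [0.5, 1.5, 2.5]])      # raw network outputs

# Loss applied directly to logits; the softmax happens inside the loss.
loss_from_logits = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
print(loss_from_logits(y_true, logits).numpy())

# Equivalent: apply the softmax yourself, then use from_logits=False (the default).
probs = tf.nn.softmax(logits)
loss_from_probs = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
print(loss_from_probs(y_true, probs).numpy())   # same value, up to float error
```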

So what you are saying is: if the final layer has a linear activation, and we set

from_logits = True

then the loss function will expect logits in the range (-infinity, +infinity) and will internally apply the softmax when calculating the loss against y_train, whose labels range over [0, N).
(Thanks for the article, Deepti.)
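The "better computation" part of Q2 shows up if you push the logits to extremes. This sketch (with made-up numbers) shows the fused log-softmax inside the loss staying accurate where the two-step softmax-then-loss path underflows:

```python
import tensorflow as tf

y_true = tf.constant([2])
extreme_logits = tf.constant([[1000.0, -1000.0, 0.0]])  # extreme but valid logits

# from_logits=True: log-softmax is computed in one fused, stable step,
# so the loss is accurate even for extreme logits (~1000 here).
stable = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
print(stable(y_true, extreme_logits).numpy())

# from_logits=False: the softmax underflows the true-class probability
# to 0.0 in float32 first, so the loss value is no longer accurate.
probs = tf.nn.softmax(extreme_logits)
print(probs.numpy())   # [[1., 0., 0.]] -- class 2's probability underflowed
unstable = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
print(unstable(y_true, probs).numpy())
```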

No, the understanding should be: because we are not using the softmax activation in the last dense layer, the loss must be given from_logits=True so that it receives the raw logits and applies the softmax internally. The loss chosen here, SparseCategoricalCrossentropy, is the loss that pairs with a softmax output; it is not the right choice for a linear or sigmoid activation in the last dense layer unless from_logits=True supplies the softmax inside the loss.
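One practical consequence worth adding: since the model now outputs raw logits, you apply tf.nn.softmax yourself whenever you want probabilities at prediction time. A small sketch with dummy shapes and data (not the notebook's model):

```python
import numpy as np
import tensorflow as tf

# Assumed model for illustration: linear (logit) output layer, as above.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(15, activation="relu"),
    tf.keras.layers.Dense(4, activation="linear"),
])
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer="adam",
)

X_sample = np.random.rand(3, 2).astype(np.float32)  # dummy inputs
logits = model(X_sample)               # raw logits, can be any real numbers
probs = tf.nn.softmax(logits)          # convert to probabilities for reporting
pred = tf.argmax(probs, axis=1)        # predicted class per example
print(probs.numpy().sum(axis=1))       # each row sums to 1
```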
