WeightedCategoryCrossEntropy in C3 W1

Hey folks! In the assignment we build the following model:

Serial[
  Embedding_9088_256
  Mean
  Dense_2
  LogSoftmax
]

However, when we train it we use WeightedCategoryCrossEntropy. If you look up its description of the expected model output, it says:

  • A batch of activation vectors. The components in a given vector should be pre-softmax activations (mappable to a probability distribution via softmax). For performance reasons, the softmax and cross-entropy computations are combined inside the layer.

So I believe we shouldn't be using LogSoftmax as the last layer, since the WeightedCategoryCrossEntropy loss already computes the softmax. Is that right?

However, I think WeightedCategoryAccuracy, which is used in EvalTask, does require a probability distribution, so I find the current implementation rather confusing. Please advise.

Hi @borowis

Good question :+1:
I believe you are correct, and the LogSoftmax might be a remnant of some course updates (from when CrossEntropyLoss was deprecated).
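
For what it's worth, the extra layer should be redundant rather than harmful: applying log-softmax to values that are already log-softmax outputs gives the same values back, so a loss that does its own softmax sees the same result either way. Here is a minimal sketch in plain Python (not Trax code). The helper names and the example logits are made up for illustration, and `cross_entropy_from_logits` only stands in for a loss that combines softmax and cross-entropy, as the description quoted above says the layer does.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum_exp for v in logits]

def cross_entropy_from_logits(logits, target):
    """Stand-in for a loss that applies (log-)softmax internally."""
    return -log_softmax(logits)[target]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical output of the Dense_2 layer for one example.
raw = [2.0, -1.0]
# What the model's final LogSoftmax layer would emit for that output.
normalized = log_softmax(raw)

# The loss is the same whether or not the model ends in LogSoftmax.
loss_raw = cross_entropy_from_logits(raw, target=0)
loss_normalized = cross_entropy_from_logits(normalized, target=0)
print(abs(loss_raw - loss_normalized) < 1e-9)

# Log-softmax only subtracts a constant, so the predicted class is unchanged.
print(argmax(raw) == argmax(normalized))
```

The second check matters for the accuracy question: if the metric picks the predicted class via argmax (which is my understanding of WeightedCategoryAccuracy, but please check the Trax docs), it would give the same answer for raw activations and for log-probabilities.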
