Hey folks! In the assignment we build the following model:
Serial[
Embedding_9088_256
Mean
Dense_2
LogSoftmax
]
However, when we train it we use WeightedCategoryCrossEntropy. If you look up its description, it says the input should be:
- A batch of activation vectors. The components in a given vector should be pre-softmax activations (mappable to a probability distribution via softmax). For performance reasons, the softmax and cross-entropy computations are combined inside the layer.
Therefore, I believe we shouldn’t be using LogSoftmax as the last layer, because softmax is already computed inside the WeightedCategoryCrossEntropy loss. Is that right?
However, I think the WeightedCategoryAccuracy used in EvalTask does require a probability distribution, so I find the current implementation rather confusing. Please advise.
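For anyone who wants to test the concern numerically, here is a minimal sketch in plain Python. It does not call Trax. It rests on two assumptions that are not confirmed here: that the loss applies a log-softmax to whatever it receives (as the quoted description suggests), and that the accuracy metric picks the class with the largest activation. The helper names and the example logits are made up for illustration.

```python
import math

def log_softmax(xs):
    """Numerically stable log-softmax of a list of floats."""
    m = max(xs)
    log_z = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_z for x in xs]

def cross_entropy_from_activations(xs, target):
    """Stand-in loss: applies log-softmax internally (assumed behaviour)."""
    return -log_softmax(xs)[target]

def argmax(xs):
    """Index of the largest element (assumed to be what accuracy compares)."""
    return max(range(len(xs)), key=lambda i: xs[i])

logits = [2.0, -1.0]  # hypothetical Dense_2 output for one example
target = 0

once = log_softmax(logits)   # what a final LogSoftmax layer would emit
twice = log_softmax(once)    # what the loss would compute on top of that

loss_raw = cross_entropy_from_activations(logits, target)
loss_after_layer = cross_entropy_from_activations(once, target)

# Applying log-softmax a second time leaves the values unchanged.
print(all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(once, twice)))
# The loss is the same with or without the extra LogSoftmax layer.
print(math.isclose(loss_raw, loss_after_layer, abs_tol=1e-12))
# The predicted class is the same for raw logits and log-probabilities.
print(argmax(logits) == argmax(once))
```

All three checks print True. If the assumptions hold for the real layers, the final LogSoftmax would be redundant for the loss but harmless, and the accuracy would not depend on whether it sees logits or log-probabilities.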