WeightedCategoryCrossEntropy in C3 W1

borowis · May 12, 2023, 6:39pm

Hey folks! In the assignment we build the following model:

Serial[
  Embedding_9088_256
  Mean
  Dense_2
  LogSoftmax
]

However, when we train it we use WeightedCategoryCrossEntropy. If you look up the description of its input, it says:

A batch of activation vectors. The components in a given vector should be pre-softmax activations (mappable to a probability distribution via softmax). For performance reasons, the softmax and cross-entropy computations are combined inside the layer.

So I believe we shouldn’t be using LogSoftmax as the last layer, since the WeightedCategoryCrossEntropy loss already computes softmax internally?

However, I think WeightedCategoryAccuracy, which is used in EvalTask, does require a probability distribution, so I find the current implementation rather confusing. Please advise.
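
On the accuracy question, a small NumPy sketch may help. It assumes the accuracy metric picks the index of the largest activation as the predicted class (worth confirming in the Trax documentation); the `accuracy` and `log_softmax` helpers are hypothetical stand-ins, not Trax functions. Under that assumption, raw activations, log-probabilities, and probabilities all give the same accuracy, because log-softmax and exp preserve the ordering within each row.

```python
import numpy as np

def log_softmax(x):
    # Numerically stable log-softmax over the last axis.
    shifted = x - x.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def accuracy(activations, targets):
    # ASSUMPTION: the predicted class is the index of the largest activation.
    # This is a stand-in for illustration, not the Trax implementation.
    return float((activations.argmax(axis=-1) == targets).mean())

# Toy batch of 3 examples with 2 classes (made-up numbers).
logits = np.array([[2.0, -1.0], [0.3, 0.9], [-0.5, -0.2]])
targets = np.array([0, 1, 0])

acc_logits = accuracy(logits, targets)                        # pre-softmax activations
acc_log_probs = accuracy(log_softmax(logits), targets)        # LogSoftmax output
acc_probs = accuracy(np.exp(log_softmax(logits)), targets)    # probabilities

print(acc_logits, acc_log_probs, acc_probs)
```

All three values match, so an argmax-based accuracy would not by itself require a probability distribution as input.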

arvyzukai · May 15, 2023, 10:26am

Hi @borowis

Good question.
I believe you are correct: the LogSoftmax might be a remnant of some course updates (when CrossEntropyLoss was deprecated).
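
To see what the extra layer does to the loss, here is a minimal NumPy sketch. It assumes a loss that applies softmax internally, as the quoted description says; `cross_entropy_from_activations` is a hypothetical stand-in and not the Trax implementation. Since log-softmax only subtracts a per-row constant and softmax ignores such shifts, feeding LogSoftmax outputs into this kind of loss gives the same value as feeding raw activations.

```python
import numpy as np

def log_softmax(x):
    # Numerically stable log-softmax over the last axis.
    shifted = x - x.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def cross_entropy_from_activations(activations, targets):
    # Hypothetical stand-in for a loss that combines softmax and
    # cross-entropy: it normalizes the activations itself.
    log_probs = log_softmax(activations)
    return -log_probs[np.arange(len(targets)), targets].mean()

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 2))   # batch of 4, 2 classes, as in Dense_2
targets = np.array([0, 1, 1, 0])   # made-up labels

loss_without = cross_entropy_from_activations(logits, targets)
loss_with = cross_entropy_from_activations(log_softmax(logits), targets)

# The two losses agree up to floating-point error.
same_loss = bool(np.isclose(loss_without, loss_with))
print(same_loss)
```

Under these assumptions the final LogSoftmax is redundant rather than harmful: training behaves the same, and the layer only adds a small amount of extra computation.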
