I am receiving the following error when trying to train the model. Could someone help me figure out where I should look? The assignment provides the hint: "For `model.compile()`

the ground truth labels from the training set are passed to the model as **integers** (i.e. 0 or 1) as opposed to one-hot encoded vectors." What does this hint imply? I used `binary_crossentropy` as my loss function. Do I need to do something special to convert my one-hot encoded training labels to integers? Thanks.

```
ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))
```
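For context, here is a minimal sketch of why the shapes clash, using plain NumPy with made-up values rather than the actual assignment code:

```python
import numpy as np

# Hypothetical model output: two output neurons, so one row of two
# class probabilities per sample -> shape (batch, 2)
preds = np.array([[0.9, 0.1],
                  [0.2, 0.8]])

# Integer ground-truth labels, one per sample -> shape (batch, 1)
labels = np.array([[0], [1]])

# Binary cross-entropy is computed element-wise, so it needs predictions
# and labels of the same shape -- here they differ, which is exactly
# what the ValueError reports: (None, 2) vs (None, 1).
print(preds.shape, labels.shape)  # (2, 2) (2, 1)
```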

Hello there,

The `ValueError` means there is a shape mismatch between the predictions and the labels. The labels are binary integers, while the predictions are a vector of probabilities, so you need to find the settings for `model.compile()` that can accommodate comparing the two. Read through the available losses to find the right one.


Thanks for your reply. However, I don't think the mismatched shapes can be resolved by the choice of loss function or by how `model.compile()` is configured.

Our model eventually outputs 2 neurons: [x x]. However, our ground truth label is a single integer, 0 or 1. No matter how I configure `model.compile()`, I still get a shape mismatch between the prediction and the ground-truth label.

Exploring the losses documentation, I believe I should use `BinaryCrossentropy`, and `from_logits` has to be `False`. The other parameter is `reduction`. However, `reduction` acts on the ground-truth and predicted labels at the same time; I cannot reduce the dimension of only the predicted label.
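On the point about `reduction`: a plain NumPy analogy (not the Keras internals, and the loss values here are made up) shows that reduction only aggregates the per-sample losses after they are computed, so it cannot fix a shape mismatch between labels and predictions:

```python
import numpy as np

# Per-sample binary cross-entropy losses, already computed (made-up values).
per_sample_losses = np.array([0.10, 0.25, 0.05])

# 'reduction' in Keras losses controls how these per-sample values are
# aggregated -- e.g. the mean over the batch. It never reshapes the
# labels or the predictions themselves.
reduced = per_sample_losses.mean()
print(reduced)  # mean of the three per-sample losses
```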

Please let me know what I missed or what I misunderstood. I really have no idea what to do right now.

Thanks.


You should definitely keep at it and find the solution yourself. My answer above is still valid, though; obviously I cannot give you the solution, but keep trying.


Thanks. I figured it out.

I think the loss function used in the "C3_W4_Lab_1_FashionMNIST_CAM" lab is the correct one to use here. It works for me.
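For anyone reading later: with a two-neuron output and integer labels, a sparse categorical loss uses each label as an index into the predicted probability vector, so the label and prediction shapes do not need to match. A minimal NumPy sketch of that idea (the arrays here are made up, and this is an illustration rather than the Keras implementation):

```python
import numpy as np

# Two-neuron softmax output: one probability per class, rows sum to 1.
preds = np.array([[0.9, 0.1],
                  [0.2, 0.8]])

# Integer labels (0 or 1), as the hint describes -- no one-hot encoding.
labels = np.array([0, 1])

# Sparse categorical cross-entropy: pick the predicted probability of the
# true class for each sample, then take -log. Labels of shape (batch,)
# work directly against predictions of shape (batch, 2).
per_sample = -np.log(preds[np.arange(len(labels)), labels])
# per_sample is [-log(0.9), -log(0.8)], roughly [0.1054, 0.2231]
```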


What confused me is that the losses in `do_salience` and `model.compile()` are not the same.