Invalid output shape required by the grader?

Taranovski_Alex · July 15, 2023, 8:13am

Hi Everyone,

I’m trying to do the assignment for course 2, week 4.

I am quite confused about the setup.
The data shape requirements are:

Expected Output:

Training images has shape: (27455, 28, 28) and dtype: float64
Training labels has shape: (27455,) and dtype: float64
Validation images has shape: (7172, 28, 28) and dtype: float64
Validation labels has shape: (7172,) and dtype: float64

The output label seems to be defined as an actual letter ‘a’ - ‘z’ (or a number 1-26).

I can’t use that as is to define the network.
I one-hot encoded the training and validation labels into 26 classes and gave the last layer of the network 26 units:
…
training_categorical_labels = to_categorical(training_labels, num_classes=26)
…
tf.keras.layers.Dense(26, activation='softmax')
…
[snippet removed by mentor]

As a result, I get decent training/validation accuracy, but the grader then reports:

Failed test case: your model could not be used for inference. Details shown in ‘got’ value below:.
Expected:
no exceptions,
but got:
in user code:

File "/opt/conda/lib/python3.7/site-packages/keras/engine/training.py", line 1366, in test_function  *
    return step_function(self, iterator)
File "/opt/conda/lib/python3.7/site-packages/keras/engine/training.py", line 1356, in step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/opt/conda/lib/python3.7/site-packages/keras/engine/training.py", line 1349, in run_step  **
    outputs = model.test_step(data)
File "/opt/conda/lib/python3.7/site-packages/keras/engine/training.py", line 1306, in test_step
    y, y_pred, sample_weight, regularization_losses=self.losses)
File "/opt/conda/lib/python3.7/site-packages/keras/engine/compile_utils.py", line 201, in __call__
    loss_value = loss_obj(y_t, y_p, sample_weight=sw)
File "/opt/conda/lib/python3.7/site-packages/keras/losses.py", line 141, in __call__
    losses = call_fn(y_true, y_pred)
File "/opt/conda/lib/python3.7/site-packages/keras/losses.py", line 245, in call  **
    return ag_fn(y_true, y_pred, **self._fn_kwargs)
File "/opt/conda/lib/python3.7/site-packages/keras/losses.py", line 1665, in categorical_crossentropy
    y_true, y_pred, from_logits=from_logits, axis=axis)
File "/opt/conda/lib/python3.7/site-packages/keras/backend.py", line 4994, in categorical_crossentropy
    target.shape.assert_is_compatible_with(output.shape)

ValueError: Shapes (None, 1) and (None, 26) are incompatible

Is there an actual error in the setup, or should I use some custom or advanced code or layers?

balaji.ambresh · July 15, 2023, 9:10am

The number of neurons in the output layer should equal the number of classes in a multi-class classification problem.
As for the loss function: if you explicitly one-hot encode the labels, then categorical cross-entropy is correct. There is also a variant of that loss function that accepts the true labels as integers instead of one-hot vectors.
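
To illustrate the two loss variants mentioned above, here is a minimal pure-Python sketch, not Keras code. The two helper functions are simplified stand-ins written for this example; they only borrow the names of the Keras losses `categorical_crossentropy` and `sparse_categorical_crossentropy`. The 4-class probabilities are made up.

```python
import math

def categorical_crossentropy(one_hot, probs):
    """Cross-entropy when the true label is a one-hot vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def sparse_categorical_crossentropy(label, probs):
    """Cross-entropy when the true label is a single class index."""
    return -math.log(probs[int(label)])

# Hypothetical softmax output for one sample over 4 classes.
probs = [0.1, 0.2, 0.6, 0.1]

label = 2.0                     # class index stored as a float, like the float64 labels
one_hot = [0.0, 0.0, 1.0, 0.0]  # the same label, one-hot encoded

loss_dense = categorical_crossentropy(one_hot, probs)
loss_sparse = sparse_categorical_crossentropy(label, probs)
```

Both calls compute the same quantity; only the expected label format differs. In both cases the network still outputs one probability per class. An error such as `Shapes (None, 1) and (None, 26) are incompatible` is consistent with a model compiled with the one-hot variant being evaluated on labels that arrive as one integer per sample.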

Taranovski_Alex · July 15, 2023, 8:12pm

Basically, my question is: in this assignment, do you expect a single integer value as the output of the network (1)?
Or do you expect a single array of ones and zeros of size 26 (2)?
I did the second one, and the grader seems to expect the first one.

If you are suggesting that the output should be a single integer, how does that relate to the earlier requirement to convert the data to float64?

Sergey_Demchenko · July 15, 2023, 8:59pm

There’s actually a trick with the week 4 assignment of C2: it requires us to use a different image generator method (.flow instead of .flow_from…), and that method requires additional preprocessing of the labels that is not mentioned anywhere in the course or in the assignment.
You actually have to convert the labels to arrays of 23 zeros and a single one, using LabelBinarizer from the sklearn package, so that the generator passes them in the correct form.
Probably because of that, the assignment doesn’t get a 100% score, even though my model goes above 99% on the training dataset and 95% on the validation dataset.
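
The kind of label encoding described above can be sketched in pure Python. This is not scikit-learn’s implementation: `binarize_labels` is a hypothetical helper that mimics the one-column-per-observed-class behaviour of `LabelBinarizer`, and the labels are made up. That behaviour would explain 24-wide vectors (23 zeros and a single one) if only 24 distinct label values occur in the data, which is an assumption the thread does not confirm.

```python
def binarize_labels(labels):
    """One-hot encode labels with one column per distinct value that occurs."""
    classes = sorted(set(labels))
    column = {c: i for i, c in enumerate(classes)}
    rows = []
    for label in labels:
        row = [0] * len(classes)
        row[column[label]] = 1
        rows.append(row)
    return classes, rows

# Hypothetical labels: values 0..5, where 3 never occurs.
labels = [0.0, 1.0, 2.0, 4.0, 5.0, 2.0]
classes, encoded = binarize_labels(labels)

# Five distinct values were seen, so each row has five columns, not six.
width = len(encoded[0])
```

By contrast, `to_categorical(labels, num_classes=26)` always produces 26 columns indexed by the label value itself, so the two encodings can differ in width when some label values never occur.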

metasfer0us · July 15, 2023, 10:46pm

Balaji.ambresh is right. There absolutely is a loss function that lets the model use a single integer label instead of an array of 0s and a 1.

metasfer0us · July 15, 2023, 10:56pm

As far as I can tell, the requirements do specifically mention that the number of model outputs should equal the number of categories.
float64 can preserve an integer value exactly if the conversion is done properly. My first attempt resulted in values that were some multiples of 10^(-312).
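
The post does not say what went wrong in that first attempt. One plausible cause (an assumption) is that the raw integer bytes were reinterpreted as float64 instead of the values being converted. The standard-library sketch below shows the difference for a single made-up label; in NumPy terms it is roughly the difference between `astype` and `view`.

```python
import struct

label = 7  # hypothetical integer class label

# Value conversion: the number is preserved exactly.
converted = float(label)

# Bit reinterpretation: the 8 bytes of the int64 are read as if they
# were a float64, which yields a tiny subnormal number instead of 7.0.
reinterpreted = struct.unpack("<d", struct.pack("<q", label))[0]
```

After a proper conversion the label is still a whole number, so it can be used as a class index even though its dtype is float64.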

Sergey_Demchenko · July 16, 2023, 8:50am

Just to clarify - the shape of the labels that passes the grader requirements is easily achievable, but it’s NOT the shape of the labels that the model will accept (of course, the activation function is changed to the one required for multi-class labelling).
There’s an additional step that has to be applied to the labels BEFORE you pass them to generators. It’s a tricky one.

Taranovski_Alex · July 17, 2023, 3:35am

If that is all “tricky” and “not mentioned anywhere in the course” and “there exists a loss function that does this thing”, can I ask for more specific hints? Are there any examples I can find somewhere else? It is funny that we are learning neural networks, but there are no examples available.

Taranovski_Alex · July 17, 2023, 4:19am

Thanks Everyone for the hints!

Sergey_Demchenko · July 17, 2023, 7:45pm

I suppose we can’t explicitly share the answers here, but actually my approach was not the best (as I just realized from the following course). There IS an activation function that works well with the desired output and it was not covered in the course. The idea is most probably to make learners do some digging in the documentation for themselves. It’s pretty easy though, just search for implementations of dense layers for multi-categorical labels.
