Input and output dimensions of the test model used in the assignment

I am confused by the following part of the code used in the assignment:

## TEST CODE:

import tensorflow as tf

def base_model():
    # each input sample is a vector of 2 features
    inputs = tf.keras.layers.Input(shape=(2))
    x = tf.keras.layers.Dense(64, activation='relu')(inputs)
    outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    return model

test_model = base_model()

test_image = tf.ones((2,2))
test_label = tf.ones((1,))
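
For what it's worth, a quick shape check (my addition, not part of the assignment code) confirms that the model produces one output per sample:

# the model maps a (batch_size, 2) input to a (batch_size, 1) output
print(test_model(test_image).shape)   # (2, 1)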

What I understand from base_model is that it requires each input sample to be a 2D vector, i.e. 2 features, so a batch of inputs has shape (batch_size, 2). The “test_image” is a (2,2) matrix, i.e. two samples, each with value [1,1], so batch_size=2. However, “test_label” is defined as a 1D vector with just one entry of value 1, i.e. a single label=1. This is why I am confused. Shouldn’t there be 2 labels in “test_label”, corresponding to the batch_size=2 of the inputs? When I tried giving two labels, I got an error:

# if I do
test_label = tf.ones((2,1))
InvalidArgumentError: Received a label value of 1 which is outside the valid range of [0, 1).  Label values: 1 1 [Op:SparseSoftmaxCrossEntropyWithLogits]

Could you please let me know where I am going wrong?

Many thanks 🙂

Good catch, @Krunal_Gedia!

This is a tricky one, but it boils down to the fact that SparseCategoricalCrossentropy() is intended specifically for situations where there are two or more label classes, as described in the documentation for tf.keras.losses.SparseCategoricalCrossentropy (TensorFlow Core v2.9.1).
But the model used in the test code effectively sets the number of classes to 1 by passing 1 as the units argument to the final Dense layer in base_model.

I did a little experimenting, and it looks like if you pass a y_pred of shape (n,1), it flips it to shape (1,n). I’m not sure whether this is a bug or a feature, but it may be intentional: since the function specifically expects 2 or more class values in each y_pred row, if it gets only one, it may assume you accidentally transposed the tensor, and then it “corrects” it for you. In any case, it’s atypical behavior.
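
Here’s a minimal sketch of that experiment, calling the loss function directly (observed with TF 2.9; the internals may change between versions):

import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

# y_pred as produced by Dense(1): a single "class" column per sample
y_pred = tf.constant([[0.3], [0.7]])   # shape (2, 1)

# A single label runs fine: the (2,1) predictions appear to be treated
# as one row of 2 class scores, effectively shape (1,2), so a label
# value of 1 falls in the valid range [0, 2)
print(loss_fn(tf.ones((1,)), y_pred))

# Two labels fail: each row is now taken to have only 1 class, so the
# only valid label value is 0, and label value 1 raises the
# InvalidArgumentError quoted above
# print(loss_fn(tf.ones((2, 1)), y_pred))   # InvalidArgumentError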

As a test, you can get the typical behavior if you change the “1” in the final Dense layer in base_model to a “2”. Then SparseCategoricalCrossentropy won’t flip y_pred, and you can change test_label to tf.ones((2,)), i.e. one label per sample, which is exactly what you expected.
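
Concretely, the changed test would look something like this (my sketch, not the official assignment code; with 2 classes a softmax would be the more conventional final activation, but the only change needed to see the expected behavior is the number of units):

import tensorflow as tf

def base_model():
    inputs = tf.keras.layers.Input(shape=(2))
    x = tf.keras.layers.Dense(64, activation='relu')(inputs)
    # 2 units = 2 classes, which is what SparseCategoricalCrossentropy expects
    outputs = tf.keras.layers.Dense(2, activation='sigmoid')(x)
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    return model

test_model = base_model()

test_image = tf.ones((2, 2))
test_label = tf.ones((2,))   # one label per sample in the batch

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
print(loss_fn(test_label, test_model(test_image)))   # runs without error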

I’ll submit a ticket for the developers to ask them to update this test code to use 2 or more classes for the final layer so we don’t hit this weird anomaly in SparseCategoricalCrossentropy. It will be confusing for any future students who look at this as carefully as you did.