Number of neurons in prediction layer

Hi,
I was experimenting with the number of neurons in the prediction layer for the week 2 ungraded lab notebook, in line with Exercise 4.

When I use fewer neurons than the number of classes in the Fashion MNIST data, the model is not able to give a proper prediction, but when I pass in more neurons than the number of classes, the model is still able to accurately predict the class.

So I suppose my question is: why does it work even though I have more neurons in the prediction layer than there are classes in the dataset? And would I ever get a case where the model predicts a label that does not exist in the actual dataset?

import tensorflow as tf
import numpy as np  # needed for np.argmax below

# load the data
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()

# normalize the image data
normalized_x_train = x_train / 255.0
normalized_x_test = x_test / 255.0

# define the model
exercise_4_model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                               tf.keras.layers.Dense(units=512, activation=tf.keras.activations.relu),
                                               tf.keras.layers.Dense(units=1024, activation=tf.keras.activations.relu),
                                               tf.keras.layers.Dense(units=30, activation=tf.keras.activations.sigmoid)]) # more neurons (30) than there are classes (10) in the dataset

# compile the model
exercise_4_model.compile(optimizer=tf.keras.optimizers.Adam(),
                         loss=tf.keras.losses.SparseCategoricalCrossentropy())
                         # metrics=[tf.keras.metrics.Accuracy()]  <- purposefully commented out to get past the error when calculating accuracy
                         # (tf.keras.metrics.SparseCategoricalAccuracy() works with the integer labels used here)

# train the model
history = exercise_4_model.fit(normalized_x_train, y_train, epochs=5)

# evaluate the model
results = exercise_4_model.evaluate(normalized_x_test, y_test)

print(f"trained model loss on the test data: {results:.2f}")

predicted_probability = exercise_4_model.predict(normalized_x_test)
predicted_class = np.argmax(predicted_probability[60])
print(f"predicted probability distribution: {predicted_probability[60]}")
print(f"predicted_class: {predicted_class}, actual class: {y_test[60]}")

There’ no harm in having more neurons than necessay for a classification problem. The downsides include waste of computational resources and ofcourse, confusing the reader on why number of output neurons don’t match the number of labels.

Once your model is trained, it will not predict high values for labels it was never trained on.
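
You can check this on your own model by looking at how much probability mass the extra units (indices 10-29 here, since Fashion MNIST has 10 classes) ever receive on the test set. A minimal sketch, reusing the predicted_probability array from the code above:

import numpy as np

# Fashion MNIST labels are 0-9, so units 10-29 are the "extra" neurons
extra_unit_scores = predicted_probability[:, 10:]
print(f"highest score any extra unit receives: {extra_unit_scores.max():.4f}")

# count how often an extra unit wins the argmax, i.e. a non-existent label is predicted
predicted_labels = np.argmax(predicted_probability, axis=1)
print(f"predictions above class 9: {(predicted_labels >= 10).sum()} out of {len(predicted_labels)}")

If the model has trained well, both numbers should stay small, because the loss never rewards those units.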

Please read up on the cross-entropy loss function.
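
The key point is that sparse categorical cross entropy for one example is roughly -log of the (normalised) score the model assigns to the true label, so minimising it drives every other unit, including the 20 extra ones, toward zero. A tiny illustration with made-up numbers (not outputs of the model above):

import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

# invented 12-unit prediction: the true class (index 1) gets 0.9,
# the two "extra" units (indices 10 and 11) get tiny scores
y_pred = tf.constant([[0.01, 0.9, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.005, 0.005]])
y_true = tf.constant([1])

print(loss_fn(y_true, y_pred).numpy())  # about 0.105, i.e. -log(0.9)

Only the score at the true label's index enters the loss directly; the extra units matter only through the normalisation, which pushes them down during training.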

You can use as many neurons in the output layer as you want, but there should be at least as many as the number of classes.