How does the fit function work?

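import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# X_train and y_train come from the lab's data-loading step.
model = Sequential(
    [
        Dense(25, activation='relu'),
        Dense(15, activation='relu'),
        Dense(4, activation='softmax'),  # < softmax activation here
    ]
)
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    optimizer=tf.keras.optimizers.Adam(0.001),
)

model.fit(
    X_train, y_train,
    epochs=10,
)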

Above is the code snippet from the C2_W2_SoftMax lab. However, I could not understand how the fit function works. y has a shape of (2000,), and each entry is 0, 1, 2 or 3, but the output layer has 4 units, and it seems that the function automatically maps the labels 0, 1, 2, 3 to those 4 units.

I tried shifting y to the range 1 to 4 (y += 1), and now fit reports an error. I am wondering how fit actually works.

Hello @yug030!

The mapping is done by the loss function rather than by fit itself. There are two implementations of the cross-entropy loss: tf.keras.losses.SparseCategoricalCrossentropy and tf.keras.losses.CategoricalCrossentropy.

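The first one requires the label y for each sample to be an integer class index, so the labels must start from 0 and run up to (number of classes - 1). With 4 output units the valid labels are 0, 1, 2 and 3, and label i refers to output unit i. After y += 1 the label 4 has no matching unit, which is why fit fails.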

The second one requires the label y for each sample to be a one-hot array whose length equals the number of classes. For example, with 4 classes, class 2 is written as [0, 0, 1, 0].
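To make the difference between the two label formats concrete, here is a minimal pure-Python sketch. It is not TensorFlow's actual implementation: the helper names (sparse_cce, cce, one_hot) and the probability values are made up for illustration. It assumes each row of probs is a softmax output over 4 classes.

```python
import math

def sparse_cce(probs, labels):
    """Mean of -log(probs[i][labels[i]]); labels are integer class indices."""
    return sum(-math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def one_hot(labels, num_classes):
    """Turn integer class indices into one-hot rows."""
    return [[1.0 if k == y else 0.0 for k in range(num_classes)] for y in labels]

def cce(probs, targets):
    """Mean of -sum_k targets[i][k] * log(probs[i][k]); targets are one-hot rows."""
    return sum(
        -sum(t_k * math.log(p_k) for t_k, p_k in zip(t, p))
        for p, t in zip(probs, targets)
    ) / len(targets)

# Hypothetical softmax outputs for 3 samples and 4 classes.
probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.1, 0.2, 0.6],
]
labels = [0, 1, 3]  # integer labels, starting from 0

loss_sparse = sparse_cce(probs, labels)
loss_onehot = cce(probs, one_hot(labels, 4))
print(loss_sparse, loss_onehot)  # the two values agree

# Shifting the labels (y += 1) produces the index 4, which no output unit has.
shifted = [y + 1 for y in labels]
try:
    sparse_cce(probs, shifted)
    shifted_ok = True
except IndexError:
    shifted_ok = False
print("shifted labels accepted:", shifted_ok)
```

In this sketch the integer label is simply used as an index into the row of predicted probabilities, so no explicit mapping step is needed. The same idea explains why labels outside 0 to 3 cannot work with a 4-unit output layer.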

Documentation for the first one: