TensorFlow Training Out of Memory and "ValueError: Shapes (None, None, None, 7) and (None, None) are incompatible"

I’m working on a CNN for skin lesion classification. I’m using the ISIC2018 dataset, and my code for fitting the model is below:

classification.load_weights('classification.h5') #reset weights
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
classification.compile(optimizer=optimizer, loss="sparse_categorical_crossentropy", metrics=["binary_accuracy", 'MeanSquaredError', 'AUC']) 

callback_list = [tf.keras.callbacks.EarlyStopping(patience=2)]
batchsize=2500
spe = 50 #steps per epoch
epochs = 80
class_train = r"classi/ISIC2018_Task3_Training_Input/ISIC2018_Task3_Training_Input/" 
class_train_gt = pd.read_csv("classi/ISIC2018_Task3_Training_GroundTruth/ISIC2018_Task3_Training_GroundTruth/ISIC2018_Task3_Training_GroundTruth.csv")
organize_images_to_classes(class_train_gt, class_train) #A function I wrote which organizes the images to folders of their corresponding labels

for i in range(epochs+1):
    train_ds = datagen.flow_from_directory(directory=class_train, target_size=(256,256), batch_size=batchsize)
    print(len(train_ds))

    history = classification.fit(x=train_ds, batch_size=len(train_ds), callbacks=callback_list, steps_per_epoch=spe, verbose=1)

    print(f"--------------- Done epoch {i} -----------------")

classification.save_weights("final_class.h5")

Running this produced the following output:

Found 9104 images belonging to 7 classes.
4
2024-01-05 12:37:52.266460: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 1966080000 exceeds 10% of free system memory.

This is my code for creating my model:

def classi(input_shape):
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(64, 3, padding="same")(inputs)
    x = layers.Activation("relu")(x)
    x = layers.BatchNormalization()(x)
    #classi layers
    for filters in [128, 256, 512]:
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.Activation("relu")(x)
        x = layers.BatchNormalization()(x)

        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.Activation("relu")(x)
        x = layers.BatchNormalization()(x)

        x = layers.MaxPool2D(3, strides=2, padding="same")(x)

    #output
    output = layers.Dense(7, activation=None)(x)

    model = k.Model(inputs=inputs, outputs=output, name="classification")
    return model

classification = classi((256,256,3))
classification.summary()

classification.save_weights("classification.h5")

However, after about 50 seconds of running the kernel, I got this error message: ValueError: Shapes (None, None, None, 7) and (None, None) are incompatible. A warning was also printed saying that an allocation exceeded 10% of free system memory (shown above).

I tried setting batchsize to 2500 (expecting that each call to .fit would then train the model on only 4 images at a time) to reduce the resources used at once, but the message kept appearing.

I am also uncertain of what the error message means.

Please help, thanks!

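The problem is in how the final Dense layer produces its output: it is applied directly to the 4-D feature map coming out of the last pooling layer, so the model's output has shape (None, height, width, 7) instead of one prediction vector per image.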

The expected output shape is (batch size, number of classes), which is (None, 7) in your case.
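To see where the extra dimensions in the error come from, here is a rough shape trace for the model in the question. It is only a sketch in plain Python (no TensorFlow needed). It assumes the (256, 256, 3) input from the question and uses the facts that a stride-2 pooling layer with padding="same" outputs ceil(n / 2) along each spatial axis, and that Dense acts only on the last axis. The helper names are hypothetical.

```python
import math


def same_pool_out(size, stride=2):
    # With padding="same", the output length is ceil(input / stride).
    return math.ceil(size / stride)


def trace_output_shape(height=256, width=256, num_pools=3, units=7):
    # The Conv2D layers use padding="same" with stride 1, so they keep the
    # spatial size; only the three MaxPool2D layers shrink it.
    for _ in range(num_pools):
        height, width = same_pool_out(height), same_pool_out(width)
    # Dense is applied to the last axis only, so the spatial axes remain.
    return (None, height, width, units)


print(trace_output_shape())  # (None, 32, 32, 7)
```

So the model emits a 32x32 grid of 7-value vectors per image rather than a single 7-value vector, which is why the shapes do not line up with the labels.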

One way to fix your code is to add layers.Flatten() before the final Dense layer; please see this link for an example of layers.Flatten() in a model architecture.
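As a minimal sketch of that fix, the model from the question could be rebuilt with a Flatten layer in front of the final Dense layer. The function name classi_flat and the num_classes parameter are made up for this example; everything else follows the layers in the question. This is one possible arrangement, not the only one.

```python
import tensorflow as tf
from tensorflow.keras import layers


def classi_flat(input_shape, num_classes=7):
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(64, 3, padding="same")(inputs)
    x = layers.Activation("relu")(x)
    x = layers.BatchNormalization()(x)

    # Same convolutional blocks as in the question.
    for filters in [128, 256, 512]:
        for _ in range(2):
            x = layers.Conv2D(filters, 3, padding="same")(x)
            x = layers.Activation("relu")(x)
            x = layers.BatchNormalization()(x)
        x = layers.MaxPool2D(3, strides=2, padding="same")(x)

    # Collapse the (height, width, channels) feature map into one vector
    # per image so that Dense yields (batch size, num_classes).
    x = layers.Flatten()(x)
    outputs = layers.Dense(num_classes, activation=None)(x)

    return tf.keras.Model(inputs=inputs, outputs=outputs, name="classification")


model = classi_flat((256, 256, 3))
print(model.output_shape)  # (None, 7)
```

Note that weights saved from the original architecture will not match this one, so classification.h5 would need to be saved again from the new model before load_weights is called.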

That worked, thanks!
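On the memory warning from the question: the allocation size in the log can be reproduced with simple arithmetic. The sketch below assumes the generator yields float32 images (4 bytes per value, the usual Keras default), which the thread does not state explicitly. The helper name batch_bytes and the batch size of 32 are only illustrative.

```python
import math


def batch_bytes(batch_size, height=256, width=256, channels=3, bytes_per_value=4):
    # Memory needed to hold one batch of images as a dense array.
    return batch_size * height * width * channels * bytes_per_value


num_images = 9104   # "Found 9104 images" in the log
batch_size = 2500   # batchsize in the question

# Number of batches per pass over the data; this is what len(train_ds) printed.
print(math.ceil(num_images / batch_size))  # 4

# One batch of 2500 images matches the allocation reported in the warning.
print(batch_bytes(batch_size))  # 1966080000

# A smaller, illustrative batch size needs far less memory per step.
print(batch_bytes(32))  # 25165824
```

In other words, batch_size=2500 means 2500 images per batch (4 batches in total), not 4 images at a time, so lowering batch_size is what reduces the memory used per step.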