OOM when allocating tensor: solved locally with a smaller batch size

If you run “C3_W1_Lab_2_Transfer_Learning_CIFAR_10” on your local machine and hit an “out-of-memory” (OOM) runtime error, reduce the batch size from 64 (I did not see this error on Colab, even with batch size 64):

This will take around 20 minutes to complete.

EPOCHS = 4
history = model.fit(
    train_X,
    training_labels,
    epochs=EPOCHS,
    validation_data=(valid_X, validation_labels),
    # batch_size=64,  # OOM when allocating tensor with shape[64,256,14,14] and type float on
    batch_size=8,
)
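If you would rather keep the larger batch size, another setting that sometimes helps on a local GPU is TensorFlow's memory growth option, which allocates GPU memory on demand instead of grabbing it all up front. This is a general TensorFlow configuration, not part of the lab, and it may not be enough if the batch genuinely does not fit in memory; it is just a sketch of something worth trying:

import tensorflow as tf

# Ask TensorFlow to grow GPU memory usage on demand rather than
# reserving all of it at startup. Must run before any GPU tensors
# are created (e.g. at the top of the notebook).
gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)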


Hello James,

Are you asking a question, or offering a solution to an issue you encountered? It is a bit unclear what you want to state. This exercise is part of an ungraded lab. If you are asking why there is no error with batch_size 64: here we are checking transfer learning on preprocessed data with pre-trained weights and a batch size of 64 (the weights were trained with a batch size of 32), which is why there is no OOM error. Basically, we are checking how different batch sizes train a model with the pre-trained ResNet50 weights.
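
For anyone reading along, the rough shape of this kind of setup looks something like the sketch below. This is not the lab's exact code; the input shape, the frozen base, and the small classification head are placeholders for illustration only:

import tensorflow as tf

# Minimal transfer-learning sketch: ResNet50 with ImageNet weights used as a
# feature extractor, plus a small classification head for CIFAR-10.
base = tf.keras.applications.ResNet50(
    include_top=False,        # drop the original ImageNet classifier
    weights='imagenet',       # reuse the pre-trained convolutional weights
    input_shape=(224, 224, 3),
)
base.trainable = False        # keep the pre-trained weights fixed in this sketch

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation='softmax'),  # 10 CIFAR-10 classes
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# The batch_size passed to model.fit() only controls how many examples are
# processed per training step; the pre-trained weights themselves are the same
# regardless of the batch size you choose.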

Regards
DP

I am providing the solution to an issue I encountered on my laptop: when I ran with batch size 64, I got an out-of-memory exception. The solution was to reduce the batch size during training; the results were effectively the same. That’s all. This is just a piece of useful information for anyone who may hit the same issue.

By the way, Colab with GPU is slower than my laptop with GPU, which is why I prefer my own system.

Yes, that has been the same for me too; I also use my laptop with GPU rather than Colab with GPU.

Regards
DP

Thank you for sharing.