OOM when allocating tensor: solved locally with a smaller batch size

If you run “C3_W1_Lab_2_Transfer_Learning_CIFAR_10” on your local machine and hit an “out-of-memory” (OOM) runtime error, reduce the batch size from 64 (I did not see this error on Colab, even with batch size 64):

This will take around 20 minutes to complete.

EPOCHS = 4
history = model.fit(
    train_X,
    training_labels,
    epochs=EPOCHS,
    validation_data=(valid_X, validation_labels),
    # batch_size=64,  # OOM when allocating tensor with shape[64,256,14,14] and type float on
    batch_size=8,
)
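If you would rather keep the larger batch size, another setting that sometimes helps on a local GPU is TensorFlow's memory growth option, which allocates GPU memory on demand instead of grabbing it all up front. This is a general TensorFlow configuration, not part of the lab, and it may not be enough if the batch genuinely does not fit in memory; it is just a sketch of something worth trying:

import tensorflow as tf

# Ask TensorFlow to grow GPU memory usage on demand rather than
# reserving all of it at startup. Must run before any GPU tensors
# are created (e.g. at the top of the notebook).
gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)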


Hello James,

Are you asking a question, or offering a solution to an issue you encountered? It is a bit unclear what you want to state. This exercise is part of an ungraded lab. If you are asking why there is no error with batch_size 64: here we are checking transfer learning on preprocessed data with pre-trained weights and a batch size of 64 (the weights were trained with a batch size of 32), which is why there is no OOM error. Basically, we are checking how different batch sizes train a model with the pre-trained ResNet50 weights.
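
For anyone reading along, the rough shape of this kind of setup looks something like the sketch below. This is not the lab's exact code; the input shape, the frozen base, and the small classification head are placeholders for illustration only:

import tensorflow as tf

# Minimal transfer-learning sketch: ResNet50 with ImageNet weights used as a
# feature extractor, plus a small classification head for CIFAR-10.
base = tf.keras.applications.ResNet50(
    include_top=False,        # drop the original ImageNet classifier
    weights='imagenet',       # reuse the pre-trained convolutional weights
    input_shape=(224, 224, 3),
)
base.trainable = False        # keep the pre-trained weights fixed in this sketch

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation='softmax'),  # 10 CIFAR-10 classes
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# The batch_size passed to model.fit() only controls how many examples are
# processed per training step; the pre-trained weights themselves are the same
# regardless of the batch size you choose.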

Regards
DP

I am providing the solution to an issue I encountered on my laptop: when I ran with batch size 64, I got an out-of-memory exception. The solution was to reduce the batch size during training; the results were effectively the same. That’s all. This is just a piece of useful information for anyone who may hit the same issue.

By the way, Colab with GPU is slower than my laptop with GPU, which is why I prefer my own system.

Yes, that has been the same for me too; I also use my laptop with GPU rather than Colab with GPU.

Regards
DP

Thank you for sharing.