Why: RuntimeError: CUDA error: out of memory

FEROMONTALVO · June 19, 2022, 5:26pm

I’m testing my work for Course 1, Week 1, and after a lot of testing without errors, this error now appears in a part of the code that used to work well:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-5-219e6ba6ddf4> in <module>
     10 test_get_noise(1000, 100, 'cpu')
     11 if torch.cuda.is_available():
---> 12     test_get_noise(1000, 32, 'cuda')
     13 print("Success!")

<ipython-input-5-219e6ba6ddf4> in test_get_noise(n_samples, z_dim, device)
      1 # Verify the noise vector function
      2 def test_get_noise(n_samples, z_dim, device='cpu'):
----> 3     noise = get_noise(n_samples, z_dim, device)
      4 
      5     # Make sure a normal distribution was used

<ipython-input-4-198c4eb1d8d6> in get_noise(n_samples, z_dim, device)
     13     # argument to the function you use to generate the noise.
     14     #### START CODE HERE ####
---> 15     return  torch.randn(n_samples, z_dim, device = device)
     16     #### END CODE HERE ####

RuntimeError: CUDA error: out of memory

I restarted my PC, but I get the same result.
Does anyone know what the problem is?

paulinpaloalto · June 19, 2022, 10:00pm

That usually just means you got unlucky and the particular VM your notebook happens to be running on in the cloud was overloaded at that time. You can just try again. If it keeps repeating, try doing “Save” and then close and reopen the notebook and try again. If that still doesn’t work, just take a break and come back and try again in an hour.

FEROMONTALVO · June 20, 2022, 1:06pm

I don’t know what the problem is, but one day later I still have the same problem.

paulinpaloalto · June 20, 2022, 3:47pm

It sounds like my guess about the cause was wrong. Are you sure that you don’t have an infinite loop in your code or that you haven’t done something else that causes the memory footprint of your notebook to grow to be extremely large (e.g. printing debugging output in a tight loop somewhere)?

You can estimate your notebook’s memory footprint by downloading it (“File → Download as notebook (ipynb)”) and then looking at the file size. You can find the base size by doing “Kernel → Restart and Clear Output” and then downloading it again. In some of the GANs assignments, the training generates quite a bit of memory growth.
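As a rough sketch of the size comparison described above, assuming both copies of the notebook have already been downloaded to your machine (the file names and helper functions here are hypothetical, not part of the course materials), the two sizes can be compared in Python:

```python
import os


def size_mb(path):
    """Return the size of the file at `path` in mebibytes (MiB)."""
    return os.path.getsize(path) / (1024 * 1024)


def output_growth_mb(full_path, cleared_path):
    """Size difference between the notebook with outputs and the cleared copy."""
    return size_mb(full_path) - size_mb(cleared_path)


if __name__ == "__main__":
    # Hypothetical file names: replace them with the files you downloaded.
    full = "notebook_with_output.ipynb"  # downloaded as-is
    cleared = "notebook_cleared.ipynb"   # downloaded after "Restart and Clear Output"
    if os.path.exists(full) and os.path.exists(cleared):
        print(f"With output: {size_mb(full):.2f} MiB")
        print(f"Cleared:     {size_mb(cleared):.2f} MiB")
        print(f"Growth:      {output_growth_mb(full, cleared):.2f} MiB")
    else:
        print("Download both copies of the notebook first.")
```

A large difference between the two numbers suggests that stored cell output, rather than the code itself, accounts for most of the notebook's size.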

Saumya_Thakur · June 29, 2022, 6:39pm

Facing a similar error. I tried to kill the processes shown in nvidia-smi but still get the same ‘out of memory’ error. @FEROMONTALVO, did you manage to resolve it?
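For anyone who wants to check GPU memory from inside the notebook rather than from a terminal, here is a minimal sketch. It assumes the `nvidia-smi` command is on the PATH of the machine running the notebook, and the helper names `parse_gpu_memory` and `query_gpu_memory` are made up for this example:

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' lines (values in MiB) into (used, total) int tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        try:
            used, total = (int(field) for field in fields)
        except ValueError:
            # Skip lines that are not two integers (e.g. "[N/A]" values).
            continue
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Return a list of (used_mib, total_mib) per GPU, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    memory = query_gpu_memory()
    if memory is None:
        print("nvidia-smi is not available on this machine.")
    else:
        for index, (used, total) in enumerate(memory):
            print(f"GPU {index}: {used} MiB used of {total} MiB")
```

If a GPU already shows most of its memory in use before your notebook has allocated anything, that would be consistent with the overloaded-VM explanation given earlier in the thread, and it may be why even a small tensor fails to allocate.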

OmidS · July 4, 2022, 1:40am

I am having the same issue. Running this cell locally returns “Success”. I think it must be related to the cloud system the notebook is running on.

sharob.sinha · July 4, 2022, 8:09pm

Hi @OmidS, can you please “Refresh the workspace” and re-submit the solution? (Please make sure to save your work before refreshing.)

Thank you,
Sharob

OmidS · July 4, 2022, 8:27pm

Thanks. I will try that for the next assignment. I realized that if the code is correct (which I can verify by running it locally), the submission will go through and be graded even though I can’t run it on the cloud. Not the best workaround, though.
