Why: RuntimeError: CUDA error: out of memory

I’m testing my work for Course 1, Week 1. After a lot of testing without errors, this error now appears in a part of the code that worked fine before:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-5-219e6ba6ddf4> in <module>
     10 test_get_noise(1000, 100, 'cpu')
     11 if torch.cuda.is_available():
---> 12     test_get_noise(1000, 32, 'cuda')
     13 print("Success!")

<ipython-input-5-219e6ba6ddf4> in test_get_noise(n_samples, z_dim, device)
      1 # Verify the noise vector function
      2 def test_get_noise(n_samples, z_dim, device='cpu'):
----> 3     noise = get_noise(n_samples, z_dim, device)
      4 
      5     # Make sure a normal distribution was used

<ipython-input-4-198c4eb1d8d6> in get_noise(n_samples, z_dim, device)
     13     # argument to the function you use to generate the noise.
     14     #### START CODE HERE ####
---> 15     return  torch.randn(n_samples, z_dim, device = device)
     16     #### END CODE HERE ####

RuntimeError: CUDA error: out of memory

I restarted my PC, but I get the same result.
Does anyone know what the problem is?

That usually just means you got unlucky and the particular VM your notebook happens to be running on in the cloud was overloaded at that time. You can just try again. If it keeps repeating, try doing “Save” and then close and reopen the notebook and try again. If that still doesn’t work, just take a break and come back and try again in an hour.
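Before retrying, it can also help to release whatever GPU memory your own process is still holding. Here is a small sketch of a best-effort cleanup (the function name `free_gpu_memory` is mine, not from the assignment; it assumes PyTorch is installed and is a no-op otherwise):

```python
import gc

def free_gpu_memory():
    """Best-effort cleanup before retrying after a CUDA OOM error."""
    gc.collect()  # drop unreferenced Python objects first
    try:
        import torch
        if torch.cuda.is_available():
            # Return cached, unused blocks to the driver so other
            # processes on the shared VM can use them
            torch.cuda.empty_cache()
            return True
    except ImportError:
        pass
    return False

free_gpu_memory()
```

Note that `empty_cache()` only frees memory PyTorch has cached but is no longer using; it cannot reclaim memory held by tensors you still reference, or by other users' processes on an overloaded cloud VM.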

I don’t know what the problem is, but one day later I’m still having the same problem.

It sounds like my guess about the cause was wrong. Are you sure that you don’t have an infinite loop in your code or that you haven’t done something else that causes the memory footprint of your notebook to grow to be extremely large (e.g. printing debugging output in a tight loop somewhere)?

You can figure out your memory size by downloading the notebook (“File → Download as notebook (ipynb)”) and then looking at the file size. You can find the base size by doing “Kernel → Restart and Clear Output” and then downloading it again. In some of the GANs assignments, the training generates quite a bit of memory growth.
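The before/after comparison described above can be done with a couple of lines of standard-library Python once you have both downloads (the function name and file paths here are illustrative, not from the course):

```python
import os

def notebook_size_mb(path):
    """Return the on-disk size of a downloaded .ipynb file in megabytes."""
    return os.path.getsize(path) / (1024 * 1024)

# Hypothetical usage: compare the notebook as downloaded with output,
# versus after "Kernel -> Restart and Clear Output":
# grown = notebook_size_mb("assignment_with_output.ipynb")
# base = notebook_size_mb("assignment_cleared.ipynb")
# print(f"Output accounts for roughly {grown - base:.1f} MB")
```

A large gap between the two sizes usually points to accumulated cell output (e.g. debug printing in a training loop) rather than the model itself.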

Facing a similar error, tried to kill the processes shown in nvidia-smi and still obtaining the same ‘out of memory’ error. @FEROMONTALVO did you manage to resolve it?

I am having the same issue. Running this cell locally returns “Success!”, so I think it must be related to the cloud system the notebook is running on.

Hi @OmidS, can you please “Refresh the workspace” and re-submit the solution. (Please make sure to save your work before refreshing).

Thank you,
Sharob

Thanks. I will try that for the next assignment. I realized that if the code is correct, the submission will go through and can be graded even when I can’t run it on the cloud (I can verify correctness by running it locally). Not the best workaround, though.