You cannot currently connect to a GPU due to usage limits in Colab

I can’t complete the final assignment of the course because I have run out of GPU time.

I haven’t done anything excessive; just run the labs and completed assignments for this course.

Any idea how long I have to wait before I get more GPU time?

I had to run this for 4.5 hours with no acceleration but lost everything because I got back too late and it timed out.

I went to bed and woke up and now, after 17 hours, I still have no GPU acceleration.

I’ve started another 4.5-hour run. Hopefully, I’ll catch this one before it times out.

I guess the moral of the story is don’t burn through the course too quickly because Google might revoke your GPU privileges. One of the warning signs seems to be that Google Colab starts asking you whether you are a robot.

EDIT: GPU access was restored during my second run at this. So I restarted it with GPU and completed the assignment.

To answer my original question: it took about 18 hours for my GPU privileges to come back.


How long does it take to train 80-ish epochs on a GPU?

I just ran it and was getting about 40 seconds per epoch.


I'd love to know how many epochs it took to pass the grader. I, too, have been put in GPU jail and am hoping to get to completion without it. Right now it is taking about 4 minutes per epoch :angry:

I passed with 60 epochs.


Appreciate the prompt reply. My non-GPU run is at 35 heading to 60. Hope the network is correct and I don’t have to rerun!
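For anyone budgeting a run like this, the per-epoch timings quoted in this thread (about 40 seconds on GPU, about 4 minutes without) give a rough estimate of the time left. A minimal sketch, assuming every epoch takes roughly the same time; the function name is made up for illustration:

```python
def remaining_minutes(done_epochs, target_epochs, seconds_per_epoch):
    """Rough time left in minutes, assuming a constant time per epoch."""
    remaining = max(target_epochs - done_epochs, 0)
    return remaining * seconds_per_epoch / 60


# Without GPU: at epoch 35, heading to 60, about 4 minutes per epoch.
print(remaining_minutes(35, 60, 4 * 60))  # 100.0

# With GPU: a full 60 epochs at about 40 seconds per epoch.
print(remaining_minutes(0, 60, 40))  # 40.0
```

So the last 25 epochs without a GPU come to roughly 100 minutes, versus roughly 40 minutes for a whole 60-epoch run with one.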

I feel your pain. Good luck!

@ai_curious

If you simulate a validation set with higher metrics, odds are good that you can stop training just about at the right time (could be below 60). Here’s the outline:

  1. At the end of each epoch, use the generator to create a fresh set of images from new noise of the same shape you trained the GAN on.
  2. Use the discriminator to predict, for each generated image, its confidence that the image is real.
  3. Pick a confidence threshold and a target fraction, say 65% and at least 50% of the generated images. Stop training once at least that fraction of images is scored as real at or above the threshold.
  4. For submission, pick the images that fooled the discriminator, in descending order of the discriminator predictions from step 2.
  5. You might want to save your weights so you can continue training later.
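The stopping rule and the image selection in the steps above can be sketched with plain NumPy. This is only an illustration under some assumptions: the discriminator is assumed to output a probability in [0, 1] that an image is real (if yours outputs logits, apply a sigmoid first), and the function names `should_stop` and `pick_most_convincing` are made up for this example rather than taken from the course code.

```python
import numpy as np


def should_stop(scores, threshold=0.65, min_fraction=0.5):
    """True when at least `min_fraction` of the scores reach `threshold`.

    `scores` are assumed to be discriminator probabilities in [0, 1]
    for a fresh batch of generated images.
    """
    scores = np.asarray(scores, dtype=float).reshape(-1)
    return float(np.mean(scores >= threshold)) >= min_fraction


def pick_most_convincing(images, scores, threshold=0.65, k=None):
    """Return images scoring at least `threshold`, best first (at most k)."""
    scores = np.asarray(scores, dtype=float).reshape(-1)
    order = np.argsort(-scores)               # indices, highest score first
    keep = order[scores[order] >= threshold]  # drop images below the cutoff
    if k is not None:
        keep = keep[:k]
    return np.asarray(images)[keep], scores[keep]


# Toy data standing in for one epoch's generated batch and its scores.
# In a real training loop these would come from your own generator and
# discriminator models (not shown here).
fake_images = np.arange(4).reshape(4, 1)
fake_scores = np.array([0.9, 0.7, 0.4, 0.66])

if should_stop(fake_scores):
    best_images, best_scores = pick_most_convincing(fake_images, fake_scores, k=2)
    print(best_scores)  # [0.9 0.7]
```

In a real loop you would call `should_stop` once per epoch on a freshly generated batch, and break out of training (after saving weights) when it returns `True`.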

It would be good to keep in mind that higher cutoffs could require more epochs. Hope this helps.


Now that I have completed the Specialization I may go back and tinker with some of the code to do things more scientifically. Your point about collecting discriminator confidence is a good one - certainly a better approach than eyeballing it, which I realized after the first submission doesn’t work very well at all. Luckily the grader provided that information in its feedback, and I was able to meet the threshold and finish the course and Specialization within the 1-week trial period. In a commercial situation I would no doubt be more careful about resource utilization and implement an early stopping rule, but for these programming assignments I was typically minimizing a different parameter. Cheers