Help my lab runtime is crashing repeatedly

Help, my lab runtime is crashing repeatedly.
Have enough resources been allocated for this lab?

I am getting this message repeatedly

Are you choosing the right options for compute in the beginning?

Thank you @gent.spah ,

The instructor in the video did say not to worry about that; everything is already set up.
In the first lab, that was not a problem.

In the second lab, everything seemed okay the first time I ran the model. When I tried to run the code a second time, memory and disk space seemed to be critically low. I tried the default settings, and in one instance I tried changing the compute size to a bigger machine, but that did not go well either.

Were there specific options that I missed?
Also, why was the kernel not active and running?

Just out of curiosity, if I want to run this in Google Colab, can I download the LLMs from somewhere else without having to use the “aws cp” command?

Regards

This is only a suggestion, since I don't have information on the inner workings of the labs. I would guess that some kind of memory reset takes place after some time in the labs at AWS, so my suggestion is to try again after a while, one day at most. What do you think, @esanina? (I am also mentoring this course, by the way.)

Thanks, @gent.spah , I will wait a day and try again. In full disclosure, it ran the first time, but I could not get it to run again without crashing.

One more question: I have access to Google Colab. If I want to run this there, do you have any suggestions for removing the scripts that are specific to AWS? I tried to google the name of the language library, but I couldn’t find much.

@gent.spah , your suspicions were correct. I went into my personal AWS account.
I tried to use the notebook. The problem was very obvious. The settings, for some reason, were not sticking, and it took me some 30-40 mins to make the right env profile stay.

@ajeancharles the code should be runnable multiple times, but please make sure that you use the correct kernel and instance type every time. If the code is crashing for some other reason, please let me know which lab it is and what your Coursera username/email is - we’ll try to find out what is going on.
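
If it helps, here is a minimal sketch for checking from inside the notebook whether the machine you actually got matches the instance type you selected. It uses only the Python standard library and assumes a Linux-based notebook instance. The helper name `resource_summary` is made up for illustration, and the numbers should be compared against whatever the lab instructions specify, which is not stated in this thread.

```python
import os
import shutil


def resource_summary(path="."):
    """Return CPU count, total RAM (GiB), and free disk space (GiB).

    Hypothetical helper for illustration; assumes a POSIX system (e.g. Linux).
    """
    gib = 1024 ** 3
    # Total physical memory = page size * number of physical pages.
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    # Disk usage of the filesystem that contains `path`.
    disk = shutil.disk_usage(path)
    return {
        "cpus": os.cpu_count(),
        "ram_total_gib": round(page_size * page_count / gib, 1),
        "disk_free_gib": round(disk.free / gib, 1),
    }


print(resource_summary())
```

Running this in the first cell, before loading the model, should show quickly whether the session has fallen back to a smaller machine than the one you picked.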

Thank you, @esanina; I suspect the environment profile kept defaulting to a less powerful environment. I decided to run the code in my own AWS session, and after a few attempts at reassigning the right instance, I got the notebook to stick to the appropriate profile, and the problem disappeared.

Many regards, and thank you for your consideration.

@ajeancharles thank you for the message, I am glad to hear that. Happy Learning!

Always happy and learning, @esanina !
Regards!