Llama 3.2 from Hugging Face in Google Colab

Hi,

I am trying to run the Llama 3.2 11B Vision model from Hugging Face in Google Colab. However, while downloading the checkpoints, Colab runs out of memory. I have also tried the same thing on Kaggle, but the issue persists. Is there a way (hopefully free :face_with_open_eyes_and_hand_over_mouth:) to achieve this?

Regards.


@abrar39 I haven't tried what you are attempting myself, but in Sharon's excellent short course she discusses at one point 'streaming' the model rather than trying to load it all at once.

That might work.
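
If it helps, here is a rough sketch of what that low-memory loading path looks like with Transformers. The model ID, the Mllama class, and the 4-bit settings are my assumptions, not something from the course:

```python
# Sketch: load Llama 3.2 11B Vision without materializing full-precision
# weights in RAM. device_map="auto" places layers on the GPU and offloads
# the rest to CPU/disk, and 4-bit quantization cuts the weight footprint
# roughly 4x versus fp16.
# (The meta-llama repo is gated; accept the license and log in with an HF token.)
import torch
from transformers import (
    AutoProcessor,
    BitsAndBytesConfig,
    MllamaForConditionalGeneration,
)

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"  # assumed model ID

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",       # shard across GPU/CPU instead of one big load
    low_cpu_mem_usage=True,  # stream weights in rather than all at once
)
processor = AutoProcessor.from_pretrained(model_id)
```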

@abrar39

If you are using Kaggle, I suggest adding the model to your notebook directly on Kaggle instead of downloading it from Hugging Face.

Check out this notebook:
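
As a rough sketch of that route (the kagglehub handle below is a guess on my part; copy the exact one from the model's Kaggle page):

```python
# Sketch: pull the weights from Kaggle's model hub instead of Hugging Face.
# The handle string is an assumption; the real one is shown on the model page.
import kagglehub

path = kagglehub.model_download(
    "metaresearch/llama-3.2-vision/transformers/11b-vision-instruct"
)
print(path)  # local directory containing the checkpoint files

# Alternatively, if you attach the model through the notebook's "Add Input"
# panel, it appears read-only under /kaggle/input/ with no download at all.
```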

In addition, if you are trying to fine-tune, Unsloth is a good option. They have sample Colab notebooks you can use, though I haven't seen one for the 11B model. The 1B model has a Colab notebook that runs fast and is memory- and beginner-friendly.
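
For reference, the Unsloth path for the 1B model looks roughly like this (the repo name and LoRA settings here are illustrative, not copied from their notebook):

```python
# Sketch: memory-friendly fine-tuning setup for the 1B model with Unsloth.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct",  # assumed repo name
    max_seq_length=2048,
    load_in_4bit=True,  # 4-bit weights keep this well within a free T4
)

# Attach LoRA adapters so only a small fraction of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)
```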


Thank you for the response. I have used the 1B version in Colab and it runs without problems. However, the generative ability of the 1B version is not very "interesting", to say the least. The Kaggle option seems more viable, so I shall use it.


Thank you. I shall definitely follow the guidelines in the course.

Try with an A100 GPU; it does not work with the default GPU.
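
The rough arithmetic backs this up: 11B parameters at 2 bytes each (fp16/bf16) is about 22 GB of weights alone, which already exceeds the 16 GB of the default T4, while an A100 (40 GB) fits them with room left for activations. With 4-bit quantization the weights drop to roughly 5.5 GB, which is why the quantized route can work on free tiers.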


This works for me. Thank you very much. :fist_left:
