CTransformers uses CPU as well as GPU for a model that should fit in VRAM

Hi everyone!

I’m hoping to tap into the wisdom of the crowd.

  1. I’m using ctransformers to load various GGUF models (7B Q4/Q5) that I have downloaded locally. I’ve installed ctransformers[cuda] and the other NVIDIA GPU dependencies needed on Windows, and when I run inference my GPU is definitely being used (a minimal sketch of my loading code is below).
    The models should all fit into 8 GB of VRAM, but when I run inference my CPU also spikes to 100%.
    I have gpu_layers set to the maximum, and there is definitely a speed improvement over gpu_layers=0 (CPU only).
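For reference, this is roughly how I’m loading the model. It’s only a sketch: the model filename and the gpu_layers value are placeholders, not my exact settings.

```python
from ctransformers import AutoModelForCausalLM

# Load a locally downloaded GGUF model and offload layers to the GPU.
# Path and gpu_layers are example values.
llm = AutoModelForCausalLM.from_pretrained(
    "./models/mistral-7b-instruct-v0.1.Q4_K_M.gguf",  # local GGUF file
    model_type="mistral",
    gpu_layers=50,  # high enough to cover every layer of a 7B model
)

# Running a prompt like this is when I see both GPU activity and the CPU spike.
print(llm("Write a haiku about GPUs:"))
```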

Why is my CPU also being used?
BTW, if I use the LangChain GPT4All binding with device='gpu', then it only uses my GPU; the CPU doesn’t spike at all.
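For comparison, here is a minimal sketch of that GPT4All setup. I’m assuming the current langchain_community import path, and the model path is just an example.

```python
from langchain_community.llms import GPT4All

# Same local GGUF file, but loaded through LangChain's GPT4All binding.
llm = GPT4All(
    model="./models/mistral-7b-instruct-v0.1.Q4_0.gguf",  # local GGUF file
    device="gpu",  # run inference on the GPU instead of the CPU
)

# With this setup, only the GPU is busy during generation.
print(llm.invoke("Write a haiku about GPUs:"))
```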

PS: It just occurred to me to download a tiny model to make sure there is ample VRAM available and see if the CPU still spikes, just to remove any doubt about VRAM space.

Thanks in advance!

I did try the smallest model and still noticed CPU use alongside the GPU, so it wasn’t a memory-size issue.

As I wrote in another post:

when I tested with LangChain’s llama.cpp binding there was no additional CPU usage. So I’m ditching CTransformers and settling on llama.cpp for my local work.
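Here is a minimal sketch of the llama.cpp setup I settled on via LangChain. The model path, n_gpu_layers, and n_ctx values are examples, not exact settings.

```python
from langchain_community.llms import LlamaCpp

# Same GGUF model, loaded through llama-cpp-python via LangChain.
llm = LlamaCpp(
    model_path="./models/mistral-7b-instruct-v0.1.Q4_K_M.gguf",  # local GGUF file
    n_gpu_layers=-1,  # -1 offloads all layers to the GPU
    n_ctx=2048,       # context window size
)

# With this setup I see no extra CPU spike during generation.
print(llm.invoke("Write a haiku about GPUs:"))
```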

If anyone reading this knows of any benefits that LangChain’s CTransformers binding offers over its llama.cpp binding (or anything better than llama.cpp), I’d love to hear about it.