I did try the smallest model, and I still noticed CPU usage alongside the GPU, so it wasn't a memory-size issue.
As I wrote in another post:
when I tested with LangChain's llama.cpp integration there was no extra CPU usage. So I'm ditching CTransformers and settling on llama.cpp for my local work.
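For anyone who wants to reproduce the comparison, here's a rough sketch of the two setups I tested side by side. The model path, layer counts, and prompt are placeholders, not my exact configuration, and I'm assuming the `langchain_community` wrappers; adjust for your install:

```python
# Sketch of the comparison: same local GGUF model loaded through both
# LangChain wrappers. Paths and layer counts are placeholders.
from langchain_community.llms import CTransformers, LlamaCpp

# CTransformers: the setup where I saw CPU usage alongside the GPU.
ct_llm = CTransformers(
    model="/path/to/model.gguf",   # placeholder: any local GGUF file
    model_type="llama",
    config={"gpu_layers": 50},     # offload layers to the GPU
)

# llama.cpp: same model, no extra CPU usage in my test.
lc_llm = LlamaCpp(
    model_path="/path/to/model.gguf",  # same placeholder model
    n_gpu_layers=-1,                   # -1 offloads all layers to the GPU
    n_ctx=2048,
)

print(ct_llm.invoke("Say hello in one sentence."))
print(lc_llm.invoke("Say hello in one sentence."))
```

Running both while watching a CPU monitor (e.g. htop) and nvidia-smi is how I spotted the difference.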
If anyone reads this and knows of any benefits that LangChain's CTransformers offers over its llama.cpp integration (or anything better than llama.cpp), I'd love to know.