Langchain Ctransformer GPU performance

Finally managed to build llama-cpp-python with GPU support on Windows. It was an absolute nightmare: many people report similar problems, with different variations of suggested fixes that work for some and not for others. My own solution was never mentioned anywhere, which is why it took me so long and why I ended up exploring other Python GPU options along the way.
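For reference, this is the commonly suggested build route (a sketch of the usual advice, not my actual fix, which I haven't seen documented): pass CUDA flags to CMake during the pip install. Note the flag name depends on the llama-cpp-python version.

```shell
# Commonly suggested GPU build of llama-cpp-python (a sketch, not my fix).
# PowerShell on Windows:
#   $env:CMAKE_ARGS = "-DGGML_CUDA=on"   # older releases used -DLLAMA_CUBLAS=on
#   $env:FORCE_CMAKE = "1"
#   pip install llama-cpp-python --no-cache-dir --force-reinstall
# Equivalent from bash / Git Bash:
CMAKE_ARGS="-DGGML_CUDA=on" FORCE_CMAKE=1 \
  pip install llama-cpp-python --no-cache-dir --force-reinstall
```

If this doesn't work for you, you're in the same boat I was: check which CUDA toolkit and compiler CMake is actually picking up.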

Langchain llama.cpp fully loads the model into the GPU and executes it there. It's the fastest of the options I've tried so far (though to be fair, Langchain GPT4All is not far behind), and most importantly it doesn't touch the CPU.
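A minimal sketch of what full GPU offload looks like with the Langchain llama.cpp wrapper; the model path is hypothetical, and `n_gpu_layers=-1` asks llama.cpp to offload every layer to the GPU (only effective with a GPU-enabled build):

```python
# Sketch of full GPU offload via Langchain's LlamaCpp wrapper.
# The model path below is hypothetical.
llm_kwargs = {
    "model_path": "./models/llama-2-7b-chat.Q4_K_M.gguf",  # hypothetical path
    "n_gpu_layers": -1,   # -1 = offload all layers to the GPU
    "n_ctx": 2048,        # context window size
    "n_batch": 512,       # prompt tokens processed per batch on the GPU
}

# from langchain_community.llms import LlamaCpp
# llm = LlamaCpp(**llm_kwargs)
# print(llm.invoke("Name three planets."))
```

With all layers offloaded, CPU usage during generation should stay near idle, which is exactly the behavior I'm seeing.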

So I don’t know why Langchain CTransformers uses the CPU as well as the GPU. Or, better said, why the ctransformers library uses both the CPU and GPU, since I believe the Langchain CTransformers wrapper relies on ctransformers for its core functionality.
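For comparison, this is roughly how GPU offload is configured for ctransformers (and hence for the Langchain wrapper, which passes a `config` dict through). The model repo name is hypothetical; `gpu_layers` is the knob that is supposed to move layers onto the GPU, yet in my tests the CPU stayed busy regardless:

```python
# Sketch of GPU offload configuration for ctransformers.
# The gpu_layers value and model repo below are illustrative assumptions.
config = {
    "gpu_layers": 50,        # number of layers to offload to the GPU
    "context_length": 2048,  # context window size
}

# Native ctransformers:
# from ctransformers import AutoModelForCausalLM
# llm = AutoModelForCausalLM.from_pretrained(
#     "TheBloke/Llama-2-7B-Chat-GGUF",  # hypothetical model repo
#     model_type="llama",
#     gpu_layers=config["gpu_layers"],
# )

# Via Langchain:
# from langchain_community.llms import CTransformers
# llm = CTransformers(model="TheBloke/Llama-2-7B-Chat-GGUF",
#                     model_type="llama", config=config)
```

Even with a high `gpu_layers` value, I still see significant CPU load, which is the behavior I can't explain.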

So I can’t recommend Langchain CTransformers for GPU use over Langchain GPT4All, or better yet Langchain llama.cpp. Or you can use their native, non-Langchain versions.