Seeking advice on lightweight Language Models for offline application development

Hey community,

I’m working on a project to build a language-model-based app that can be installed on both Windows and macOS. The goal is for it to run completely offline, without needing an internet connection.

Ideally, after installation, the app should run without requiring any extra dependencies. It’s meant to work on regular laptops from the past few years, so think 9th Gen Intel® CPUs or newer, 8GB of RAM or more, and no need for a dedicated GPU.

The app should be able to read a few files (up to 5) in formats like txt, pdf, ppt, and xlsx. It doesn’t need to hold long conversations with users, generate images, or write code, but it should understand the content of the files it reads.

Given these requirements, I’m wondering if using smaller language models (with fewer parameters) could be a good option. If yes, what could be the best choices?

I’d really appreciate any insights you can share. Thanks in advance!

Hello @zhucebox,

take a look at models in the GGUF format; they run on plain CPUs (via llama.cpp and its bindings) and the quantized files typically take around 5-10 GB of storage. I’m not entirely sure about the RAM requirements, but 8 GB might be a bit tight for the larger variants. GGUF builds are available for the entire LLaMA family, and I assume for other models as well. There is a trade-off: quantization costs some accuracy. You can also fine-tune your own LLM and then export it to GGUF.
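To sanity-check whether 8 GB of RAM is enough, you can estimate the weight footprint from the parameter count and the quantization level (this is a rough back-of-the-envelope formula I'm assuming, counting weights only; the KV cache and runtime overhead come on top):

```python
def gguf_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB.

    n_params: number of model parameters (e.g. 7e9 for a 7B model)
    bits_per_weight: effective bits per weight of the quantization scheme
    Weights only -- excludes KV cache and inference overhead.
    """
    return n_params * bits_per_weight / 8 / 2**30

# A 7B model at ~4.5 effective bits/weight (typical of 4-bit GGUF quants):
print(round(gguf_size_gib(7e9, 4.5), 1))  # roughly 3.7 GiB
```

By this estimate a quantized 7B model leaves a few GB of headroom on an 8 GB laptop, while 13B-class models would be uncomfortably tight once the OS and KV cache are accounted for, so 7B or smaller seems like the realistic ceiling for your target hardware.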

I have not used the smaller models from other companies, so unfortunately I cannot suggest anything else.

Best