I’ve been able to configure the API to use local documents for its answer responses. However, I noticed it’s quite expensive. Are there more efficient ways to do this?
The cost is really driven by the size of the documents you send with each request. You can run your own “ChatGPT” on an open-source LLM instead, but that demands a capable GPU (roughly 16 GB of VRAM).
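One common way to cut the per-request cost is to stop sending the whole document and instead retrieve only the chunks relevant to the question (the usual retrieval-augmented setup). Here is a minimal sketch of that retrieval step; real systems score chunks with embeddings and a vector store, but simple word overlap stands in here so the example runs without an API key. All names and the sample document are illustrative assumptions, not part of any particular API.

```python
# Sketch: send only the most relevant chunks of a document to the model,
# instead of the full text, so fewer tokens are billed per request.
# Word-overlap scoring is a stand-in for real embedding similarity.

def chunk_text(text, max_words=10):
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def _norm(text):
    """Lowercase and strip trailing punctuation for crude matching."""
    return {w.strip(".,?!").lower() for w in text.split()}

def score(chunk, question):
    """Count question words appearing in the chunk (embedding stand-in)."""
    return len(_norm(chunk) & _norm(question))

def top_chunks(document, question, k=2):
    """Return the k chunks most relevant to the question."""
    chunks = chunk_text(document)
    return sorted(chunks, key=lambda c: score(c, question), reverse=True)[:k]

# Hypothetical sample data for illustration.
document = (
    "Our refund policy allows returns within 30 days of purchase. "
    "Shipping is free for orders over 50 dollars. "
    "Support is available by email around the clock."
)
question = "How many days for a refund?"

context = "\n".join(top_chunks(document, question, k=1))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# The prompt now carries one small chunk rather than the entire document,
# so the token count sent to the API (and hence the cost) drops.
```

With real embeddings the idea is identical: embed the chunks once up front, embed each incoming question, and pay only for the few chunks you actually include in the prompt.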