Add custom data into 3.5-turbo

pavel_agurov · April 30, 2023, 9:03pm

Hi,

How to add custom data (ex. CSVLoader) into code with model 3.5-turbo, for example into ChatCompletion call?

Thanks,
Pavel

Refat · April 30, 2023, 9:13pm

Great topic, subscribing. I thought about the same problem and the thoughts are:

use the system role message to provide the data in the very first prompt in the text format explaining what is this data and how the model should use it;
process the completion output respectively and define the query parameters needed depending on the user’s input, e.g. if the user mentioned some product name and characteristics we can query the database based on this input and based on the response compose another prompt by providing this data as a part of the input.

pavel_agurov · May 1, 2023, 8:06am

Yes, I did exactly as you wrote, it’s more or less ok, but in this case I can’t:

check if no product in the database
if there are not enough about of it
if some product has no parameters
can’t provide choice of materials for example.
So, I can say - as a POC it works, but as production ready solution - not yet.

Refat · May 1, 2023, 8:26am

Then there is option 3: Finetuning.

Kindly share your experience here if You’d try that in your project.

P.S. However, you can only fine-tune InstructGPT (Ada, Davinci etc) models and GPT-3 models, not GPT-3.5 models, as stated in the official OpenAI documentation

Refat · May 1, 2023, 11:56am

Also, as option 4, it is possible to create embeddings vectors from custom data organized as multiple text documents. For searching the information for a user, the GPT API call retrieves the most relevant document using just these embeddings. Then with another prompt, GPT extracts the matching answer from the most relevant document.

pavel_agurov · May 1, 2023, 1:51pm

Finetuning is not a option for such task. It’s good for text, but not for concrete numbers. Numbers can not be probability, they must be equal. You can have answer “it seems we have iPhone on the stock”. The same issuew with converting concrete numbwes/dates/facts into embeddings and search by relevant.

For GPT-3 we can add customer data source, it works with bugs, but ok.

For 3.5 - I can’t find and asked.

Refat · May 2, 2023, 12:13pm

For GPT-3 we can add customer data source, it works with bugs, but ok.

Do you mean the data sources such as CSV, not text embeddings? Could you share more?

pavel_agurov · May 2, 2023, 7:36pm

Yes, exactly. Langchain allows to add it easy.

Topic		Replies	Views
How ( the easiest way) to fine tune a LLM model AI Discussions ai-discussions	0	96	January 15, 2025
Default langchain prompt templates LangChain: Chat with Your Data langchain	1	280	November 8, 2023
Rate Limit error with ChatGPT Prompt Engineering course ChatGPT Prompt Engineering for Developers feedback , introductions	0	23	September 11, 2024
Building your own chatbot AI Discussions ai-discussions	0	167	March 17, 2025
Chatbot Error LangChain: Chat with Your Data feedback , ai-discussions , langchain , project	2	32	November 8, 2024

Add custom data into 3.5-turbo

Related topics