Different embedding models?

By default, all the examples use OpenAI's embeddings, which cost money. Is the quality of their embeddings actually "better" for retrieval? If we use a different embedding model together with an OpenAI LLM, will the results be worse? Thank you.

To create your embeddings database you can use any other embedding library. As long as you use the same embedding model to encode your documents and your queries, search by proximity, and map the matches back to their original text, it will work.

So feel free to experiment with another embedding library: swap out the one that comes with the labs and compare the results.
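Roughly, the swap looks like this (a sketch, not the exact lab code; the model names and the sentence-transformers library are just example choices):

```python
# A rough comparison of the two approaches (a sketch, not the exact lab code).
# OpenAI embeddings via the openai SDK (v1+):
from openai import OpenAI

client = OpenAI()
vec_openai = client.embeddings.create(
    model="text-embedding-3-small",          # assumed model name
    input="Some text to embed",
).data[0].embedding

# A free local alternative, e.g. sentence-transformers (assumed choice):
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model name
vec_local = model.encode("Some text to embed")

# The two vectors have different dimensions and are NOT interchangeable:
# whichever model you pick must encode both your documents and your queries.
```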

Let's say you are building a system that answers questions from a document.

The process is the same:

1. Take your original data.
2. Create embeddings with it.
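
A minimal sketch of these two steps, assuming sentence-transformers and a plain Python list as the "database" (any embedding library and vector store follow the same pattern):

```python
# Sketch of steps 1-2: embed each chunk of your data and keep the vector
# next to its original text. The model name is an assumed example.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Step 1: your original data, already split into chunks.
documents = [
    "Chunk 1 of your document...",
    "Chunk 2 of your document...",
]

# Step 2: one embedding per chunk, stored as (text, vector) pairs.
index = [(text, model.encode(text)) for text in documents]
```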

Then:
3. Take the user input, for example a question.
4. Encode it using the same embedding library.
5. Search for matches (cosine similarity is very common).
6. Gather the results.
7. Decode the matches back to text.
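
Steps 3 to 7 could look like this (again just a sketch, reusing the same assumed model and the (text, vector) index from above; the cosine helper is written by hand here, not a library function):

```python
# Sketch of steps 3-7: encode the question with the SAME model, rank the
# stored chunks by cosine similarity, and keep the text of the best matches.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
index = [(t, model.encode(t)) for t in ["Chunk 1...", "Chunk 2..."]]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Steps 3-4: take the user's question and encode it with the same model.
question = "What does the document say about X?"
q_vec = model.encode(question)

# Steps 5-6: rank the stored chunks by similarity and gather the best ones.
ranked = sorted(index, key=lambda item: cosine(q_vec, item[1]), reverse=True)

# Step 7: "decoding" here just means going back to the stored text.
top_chunks = [text for text, _ in ranked[:3]]
```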

Then:
8. Build your prompt with the decoded text, the user question, and any additional information.
9. Call the LLM.
10. Get the results.
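
And a sketch of steps 8 to 10, assuming the openai Python SDK (v1+); the model name and variable names are illustrative, not a requirement:

```python
# Sketch of steps 8-10: build the prompt from the retrieved context and
# send it to a chat model.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

top_chunks = ["...text retrieved in the previous step..."]
question = "What does the document say about X?"

# Step 8: prompt = retrieved context + user question (+ any extra instructions).
prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n" + "\n".join(top_chunks) + "\n\n"
    "Question: " + question
)

# Steps 9-10: call the LLM and read back the answer.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model; any chat-capable LLM works
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```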

Please try it and share your findings!

NOTE: If you are going to build a solution that needs to scale, you may want to use a dedicated vector database platform. Pinecone, for example, is a very good one.
