After watching these short courses on extracting information from text using LLMs and RAG, I had an idea for my company: build a database to store all our important documents, then build a chatbot that can answer any questions my colleagues ask about them. I want to know whether this is advisable, since these documents are confidential and we don't want the LLM provider using them as training data and potentially leaking them to the public.
Hey Nana. It's a good idea; I also want to implement something like this. Did you get an answer to this question? I am also curious…
Hello @A_Nandhini, you basically replace ChatOpenAI with a local LLM, which you can download using Ollama.
For embeddings, I used "nomic-embed-text":

from langchain_community.embeddings import OllamaEmbeddings

def get_embedding_function():
    embeddings = OllamaEmbeddings(model="nomic-embed-text", show_progress=True)
    return embeddings
For queries, I used phi3:

from langchain_community.llms import Ollama

model = Ollama(model="phi3")
Both are local models you can download with Ollama, so your documents never leave your machine.
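To show how the two pieces fit together, here is a minimal sketch of the answer step. It assumes Ollama is running locally with the "phi3" and "nomic-embed-text" models pulled, and that you have already indexed your documents into a Chroma store at a path like "chroma" (the path, the `k=4` retrieval count, and the helper names are my own choices, not from the course):

```python
def build_prompt(context_chunks, question):
    # Pure helper: join the retrieved chunks into one grounded prompt.
    context = "\n\n---\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

def answer_question(question, db_path="chroma"):
    # Imports kept inside the function so build_prompt has no dependencies.
    from langchain_community.embeddings import OllamaEmbeddings
    from langchain_community.llms import Ollama
    from langchain_community.vectorstores import Chroma

    embeddings = OllamaEmbeddings(model="nomic-embed-text")
    db = Chroma(persist_directory=db_path, embedding_function=embeddings)

    # Retrieve the most relevant chunks, then ask the local model.
    docs = db.similarity_search(question, k=4)
    prompt = build_prompt([d.page_content for d in docs], question)
    model = Ollama(model="phi3")  # runs locally; nothing is sent to a cloud API
    return model.invoke(prompt)
```

Since both the embedding model and the chat model run on your own hardware, the confidentiality concern in the original question goes away: no document text is sent to an external provider.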