I’m just getting started with large language models (LLMs) and have been learning how to use them to create various products. Recently, I was introduced to the concept of Retrieval-Augmented Generation (RAG).
My main question is when to use RAG and whether it’s even necessary. If I understand correctly, one of the reasons for using RAG is because the context window of an LLM is limited. However, if I’m creating a knowledge base chatbot with a limited amount of text that doesn’t exceed the context window, can’t I simply send the entire text to the LLM directly instead of using RAG?
You could consider using RAG if you think there’s a possibility of expanding the knowledge base.
@lukmanaj correct me if I’m wrong, but does expanding the knowledge base mean providing the LLM with knowledge it wasn’t trained on? For example, giving it access to my private documents.
Let’s say I’m using an LLM with a 1M-token context window. Do I really need RAG then?
It depends. For example, RAG makes sense if you have an archive of structured and unstructured data that you would use only for your own use case (a business, a project, etc.).
Your understanding is correct regarding why Retrieval-Augmented Generation (RAG) is used. RAG combines LLMs with retrieval systems to handle large or dynamic knowledge bases, especially when content exceeds the context window or requires continuous updates.
In your case, if your knowledge base is small and fits within the LLM’s context window (even with a 1M token limit), directly passing the full text might be more straightforward and sufficient. However, if you anticipate growth or need to maintain up-to-date knowledge (e.g., adding new documents), RAG can be advantageous for scalable and dynamic retrieval.
Even if the complete document fits within the context window, RAG still has an advantage: your system is more efficient because you don’t need to pass all the information with every query. Ultimately, it depends on your requirements.
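To make the efficiency point concrete, here is a toy sketch comparing the per-query token cost of stuffing the whole knowledge base into the prompt versus retrieving only the most relevant chunk. The documents, the whitespace token count, and the word-overlap "retriever" are all illustrative stand-ins, not a real tokenizer or embedding model.

```python
# Toy comparison: full-context stuffing vs. retrieving top-k relevant chunks.
# Documents and the word-overlap scorer are made up for illustration.

def count_tokens(text: str) -> int:
    # Crude proxy: whitespace-separated words. Real tokenizers count differently.
    return len(text.split())

docs = [
    "Product A supports exports to CSV and PDF formats.",
    "Product B requires a license key for enterprise use.",
    "Product C integrates with most common CRM systems.",
]

def top_k(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Score each document by how many query words it shares.
    q_words = set(query.lower().split())
    ranked = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

query = "Does Product B need a license?"
full_context_cost = sum(count_tokens(d) for d in docs)
rag_cost = sum(count_tokens(d) for d in top_k(query, docs))
print(full_context_cost, rag_cost)  # RAG sends far fewer tokens per query
```

With three documents the savings are trivial, but the gap grows linearly with the knowledge base while the retrieved top-k stays roughly constant.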
@lukmanaj @NightWing correct me if I’m wrong, but I think another big use of RAG has been to try and tamp down on hallucinations?
I.e., rather than having to answer a specific query from the totality of the LLM’s ‘memory’ (i.e., the model weights), it can pull the information from a few, much narrower source documents.
I could… be wrong though.
You’re absolutely correct! One of the significant advantages of using RAG is to reduce hallucinations in LLMs. By relying on a retrieval system to fetch relevant, narrower documents, the model can base its generation on factual, up-to-date information rather than pulling from potentially inaccurate or outdated general knowledge stored in the model. Thanks for bringing this up.
@lukmanaj @Nevermnd thank you.
Can you tell me more about how I can build a robust knowledge-base chatbot?
What are the standard practices?
Large Language Models (LLMs) are trained on vast amounts of data, which influences their ability to generate responses. When you ask general queries, the LLM provides answers based on its training data, which is generally accurate but limited to the knowledge available up to the model’s training cutoff. However, for specific queries related to proprietary information or recent developments, the LLM may not provide accurate or relevant responses due to the lack of up-to-date or specialized data.
In such cases, Retrieval-Augmented Generation (RAG) can be particularly useful. RAG enhances the LLM’s performance by retrieving relevant information from a specific dataset or external sources and integrating this information into the response. This approach allows the LLM to offer more accurate and contextually relevant answers for specialized or current topics.
Therefore, while RAG is not always necessary, it is highly recommended when dealing with queries that require specific, recent, or niche information that the LLM alone might not be able to address effectively.
We are using RAG to build an enterprise chatbot for our sales executives. If they need high-level information on any specific product, instead of reaching out to product experts they can simply ask the chatbot.
We cannot do the same with the context window alone:
- It doesn’t scale
- We cannot afford any false information (hallucinations)
- We cannot keep adding products to the system
Can I use RAG only for text data, or can it be used for tabular data as well?
You can use it for tabular data as well.
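One common approach (not the only one) is to serialize each table row into a short natural-language chunk and then index those chunks like any other document. The table contents below are made up for illustration.

```python
# Serialize table rows into text chunks so a retriever can index them.
# Row data here is hypothetical.

rows = [
    {"product": "Widget A", "price": 19.99, "stock": 120},
    {"product": "Widget B", "price": 34.5, "stock": 0},
]

def row_to_chunk(row: dict) -> str:
    # Keep column names in the text so the retriever can match on them.
    return "; ".join(f"{col}: {val}" for col, val in row.items())

chunks = [row_to_chunk(r) for r in rows]
print(chunks[0])  # "product: Widget A; price: 19.99; stock: 120"
```

For heavily numeric or relational queries (aggregations, joins), many systems instead translate the question into SQL and run it against the table directly, using the LLM only to generate the query and summarize the result.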