Why doesn't the retriever code "kick out" questions that are meaningless in the context of the index?

I compared the embeddings of these sentences:

sentence1 = "lakjsdfljsdaflkjasdlkfj"
sentence2 = "I love to play outside in the summer."
sentence3 = "test"
sentence4 = "I love to eat ice cream in the summer."
sentence5 = "I hate to eat ice cream in the winter."
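
For reference, this is roughly how I built the comparison. The model name here (all-MiniLM-L6-v2) is just what I happened to use locally, not necessarily the embedding model behind the index:

```python
from sentence_transformers import SentenceTransformer, util

sentences = [
    "lakjsdfljsdaflkjasdlkfj",                 # sentence1: gibberish
    "I love to play outside in the summer.",   # sentence2
    "test",                                    # sentence3
    "I love to eat ice cream in the summer.",  # sentence4
    "I hate to eat ice cream in the winter.",  # sentence5
]

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(sentences, convert_to_tensor=True)

# Pairwise cosine similarities: this 5x5 matrix is what the heat map shows.
similarity_matrix = util.cos_sim(embeddings, embeddings)
print(similarity_matrix)
```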

My scenario: I have a chatbot where 1) querying the “big brains” (the LLM) is the most expensive step, and 2) it also takes the most time, as the LLM hems and haws over what it wants to answer, and then who knows what the network quality will be.

Why doesn’t the retriever figure out that the index can’t answer sentence1, and that “test” is probably out of context, and return that feedback before descending into the LLM abyss? The problem I’ve found is that entering these kinds of queries produces weird answers once the LLM wakes up to chat.
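
What I have in mind is a guard like the sketch below, applied before the call to the “big brains”. This is only an illustration of the idea using sentence-transformers; my actual pipeline's retriever is different, and the 0.4 cutoff is a number I made up and would have to tune against real queries:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

# Stand-in for the indexed chunks; the real index has far more content.
indexed_chunks = [
    "I love to play outside in the summer.",
    "I love to eat ice cream in the summer.",
    "I hate to eat ice cream in the winter.",
]
chunk_embeddings = model.encode(indexed_chunks, convert_to_tensor=True)

def should_call_llm(question: str, cutoff: float = 0.4) -> bool:
    """Return False when no indexed chunk is similar enough to the question."""
    query_embedding = model.encode(question, convert_to_tensor=True)
    best = util.cos_sim(query_embedding, chunk_embeddings).max().item()
    return best >= cutoff

for q in ["lakjsdfljsdaflkjasdlkfj", "test", "What do you like to eat in summer?"]:
    verdict = "send to the LLM" if should_call_llm(q) else "reject before the LLM"
    print(f"{q!r} -> {verdict}")
```

With a check like this, the gibberish and "test" queries could be answered with a cheap "I can't answer that from the indexed documents" instead of a slow, expensive LLM round trip.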

I’ve included a heat map of the embedding values: