RetrievalQA does not identify correct context from document

Hi! I greatly enjoyed this course and am excited to get started with building LLM apps. Could someone please offer some guidance?

I’m using the RetrievalQA chain with a retriever created from a vectorstore index, together with the ChatOpenAI LLM and OpenAIEmbeddings.

I am able to query the chain and get many accurate results. However, the model cannot answer some questions that are clearly answered within the documents. When I run the chain in LangChain debug mode, I can see that the chain is not retrieving the correct context from the document.
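To make concrete what I mean by “not retrieving the correct context,” here is a toy, stdlib-only stand-in for the retrieval step. It uses word overlap instead of embeddings and is not LangChain code; the `k` parameter mimics the retriever’s `search_kwargs={"k": ...}` (how many chunks get handed to the LLM):

```python
# Toy mock of vectorstore retrieval: rank chunks by a crude word-overlap
# score (standing in for embedding similarity) and return the top k.
# This is NOT LangChain code -- just an illustration of the failure mode.

def split(text, chunk_size):
    """Split text into fixed-size character chunks (no overlap)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def score(query, chunk):
    """Crude relevance score: fraction of query words present in the chunk."""
    qwords = set(query.lower().split())
    return len(qwords & set(chunk.lower().split())) / len(qwords)

def retrieve(query, chunks, k):
    """Return the top-k chunks by score, like search_kwargs={'k': k}."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

doc = ("Revenue for the quarter was 4.2 billion dollars. "
       "The company opened three new offices. "
       "Gross margin improved year over year.")
query = "What was revenue for the quarter?"

chunks = split(doc, 60)
top2 = retrieve(query, chunks, k=2)
print(any("4.2 billion" in c for c in top2))  # prints True
```

In my real setup the scoring is done by OpenAIEmbeddings, of course, but the shape of the problem is the same: if the answer-bearing chunk doesn’t rank in the top k, the LLM never sees it.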

  • What “knobs” can I use to help the RetrievalQA chain do a better job of identifying the correct context from the document (in my case, a PDF)?
    • I have tried several vectorstores.
    • I uploaded the same document to the tool ChatDOC. ChatDOC is able to identify the correct context (the relevant text is highlighted in their UI) and correctly answer the question.
  • Does anyone have tips for using RetrievalQA to answer questions based on data found within a table?
    • I am looking at a PDF that is a quarterly earnings report. I would like the model to read and understand the financial statements.
    • Note: This is helpful for the next steps of my project; the answer to the question that I’m currently asking using RetrievalQA is written in plain text on the first page of the document. I am not sure why the statement is not being identified as relevant context.
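One theory I’m testing for that last point: if the text splitter cuts through the key sentence, no single chunk contains the whole fact, so similarity to the question drops. A stdlib-only sketch of the idea (the `chunk_size`/`overlap` parameters mimic LangChain’s RecursiveCharacterTextSplitter, but this isn’t library code):

```python
# Toy demo: a fact that straddles a chunk boundary is in no chunk at all,
# while overlapping chunks can keep it intact. Not LangChain code.

def split(text, chunk_size, overlap=0):
    """Fixed-size character chunks; overlap must be < chunk_size."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

fact = "Net income was 1.1 billion dollars"
page1 = "Q3 highlights. " + fact + ". See the tables below for details."

# Without overlap, the fact is cut across two chunks...
no_overlap = split(page1, 40, overlap=0)
print(any(fact in c for c in no_overlap))    # prints False

# ...with overlap, one chunk contains it whole.
with_overlap = split(page1, 40, overlap=25)
print(any(fact in c for c in with_overlap))  # prints True
```

If this is what’s happening to the plain-text statement on page 1, increasing chunk overlap (or chunk size) when splitting the PDF might help — I’d love confirmation from anyone who has debugged the same thing.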