Ignoring data privacy concerns, does caching change the way we should be thinking about RAG? Why should I build RAG infrastructure if the model can cache my documents and then support Q&A across them?
Presumably, at some point the cost of querying the cached documents will justify the investment in RAG.
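To make the cost question concrete, here is a rough back-of-the-envelope sketch of where that break-even point might sit. Everything in it is an assumption for illustration: the function names, the token counts, the per-million-token prices, and the one-off RAG build cost are hypothetical placeholders, not any provider's real pricing. It also ignores cache write fees, cache expiry, output tokens, latency, and context window limits.

```python
def query_cost(input_tokens: int, price_per_million: float) -> float:
    """Dollar cost of sending `input_tokens` at a per-million-token price."""
    return input_tokens / 1_000_000 * price_per_million


def break_even_queries(
    corpus_tokens: int,       # size of the full document set kept in the cache
    retrieved_tokens: int,    # tokens RAG would send per query instead
    cached_price: float,      # hypothetical $ per 1M cached input tokens
    uncached_price: float,    # hypothetical $ per 1M regular input tokens
    rag_fixed_cost: float,    # hypothetical one-off cost of building RAG
) -> float:
    """Queries needed before RAG's per-query saving repays its build cost.

    Returns infinity if RAG is not cheaper per query under these inputs.
    """
    cached_per_query = query_cost(corpus_tokens, cached_price)
    rag_per_query = query_cost(retrieved_tokens, uncached_price)
    saving = cached_per_query - rag_per_query
    if saving <= 0:
        return float("inf")
    return rag_fixed_cost / saving


if __name__ == "__main__":
    # Placeholder numbers only; substitute real prices and corpus sizes.
    n = break_even_queries(
        corpus_tokens=500_000,
        retrieved_tokens=4_000,
        cached_price=0.30,
        uncached_price=3.00,
        rag_fixed_cost=5_000.0,
    )
    print(f"RAG pays for itself after roughly {n:,.0f} queries")
```

Under this simple model the answer depends mostly on corpus size and query volume: a small corpus queried rarely may never reach the break-even point, while a large corpus queried heavily reaches it quickly.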