How do you train RAG models?

So, RAG data is external to the model. Should I take this into account when I train a model? Should I try to retrieve the data RAG would have had available at the moment the human made the prediction, and pass it in as a plain prompt?

Should it work like the mock concept in testing, for example?

Yes, it could indeed be similar to the mock concept in testing, where you simulate real-world conditions to evaluate the model’s performance.
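To make the mock analogy concrete, here is a minimal sketch in Python of one way it could look. It is an illustration under assumptions, not an established recipe: the names `Doc`, `LabeledExample`, `SnapshotRetriever`, and `build_training_prompt` are hypothetical, the `available_from` date field is assumed to exist in your data, and the word-overlap scoring is only a stand-in for a real retriever. The idea is that the mock retriever returns only documents that already existed when the human made the prediction, and the result is placed in the prompt as plain text.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    text: str
    available_from: date  # assumption: each document records when it became available


@dataclass
class LabeledExample:
    question: str
    answer: str          # the prediction the human made
    answered_on: date    # when the human made it


class SnapshotRetriever:
    """Mock retriever that replays the document store as of a given date."""

    def __init__(self, docs):
        self.docs = list(docs)

    def retrieve(self, query, as_of, k=2):
        # Hide everything the human could not have seen at prediction time.
        visible = [d for d in self.docs if d.available_from <= as_of]
        # Toy relevance score: number of shared lowercase words.
        words = set(query.lower().split())
        ranked = sorted(
            visible,
            key=lambda d: len(words & set(d.text.lower().split())),
            reverse=True,
        )
        return ranked[:k]


def build_training_prompt(example, retriever, k=2):
    """Build the plain-text prompt; example.answer is the training target."""
    docs = retriever.retrieve(example.question, as_of=example.answered_on, k=k)
    context = "\n".join(f"- {d.text}" for d in docs)
    return f"Context:\n{context}\n\nQuestion: {example.question}\nAnswer:"


if __name__ == "__main__":
    docs = [
        Doc("the refund policy is 30 days", date(2023, 1, 1)),
        Doc("the refund policy is 14 days", date(2024, 1, 1)),
    ]
    example = LabeledExample("what is the refund policy", "30 days", date(2023, 6, 1))
    print(build_training_prompt(example, SnapshotRetriever(docs)))
```

The date filter is the part that plays the role of a mock: it keeps documents published after the human's prediction out of the training prompt, so the model is not trained on context it would not have had in the original situation.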