In the C1M4 video “Evaluating your LLM performance,” it states that the response is fed to another LLM, which outputs possible (user?) prompts that could have produced that response. In the video, the response being discussed is called the RAG response: does that mean the augmented prompt before it is fed to the LLM, or the final response of the LLM?
Hey @Francesco_Boi, the metric the video is describing there is the Response Relevancy metric. It’s designed to evaluate the final output of the RAG system (i.e., the text the LLM actually generates, not the augmented prompt) by using another LLM (LLM-as-a-judge) to determine how well that final output aligns with the user’s original prompt. If you’d like more info on that, I’d suggest checking out the RAGAS documentation. Does that help?