Ground Truthing LLMs

I’m building a Q&A bot on top of PDFs. Some of its answers are correct, but others are clearly wrong.

What is the best course, process, or package for learning how to ground-truth my bot and improve its accuracy?