Instruction tuning for quiz generation

I have been working on a project that uses an LLM with RAG to build a scalable quiz-generation application. In my first attempt, I used the Llama 2-7B model without fine-tuning. However, this led to hallucinations; the main one was that the model did not follow the quiz-generation procedure in the prompt. This time I am using instruction tuning to make the model better. My main question is: what dataset should the model be tuned on so that it gets better at generating questions? A couple of things to note:

  1. I cannot fine-tune on the course data, as I want the application to be scalable so it works for a course in any field. This is why I am using RAG.
  2. I need to generate both numerical and theory-based questions.
  3. I need to know what structure the dataset should have.

I am a beginner at this stage, so any help is appreciated.
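On point 3, one common structure for instruction-tuning data is the Alpaca-style instruction/input/output record, stored as JSONL. The sketch below is an assumption about how such a dataset could look for quiz generation: the `input` field stands in for a retrieved context passage (mimicking what RAG will supply at inference time), and the field names and example contents are illustrative, not from any real dataset.

```python
import json

# Hypothetical Alpaca-style records for quiz generation. Each record pairs
# a context passage (stand-in for RAG-retrieved text) with the desired,
# well-formatted question. Covering both numerical and theory questions in
# the tuning data teaches the format, not the course content, so it stays
# field-agnostic.
records = [
    {
        "instruction": "Generate one numerical question with a worked answer "
                       "from the context below. Use the exact format shown.",
        "input": "Context: A car accelerates uniformly from rest to 20 m/s in 5 s.",
        "output": "Q: What is the car's acceleration?\n"
                  "A: a = (20 - 0) / 5 = 4 m/s^2",
    },
    {
        "instruction": "Generate one theory question with a short answer "
                       "from the context below. Use the exact format shown.",
        "input": "Context: Ohm's law states that V = IR for ohmic conductors.",
        "output": "Q: State Ohm's law.\n"
                  "A: For an ohmic conductor, the voltage equals the current "
                  "times the resistance (V = IR).",
    },
]

# Write one JSON object per line (JSONL), a format most fine-tuning
# toolkits accept.
with open("quiz_tuning_data.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
```

The key idea is that every record demonstrates following the procedure on *some* context, so the tuned model learns the behavior rather than memorizing any one course.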

Why not try improving your prompt first? Be very detailed and explicit: list out all the cases you want to handle, with examples, and don't worry if the prompt gets long. Try it and see if you get better behavior from the LLM. A good prompt plus the RAG you have already implemented should give satisfactory results.
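To make the "detailed and explicit" advice concrete, here is a hypothetical prompt template along those lines. The template name, placeholder fields, rules, and embedded example are all illustrative choices, not a known-good recipe; the point is spelling out the question types, the exact output format, and a worked example, with a slot for the RAG-retrieved context.

```python
# Hypothetical prompt template: explicit question counts, explicit rules,
# a formatting example, and a slot for the retrieved context.
QUIZ_PROMPT = """You are a quiz generator. Using ONLY the context below,
write exactly {n_questions} questions: {n_numerical} numerical and
{n_theory} theory-based.

Rules:
- Every question must be answerable from the context alone.
- Numerical questions must include a worked answer.
- Use exactly this format for each question:
  Q<number>: <question text>
  A<number>: <answer text>

Example:
Q1: What is the car's acceleration if it reaches 20 m/s from rest in 5 s?
A1: a = 20 / 5 = 4 m/s^2

Context:
{context}
"""

# At inference time, fill the slots with the retrieved passages.
prompt = QUIZ_PROMPT.format(
    n_questions=3, n_numerical=1, n_theory=2,
    context="(retrieved passages go here)",
)
```

Because the format and rules are stated explicitly (rather than implied), the base model has much less room to drift from the quiz-generation procedure.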

Also, where did you get the data you are passing as context (RAG)?