Should we use chain-of-thought prompts while instruction tuning the model?

When we are instruction tuning a model (data + instruction) for fine-tuning, would it make sense to phrase the prompts as chain-of-thought prompts? Would that help the fine-tuned model perform better?

Hi @Paramdeep. Based on my knowledge, I don't think this would be instruction tuning.

With instruction tuning, you give the model many examples of the same task, all using the same prompt structure.

Perhaps we could argue that we could instruct-train the model with prompts that are CoTs… I have never done that, nor have I seen it done, but I guess it's an interesting idea.

Would you venture to try it and share your findings?

You'd have to come up with a good set of CoT prompts and their completions, and then see whether this improves your model on the target task.


@Juan_Olano Yes, I meant using CoTs to instruct-train the model. I was just wondering whether there is any literature around this and whether anyone has documented it.

I would love to do that and would share my findings with you.

Thank you for getting back on this. I do not have any reference for this style of instruct training. As shared, the way I would do it is:

  1. Clearly define my target task.
  2. Gather all the data that I can around this task.
  3. Create prompts in the CoT structure (as many as possible).
  4. Instruct-train my model with these prompts.
  5. Test.

These are basically the steps of any regular instruct-training process. The only point where things "differ" a bit from a traditional process is step 3, where you would create CoT prompts as the content used for training.
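As a concrete illustration of step 3, here is a minimal sketch of what one such training example might look like, written out as JSONL (a common fine-tuning data format). The field names (`prompt`/`completion`) and the reasoning template are assumptions for illustration; you would match whatever format your fine-tuning framework expects.

```python
import json

# Hypothetical training example: the prompt asks for step-by-step reasoning,
# and the completion spells out the intermediate steps (the CoT) before
# stating the final answer. All wording here is illustrative, not a recipe.
examples = [
    {
        "prompt": (
            "Extract the battery capacity from this description. "
            "Think step by step.\n\n"
            "Description: Powered by a large 5000 mAh cell, this phone "
            "lasts all day."
        ),
        "completion": (
            "The description mentions 'a large 5000 mAh cell'. "
            "'mAh' is a unit of battery capacity, so the capacity "
            "is 5000 mAh.\n"
            "Answer: 5000 mAh"
        ),
    },
]

def write_jsonl(path, records):
    """Write one JSON object per line, the usual fine-tuning file layout."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")

write_jsonl("cot_train.jsonl", examples)
```

You would then gather as many of these as possible (step 2 feeding step 3) and point your instruct-training pipeline at the resulting file.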

I truly look forward to your comments on this experiment!


@Juan_Olano Yes, I understand the process. I am trying to extract structured specifications from manufacturer-provided product descriptions on e-commerce sites like Amazon. I will try this strategy for extraction and see the results. I will also compare it against fine-tuning with simple prompts to see whether the CoT version performs better.
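One way to set up that comparison, sketched below under assumed field names and an invented reasoning template: build two training sets from the same labeled (description, specs) pairs, one with plain prompts and one with CoT-style completions, so the two fine-tuned models differ only in prompt style.

```python
import json

def simple_example(description, specs):
    """Plain-prompt variant: ask for the JSON directly."""
    return {
        "prompt": f"Extract the specifications as JSON.\nDescription: {description}",
        "completion": json.dumps(specs),
    }

def cot_example(description, specs):
    """CoT variant: enumerate each attribute found, then give the JSON.
    The step template here is a made-up illustration."""
    steps = "\n".join(
        f"- The text mentions '{value}', which gives {name} = {value}."
        for name, value in specs.items()
    )
    return {
        "prompt": (
            "Extract the specifications as JSON. Reason step by step, "
            f"then give the final JSON.\nDescription: {description}"
        ),
        "completion": f"{steps}\nFinal JSON: {json.dumps(specs)}",
    }

# One labeled product; in practice this would come from your scraped data.
description = "Stainless steel blender with a 1200 W motor and a 1.8 L jar."
specs = {"motor power": "1200 W", "jar capacity": "1.8 L"}

simple = simple_example(description, specs)
cot = cot_example(description, specs)
```

Training one model per variant on these parallel sets, then evaluating both on the same held-out products, would give a clean read on whether the CoT-style data helps the extraction task.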