Instruction Fine-tuning for Decoder-only Models

  1. Because instruction fine-tuning needs both an input and an output sequence, my understanding is that it's possible with encoder-decoder models (obviously) and with decoder-only models, but not with encoder-only models.

  2. LoRA and Prompt Tuning, on the other hand, can be used with any type of model (encoder-decoder, encoder-only, decoder-only).

Is this correct?

Hi @Yacine_Mazari ,

Thank you for your interesting questions.

Let's tackle this first:

Instruction fine-tuning is not exclusive to seq-2-seq tasks. You can do instruction fine-tuning for any task, including a classification task.
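For instance, a classification example can be rewritten as an instruction/response pair and then trained on like any other instruction data. Here is a minimal sketch; the template wording and the `prompt`/`completion` field names are just illustrative choices, not tied to any particular library:

```python
# Sketch: casting a classification example as an instruction-tuning
# example. Template and field names are illustrative.

def to_instruction_example(text: str, label: str) -> dict:
    """Wrap a (text, label) classification pair in an instruction format."""
    prompt = (
        "Classify the sentiment of the following review as "
        "positive or negative.\n\n"
        f"Review: {text}\n"
        "Sentiment:"
    )
    # The model is trained to generate the label as the completion.
    return {"prompt": prompt, "completion": f" {label}"}

example = to_instruction_example("The movie was a delight.", "positive")
print(example["prompt"])
print(example["completion"])
```

The label simply becomes the target text, so the "classification" task is now an ordinary next-token prediction problem.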

Now, let's look at the case you raise in question 1:

Your question is about the seq-2-seq task. In general, the best architectures for this task are a decoder-only model or a full encoder-decoder. However, you can adapt an encoder-only model to do seq-2-seq by adding some layers on top of the model's output. Is this ideal? Hmmm, probably not, since you will get better results with one of the other two designs. But it is doable.
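To make "adding some layers on top" concrete, here is a small NumPy sketch of the idea: take the encoder's final hidden states and project them to vocabulary logits with an extra linear "LM head" layer. All the dimensions and the random stand-in for the encoder output are made up for illustration:

```python
import numpy as np

# Hypothetical sizes; in practice these come from the encoder's config.
hidden_size, vocab_size, seq_len = 8, 20, 5
rng = np.random.default_rng(0)

# Stand-in for the encoder-only model's final hidden states (batch=1).
hidden_states = rng.normal(size=(1, seq_len, hidden_size))

# The extra layer added on top: a linear projection from hidden
# states to vocabulary logits, i.e. a language-modeling head.
W = rng.normal(size=(hidden_size, vocab_size))
b = np.zeros(vocab_size)
logits = hidden_states @ W + b          # shape (1, seq_len, vocab_size)

predicted_ids = logits.argmax(axis=-1)  # greedy token choice per position
print(predicted_ids.shape)              # (1, 5)
```

This only gives you one predicted token per input position; turning it into a proper generator also requires a decoding loop, which is why a decoder or encoder-decoder is the more natural fit.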

Next up, question 2:

Yes, LoRA and Prompt Tuning can be used with any model design, the same as instruction tuning.