In the Week 2 lecture “PEFT techniques 2: Soft prompts”, a chart is shown to illustrate how well soft prompting performs for large models.
The chart includes a “Full fine-tuning” model and a “Multi-task fine-tuning” model.
I have a question about the difference between these two. I thought that full fine-tuning really is multi-task. Here, does “Full fine-tuning” refer to single-task full fine-tuning?
If yes, is that the one discussed in earlier lectures about catastrophic forgetting?
What is meant by “Multi-task fine-tuning” in terms of the previously covered lectures? Is it full fine-tuning trained on multi-task instruction prompts (like FLAN-T5 or FLAN-PaLM)?