Adding prompt instruction concatenation as a part of the Kubeflow pipeline

In the data preparation step, we added an instruction to every question in order to form the prompt for the model. My question is: could this step be incorporated into the Kubeflow pipeline used in the orchestration/automation step?

One advantage would be that the performance of different instructions could be compared without regenerating the dataset. Another is that the training data file would be smaller, since the instruction would no longer be repeated in every example.
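To make the idea concrete, here is a rough sketch of what I have in mind, in plain Python rather than actual Kubeflow code. The function name `build_prompts`, the `question` and `prompt` field names, and the example instructions are all my own assumptions, not what the course uses.

```python
from typing import Dict, Iterable, Iterator


def build_prompts(
    records: Iterable[Dict[str, str]],
    instruction: str,
    question_key: str = "question",  # hypothetical field name
) -> Iterator[Dict[str, str]]:
    """Yield a copy of each record with the instruction prepended to the question."""
    for record in records:
        prompt = f"{instruction}\n\n{record[question_key]}"
        yield {**record, "prompt": prompt}


# The stored data holds only the question, so the instruction is not repeated per row.
raw = [{"question": "How do I reverse a list in Python?"}]

# Trying another instruction means changing one argument, not rebuilding the file.
for instruction in ("Answer concisely.", "Explain step by step."):
    for row in build_prompts(raw, instruction):
        print(repr(row["prompt"]))
```

In a pipeline, I imagine this logic would sit in its own step before training, with the instruction passed in as a pipeline parameter, but I have not tried that yet.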

Can anyone think of any disadvantages? Let’s discuss!

Hi @carlesonielfa,

That’s actually the best practice. For teaching purposes, though, it was done differently in the course.

Thank you for sharing that!

