Fine-Tuning an LLM

I fine-tuned LLaMA 3.1 70B for my feedback-giver project using ORPO and SFT. After fine-tuning, when I give the model a prompt, it echoes the entire system message and prompt before producing the response. My system message and input text together are about 1,300 tokens, so the model struggles to generate high-quality content and the response time is very long. How can I adjust the model so that it returns only the output, without repeating the input or system message?
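One likely culprit, if the full `generate` output is being decoded: `model.generate` returns the prompt tokens followed by the newly generated tokens, so decoding everything reproduces the system message and prompt. Below is a minimal sketch of decoding only the new tokens, assuming generation with Hugging Face `transformers`; the model path and message contents are placeholders, not the actual project values.

```python
# Minimal sketch, assuming Hugging Face transformers; the model path and
# message contents are placeholders, not the actual project values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/your-finetuned-llama-3.1-70b"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "..."},  # your ~1,300-token system message
    {"role": "user", "content": "..."},    # your input text
]

# apply_chat_template builds the Llama 3.1 chat prompt and returns token ids.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)

# generate() returns the prompt tokens followed by the new tokens, so slice
# off the prompt length before decoding to keep only the model's response.
response = tokenizer.decode(
    output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
)
print(response)
```

If the echoed text is genuinely being generated rather than just decoded from the prompt, the more likely cause is that the SFT stage trained on full sequences without masking the prompt tokens, so the model learned to reproduce them; TRL's `DataCollatorForCompletionOnlyLM` masks everything before the assistant turn so the loss is computed only on the response.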

Can you share your code, along with a screenshot showing what output you are getting and what output you are looking for?